PMBio / deeprvat

Str to int conversion for chrX & chrY #58

Closed Jonas-B-Frank closed 6 months ago

Jonas-B-Frank commented 6 months ago

I am running the most recent version of deepRVAT on a Slurm-based HPC system. The Snakefiles have been adapted accordingly to include resources and partition. The data are from a WGS cohort, split by chromosome to increase speed. I ran the preprocessing pipeline and it worked just fine (I excluded the HWE QC), and I am currently running the annotation pipeline. In rule concat_deepSea I get the following error:

Traceback (most recent call last):
  File "Path/deeprvat/annotations/annotations.py", line 1279, in <module>
    cli()
  File "Path/envs/deeprvat_annotations/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "Path/envs/deeprvat_annotations/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "Path/envs/deeprvat_annotations/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "Path/envs/deeprvat_annotations/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "Path/envs/deeprvat_annotations/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "Path/deeprvat/annotations/annotations.py", line 1005, in concatenate_deepripe
    included_chromosomes = [int(c) for c in included_chromosomes.split(",")]
  File "Path/deeprvat/annotations/annotations.py", line 1005, in <listcomp>
    included_chromosomes = [int(c) for c in included_chromosomes.split(",")]
ValueError: invalid literal for int() with base 10: 'X'

The conversion of X and Y to an integer fails in annotations.py, line 1005 (and probably later on as well). I don't want to break subsequent steps, so I would greatly appreciate your insights.

Marcel-Mueck commented 6 months ago

Hey Jonas, thank you for submitting the issue. I could reproduce the error you got after including the sex chromosomes (these were excluded in the paper analysis). I changed the pipeline and some methods so that they work with chars instead of int values (in fact, I got rid of the reliance on a pvcf file altogether). I will add the changes to PR #54. This PR will be merged into main soon; if you want to avoid the error already, I would recommend switching to the branch annotations-new-features.
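
For readers hitting the same ValueError, the idea of handling chromosome labels as strings can be sketched roughly as follows. This is only an illustration and not the code from PR #54: the helper names parse_chromosomes and chromosome_sort_key are hypothetical and do not exist in deeprvat. The first helper splits the comma-separated list without calling int(), and the second restores a natural ordering (1, 2, ..., 22, X, Y) where one is needed.

```python
def parse_chromosomes(spec):
    """Split a comma-separated chromosome list, keeping every label as a string.

    Hypothetical helper for illustration; not part of deeprvat.
    """
    return [c.strip() for c in spec.split(",") if c.strip()]


def chromosome_sort_key(label):
    """Order numeric labels numerically, followed by non-numeric ones (e.g. X, Y)."""
    if label.isdigit():
        return (0, int(label), "")
    return (1, 0, label)


# Labels such as "X" and "Y" pass through unchanged instead of raising ValueError.
included_chromosomes = parse_chromosomes("1,2,10,X,Y")

# Sorting with the key avoids the lexicographic order "1", "10", "2", ...
ordered = sorted(parse_chromosomes("X,10,2,Y,1"), key=chromosome_sort_key)
```

Any downstream step that compares these labels against integer chromosome columns would also need to compare strings, which is presumably why the error is expected to show up in later steps as well.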

Jonas-B-Frank commented 6 months ago

Hey Marcel, thank you for your quick responses on both issues. I will wait for the merge into the main branch and will then try to run the pipeline again. Best, Jonas

Marcel-Mueck commented 5 months ago

Hey, just letting you know that this issue, too, has been corrected in the main branch of deeprvat now.

Regards, Marcel Mück