DaehwanKimLab / hisat-genotype

GNU General Public License v3.0

Error when converting .report into csv (hisatgenotype_toolkit) #40

Open Tijs-dot opened 3 years ago

Tijs-dot commented 3 years ago

Hi Chris, I have tried using HISAT-genotype with some fastq files and it runs fine. Now I tried to use the HISAT-genotype toolkit to convert my output into csv format. This gave the following error:

    $ hisatgenotype_toolkit parse-results -t 2 --csv --in-dir hisatgenotype_resultP2B1L8allele, _, percent = line.split()^C
    (hsgtenv) [twatzeels@riro app]$ cd hisatgenotype
    (hsgtenv) [twatzeels@riro hisatgenotype]$ hisatgenotype_toolkit parse-results -t 2 --csv --in-dir hisatgenotype_resultP2B1L8
    Traceback (most recent call last):
      File "/home/MOLGEN/twatzeels/app/hisatgenotype/hisatgenotype_tools/hisatgenotype_parse_results.py", line 160, in <module>
        result_process(args)
      File "/home/MOLGEN/twatzeels/app/hisatgenotype/hisatgenotype_tools/hisatgenotype_parse_results.py", line 69, in result_process
        report_results[report] = typing_common.call_nuance_results(report)
      File "/home/MOLGEN/twatzeels/app/hisatgenotype/hisatgenotype_modules/hisatgenotype_typing_common.py", line 2021, in call_nuance_results
        allele, _, percent = line.split()
    ValueError: too many values to unpack (expected 3)
    Script exited with error 1

I also tried this command using the files provided by the HISAT tutorial, and this worked fine. I think there is something wrong with the report files I generated with HISAT-genotype. Can you help me with this?
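For reference, the `ValueError` in the traceback is what Python raises whenever a line splits into more than three whitespace-separated fields. A minimal sketch of that behaviour follows; the helper name `unpack_three` and the sample lines are made up for illustration and are not taken from HISAT-genotype or from a real `.report` file.

```python
def unpack_three(line):
    """Mimic the failing statement: expect exactly three whitespace-separated fields."""
    allele, _, percent = line.split()
    return allele, percent


# Hypothetical line with exactly three fields: unpacking succeeds.
print(unpack_three("alleleA x 50.0"))

# Hypothetical line with a fourth field: unpacking raises ValueError.
try:
    unpack_three("alleleA x 50.0 extra")
except ValueError as err:
    print(err)  # too many values to unpack (expected 3)
```

So any line that reaches this statement with an extra field, whatever its origin, would produce the same error.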

Kind regards, Tijs

chbe-helix commented 3 years ago

Hi Tijs,

Absolutely! I think the easiest way to find the offending file is to:

1) Locate the `hisatgenotype_typing_common.py` file in the `hisatgenotype_modules/` folder of HISAT-genotype (wherever you installed it).
2) Open the `hisatgenotype_typing_common.py` file in the editor of your choice.
3) Go to line 2021 at the bottom of the file. This is the code that is producing the error.
4) Right above the `allele, _, percent = line.split()` code, add the following:

    print(line, nfile)

5) Save the file.
6) Rerun the script.

This will produce a lot of lines but it will show you the formatting error and the file where the error is. Hope this helps!
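If printing every line is too noisy, the same idea can be sketched as a standalone check that reports only the lines that fail to unpack. This is an illustrative sketch, not code from HISAT-genotype: `find_bad_lines` is a hypothetical helper, `nfile` is assumed to be the report file name as in the `print` call above, and the sample lines are invented.

```python
def find_bad_lines(lines, nfile):
    """Return (file, line number, text) for each line that does not split into exactly three fields."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        try:
            # Same unpacking pattern as the statement that raises the error.
            allele, _, percent = line.split()
        except ValueError:
            # Raised for both too many and too few fields.
            bad.append((nfile, lineno, line.rstrip("\n")))
    return bad


# Hypothetical input: the second line has one field too many.
sample = ["alleleA x 50.0\n", "alleleB x 30.0 extra\n"]
for nfile, lineno, text in find_bad_lines(sample, "example.report"):
    print(f"{nfile}:{lineno}: {text}")
```

Note that this only inspects the lines it is given; the real function may filter lines before unpacking them, so the check would need to be applied to the same subset.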

Thanks, Chris

Tijs-dot commented 3 years ago

Hi Chris,

I tried doing what you explained above, but the error remained unchanged, which meant it failed even before the first real output line. So I tried making some changes in the lines above, and it turns out that the parameter --keep-low-abundancy-alleles caused the program to fail. If I change or remove the word "abundance" from this command line, the program runs fine!

Tijs

chbe-helix commented 3 years ago

Hi Tijs,

That's an unexpected issue. I'll add this to my list of bugs to examine to see if there is something I missed when designing the script. Thanks for the update!

Thanks, Chris