JonasMendez / SORTER2

Sorter of Orthologous Regions for Target Enrichment Reads v2 Toolkit

KeyError in SORTER2_Stage2_PhaseOrthologs.py #1

Open kroeve opened 5 days ago

kroeve commented 5 days ago

Hi @JonasMendez !

I'm trying to set up SORTER2 and am running into an issue when executing the script SORTER2_Stage2_PhaseOrthologs.py. The script exits with the following error:

Traceback (most recent call last):
  File "SORTER2_Stage2_PhaseOrthologs.py", line 603, in <module>
    writer.writerow([ind]+[HETDICT[ind][stat] for stat in stats])
  File "SORTER2_Stage2_PhaseOrthologs.py", line 603, in <listcomp>
    writer.writerow([ind]+[HETDICT[ind][stat] for stat in stats])
KeyError: 'in'

Do you have any idea what might be causing the problem? I installed all the listed dependencies and mostly used the recommended versions, although compatibility issues in conda forced me to install different versions in a few cases. Please let me know if you need further information to solve this issue!

Cheers, Evelin

JonasMendez commented 4 days ago

Thanks for posting this error. It looks like a problem with the read statistics generated for each sample when reads are mapped for phasing. First, I would check the 'readstats.txt' files in the assembly folders to make sure they contain output statistics from samtools, since this part of the script uses data from those files to write a summary CSV file. Could you email me (contact@endemicbio.info) an example of one of these readstats.txt files, along with the Python and software versions you are using? I have also provided an Anaconda .yml file for Linux systems in the repository, which you can use to rebuild the conda environment I used to run the pipeline. It contains all the software and dependencies I have used to run the pipeline successfully, except for USEARCH, which needs to be installed separately with the executable renamed to 'usearch'. If you email me, I can help troubleshoot this in more detail and post the solution here once we have resolved it.
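A minimal sketch of how the summary-writing step from the traceback could report missing statistics instead of stopping with a KeyError. It assumes HETDICT maps each sample name to a dict of statistics parsed from that sample's readstats.txt file; the actual structure in SORTER2 may differ. The names HETDICT and stats come from the traceback, while the helper write_summary_rows, the sample names, and the statistic names are hypothetical.

```python
import csv
import io


def write_summary_rows(writer, hetdict, stats, missing="NA"):
    """Write one CSV row per sample and return {sample: [missing stat names]}.

    Statistics absent from a sample's dict are written as `missing`
    rather than raising KeyError.
    """
    problems = {}
    for ind, values in hetdict.items():
        absent = [stat for stat in stats if stat not in values]
        if absent:
            problems[ind] = absent
        writer.writerow([ind] + [values.get(stat, missing) for stat in stats])
    return problems


# Hypothetical data: sampleB lacks the "mapped" statistic.
HETDICT = {
    "sampleA": {"total": 1000, "mapped": 950},
    "sampleB": {"total": 800},
}
stats = ["total", "mapped"]

buffer = io.StringIO()
problems = write_summary_rows(csv.writer(buffer), HETDICT, stats)

for ind, absent in problems.items():
    print("sample %s is missing statistics: %s" % (ind, ", ".join(absent)))
```

The returned dict shows which samples and which statistic names did not match, which may help narrow down whether a readstats.txt file is empty or simply formatted differently than expected.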

JonasMendez commented 4 days ago

It seems this issue was an oversight on my part: I had overlooked a difference in how samtools outputs BAM read statistics between v1.10 and v1.21. I have updated the scripts and documentation to require samtools v1.21. There may be additional problems, so I will leave this issue open until I confirm that the update resolves the error. Feel free to post here if you still see the same problem with samtools v1.21. Thank you.
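A small sketch of a version check that could be run before the pipeline, assuming samtools is on the PATH and that the first line of `samtools --version` has the form "samtools 1.21". The function names here are hypothetical and are not part of SORTER2. The check treats "require v1.21" as an exact match, which is an assumption.

```python
import re
import subprocess

REQUIRED = (1, 21)  # samtools version named in the comment above


def parse_samtools_version(text):
    """Extract (major, minor) from the first line of `samtools --version` output."""
    match = re.match(r"samtools (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized samtools version output: %r" % text[:40])
    return int(match.group(1)), int(match.group(2))


def installed_samtools_version():
    """Run `samtools --version` (samtools must be on PATH) and parse the result."""
    result = subprocess.run(
        ["samtools", "--version"], capture_output=True, text=True, check=True
    )
    return parse_samtools_version(result.stdout)


def check_version(version, required=REQUIRED):
    """Return True if the version equals the required one (exact match assumed)."""
    return version == required
```

Calling `check_version(installed_samtools_version())` inside the conda environment would then indicate whether the installed samtools matches the required version.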