DessimozLab / read2tree

a tool for inferring species tree from sequencing reads
MIT License
142 stars, 18 forks

IndexError: list assignment index out of range in Mapper.py #46

Open bernard-kim opened 11 months ago

bernard-kim commented 11 months ago

It looks like something weird is happening with Mapper.py. I'm running read2tree through conda with Python=3.10.8, following the installation instructions in the documentation.

I think the "if bases:" check on line 513 is not properly handling empty lists, or something else is going wrong. This occurs for some datasets and not others.

                if bases:
                    seq[pileup_column.pos] = self._most_common(bases)

This results in the following error message:

Loading alignments : 0 Alignment [00:00, ? Alignment/s]
--- Mapping of reads to reference sequences ---
--- Splitting reads from genomes/4_D_D6_S293_L004.fq ---
2023-10-16 16:11:18,904 - read2tree.Reads - INFO - 4DD6S293L004: --- Splitting reads from genomes/4_D_D6_S293_L004.fq ---
Splitting reads: 1693283 reads [00:03, 464240.82 reads/s]
2023-10-16 16:11:22,552 - read2tree.Reads - INFO - 4DD6S293L004: Reads larger than 20000 were split into 20000 bp long fragments with an overlap of 1000 bp.
2023-10-16 16:11:22,553 - read2tree.Reads - INFO - 4DD6S293L004: 1693283 reads were split into 1693283 reads.
2023-10-16 16:11:22,553 - read2tree.Reads - INFO - 4DD6S293L004: Splitting of reads took 3.6483232975006104.
Mapping reads to species:   0%|                                                                                     | 0/13 [00:00<?, ? species/s]2023-10-16 16:11:22,553 - read2tree.Mapper - INFO - 4DD6S293L004: --- Mapping of reads to DROBP reference species ---
2023-10-16 16:11:55,262 - read2tree.Mapper - INFO - 4DD6S293L004: Mapped 306991.0 / 418262.0 reads to DROBP_OGs.fa
2023-10-16 16:11:55,315 - read2tree.Mapper - INFO - 4DD6S293L004: Mapping to DROBP_OGs.fa references took 32.75954222679138.
Mapping reads to species:   0%|                                                                                     | 0/13 [00:47<?, ? species/s]
Traceback (most recent call last):
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/bin/read2tree", line 16, in <module>
    main(sys.argv[1:], exe_name=exe_name(), desc=desc)
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/main.py", line 357, in main
    mapper = Mapper(args, og_set=ogset.ogs, ref_set=reference.ref, progress=progress)  # Run the mapping
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/Mapper.py", line 80, in __init__
    self._map_reads_to_references(ref_set)
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/Mapper.py", line 289, in _map_reads_to_references
    processed_reads = self._call_wrapper(ref_tmp_file_handle, reads,
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/Mapper.py", line 138, in _call_wrapper
    return self._post_process_read_mapping(ref_file_handle, bam_file)
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/Mapper.py", line 570, in _post_process_read_mapping
    consensus = self._build_consensus_seq_v2(ref_file, outfile_name +
  File "/home/groups/dpetrov/bernard/miniconda3/envs/r2t/lib/python3.10/site-packages/read2tree/Mapper.py", line 514, in _build_consensus_seq_v2
    seq[pileup_column.pos-1] = self._most_common(bases)
IndexError: list assignment index out of range
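
One observation that may help narrow this down: in Python, an IndexError on list assignment depends only on the index and the length of the target list, so a check on whether bases is empty cannot prevent it. The sketch below is a minimal illustration of that point, not read2tree's code. The helpers assign_consensus_base and most_common are hypothetical, and the sketch assumes seq is a plain list with one slot per reference position.

```python
from collections import Counter


def most_common(bases):
    """Return the most frequent base in a non-empty list (hypothetical helper)."""
    return Counter(bases).most_common(1)[0][0]


def assign_consensus_base(seq, pos, bases):
    """Write the consensus base at pos; return True if a base was written.

    Hypothetical helper for illustration only, not part of read2tree.
    """
    if not bases:
        # Same guard as the original "if bases:" check.
        return False
    if not 0 <= pos < len(seq):
        # Extra guard: skip positions that fall outside the sequence.
        return False
    seq[pos] = most_common(bases)
    return True


seq = ["N"] * 5

# A non-empty bases list does not protect against a bad index.
try:
    seq[5] = most_common(["G"])
except IndexError as err:
    print(err)

print(assign_consensus_base(seq, 2, ["A", "A", "C"]))  # position in range
print(assign_consensus_base(seq, 5, ["G"]))  # position out of range, skipped
print("".join(seq))
```

Skipping out-of-range positions only hides the symptom. Logging the offending position and the sequence length at that point would show whether the reference sequence is shorter than the pileup expects.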

sinamajidian commented 11 months ago

Thanks @bernard-kim for reporting this.

It would be much easier to track down the issue if you could share part of the data with us. My email address is sina.majidian at gmail. If the dataset is public, the accession ID would be enough. Please also tell us which clade you used to generate the marker genes, or share the marker gene folder with us.

Thanks!