zjshi / Maast

Microbial agile accurate SNP Typer
MIT License

IndexError: list index out of range #16

Closed jamesPet closed 1 year ago

jamesPet commented 1 year ago

Running into an index out of range issue

$ maast end_to_end --min-prev 0.9 --out-dir test_out --in-dir a_few_asms/
[Warning] Total number of genomes (9) < min. number of genomes required for effective SNP calling with MAF 0.01 (100)
[Warning] Skip tag genome selection, all genomes will be used
reference genome path: a_few_asms/DRR090820_contigs_skesa.fasta
[building mash sketch]: start
[calculating mash distance]: start
[clustering] start
[clustering] done
a_few_asms/DRR090793_contigs_skesa.fasta
Running mummer4; start
reference genome path: a_few_asms/DRR090793_contigs_skesa.fasta
[paired alignment]: start
[paired alignment]: done
        DRR090793_contigs_skesa.fasta - DRR090809_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090807_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090793_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090820_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090805_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090795_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090797_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090801_contigs_skesa.fasta
        DRR090793_contigs_skesa.fasta - DRR090810_contigs_skesa.fasta
Reading reference genome
   count contigs: 43
   count sites: 4529549
Initializing alignments
   count genomes: 0
Reading alignment blocks
Reading SNPs
Writing fasta
   path: test_out/temp/mummer4/a_few_asms/msa.fa

Done!
Time (s): 1.15
Running mummer4; done!
Elapsed time: 12.280270099639893
Fetching file-type-specific parser; start
Traceback (most recent call last):
  File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1371, in <module>
    main()
  File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1366, in main
    end2end_main(args)
  File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1310, in end2end_main
    call_snps_main(args)
  File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1186, in call_snps_main
    site_assembly = msa.monolithic_parse(args['msa_path'], args['msa_type'], args['max_samples'])
  File "/home/pj/.conda/envs/maast/bin/align_io/msa.py", line 17, in monolithic_parse
    return parse_control(msa_path, msa_type, max_sample)
  File "/home/pj/.conda/envs/maast/bin/align_io/msa.py", line 14, in parse_control
    return parse(msa_path, max_sample)
  File "/home/pj/.conda/envs/maast/bin/align_io/xmfa_mummer4_io.py", line 30, in parse
    cur_aln.ncols = len(cur_aln.seqs[0].seq)
IndexError: list index out of range

zjshi commented 1 year ago

Hi James, thanks for reporting the issue. The cause could be a failure or an unexpected result in one of the individual alignments. Is there a way you could share these input sequences? I would like to start by reproducing the issue on my end.
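
For context, the log above reports "count genomes: 0" shortly before the parser fails on `cur_aln.seqs[0]`, which suggests the alignment block was empty when its column count was read. A minimal sketch of a guard that turns this into a descriptive error is shown below. The `Seq` and `Alignment` classes and the `set_ncols` helper are hypothetical stand-ins written for illustration; they are not taken from the Maast source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Seq:
    """Hypothetical stand-in for one aligned sequence."""
    id: str
    seq: str


@dataclass
class Alignment:
    """Hypothetical stand-in for one alignment block."""
    seqs: List[Seq] = field(default_factory=list)
    ncols: int = 0


def set_ncols(aln: Alignment, source: str) -> Alignment:
    """Set the column count from the first sequence, failing clearly if there is none."""
    if not aln.seqs:
        # Without this check, aln.seqs[0] raises a bare IndexError.
        raise ValueError(
            f"{source}: alignment block contains no sequences; "
            "check that the pairwise alignments produced output"
        )
    aln.ncols = len(aln.seqs[0].seq)
    return aln
```

With a check like this, an empty block would name the offending file instead of surfacing as "list index out of range".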

jamesPet commented 1 year ago

You bet. See attached for the assemblies and thanks for taking a look at the issue. a_few_asms.zip

zjshi commented 1 year ago

Thanks for providing the sample assemblies. With those files, I have identified and fixed the bug causing this issue. It turns out that a couple of places in the scripts did not handle assembly files with a ".fasta" suffix the way they should have. I've pushed a patch to this repository and uploaded a new conda release. I will tentatively close this issue for now. Please feel free to reopen it if you still experience a similar problem.
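
As an illustration of the kind of suffix handling described above, here is a minimal sketch that derives a genome name from an assembly path regardless of which FASTA-style extension it carries. This is not the actual patch: the `genome_id` function and the extension lists are assumptions made for the example.

```python
from pathlib import Path

# Assumed extension lists for illustration; not taken from the Maast source.
FASTA_SUFFIXES = (".fasta", ".fna", ".fas", ".fa")
COMPRESSION_SUFFIXES = (".gz",)


def genome_id(path: str) -> str:
    """Return the file name with one compression suffix and one FASTA suffix removed."""
    name = Path(path).name
    # Strip an optional compression suffix first, e.g. "x.fa.gz" -> "x.fa".
    for ext in COMPRESSION_SUFFIXES:
        if name.lower().endswith(ext):
            name = name[: -len(ext)]
            break
    # Then strip at most one FASTA-style suffix, leaving other dots intact.
    for ext in FASTA_SUFFIXES:
        if name.lower().endswith(ext):
            name = name[: -len(ext)]
            break
    return name


if __name__ == "__main__":
    print(genome_id("a_few_asms/DRR090793_contigs_skesa.fasta"))
```

Deriving names in one place like this means every step that matches genomes to alignment output sees the same identifier, whichever extension the input files use.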