broadinstitute / StrainGE

strain-level analysis tools
BSD 3-Clause "New" or "Revised" License

Running StrainGR call with refs_concat.fasta results in TypeError: 'NoneType' object is not subscriptable #40

Open sjmiller-sahmri opened 3 weeks ago

sjmiller-sahmri commented 3 weeks ago

I'm working through the StrainGR pipeline as described in the tutorial. prepare-ref appeared to run successfully and created refs_concat.fasta (which looks OK), but bwa mem would not read refs_concat.fasta until I ran bwa index on it, which I did with the prefix option -P [strain]. I thought I could use the same workaround for StrainGR call, but I can't pass -P [strain] there because -P is reserved for MIN_PILEUP_QUAL. Below is the error I get when running StrainGR call on one of my samples:

Running on host hpc-node020
Time is Mon Aug 26 15:41:49 ACST 2024
Processing 0305028BBOP
2024-08-26 15:41:50,754 - INFO:root:Loading reference refs_concat.fasta...
2024-08-26 15:41:50,779 - INFO:root:Reference length: 5423038
2024-08-26 15:41:50,780 - INFO:root:Start analyzing aligned reads...
Traceback (most recent call last):
  File "/home/mill0871/.conda/envs/strainge/bin/straingr", line 8, in <module>
    sys.exit(straingr_cli())
  File "/home/mill0871/.conda/envs/strainge/lib/python3.9/site-packages/strainge/cli/main.py", line 110, in __call__
    self.run(args)
  File "/home/mill0871/.conda/envs/strainge/lib/python3.9/site-packages/strainge/cli/registry.py", line 83, in run
    rc = subcommand_func(**args_dict)
  File "/home/mill0871/.conda/envs/strainge/lib/python3.9/site-packages/strainge/cli/straingr.py", line 468, in __call__
    call_data = caller.process(reference, sample_bam)
  File "/home/mill0871/.conda/envs/strainge/lib/python3.9/site-packages/strainge/variant_caller.py", line 992, in process
    self._assess_allele(call_data, scaffold, refpos, read)
  File "/home/mill0871/.conda/envs/strainge/lib/python3.9/site-packages/strainge/variant_caller.py", line 1078, in _assess_allele
    qual = alignment.query_qualities[pos]
TypeError: 'NoneType' object is not subscriptable
0305028BBOP processing completed.

Could somebody please tell me what I've done wrong? I'm very much a novice with all this so any guidance is appreciated!! An example of what my refs_concat.fasta file looks like is below:

>NZ_CP077805.1 Klebsiella variicola strain FF907 chromosome, complete genome
CCTTGTCCGGGCTACAGGTCGTGCAGGGCGGGTGCGAATCCGTAGCCCCGGTAAGCGCAGCGCCACCGGGGAGACATACCGGCACGGTTGCCGGACCACTGAAACGCAAAAAGCCCATCCGGCAGGATGGGC
...(and so on for 386 lines/14092 columns)
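
One thing that may be worth checking (a guess from the traceback, not a confirmed diagnosis): the error is raised at `alignment.query_qualities[pos]`, which suggests that at least one aligned read has no base qualities stored. In SAM text, a read without qualities has `*` in the QUAL column (the 11th tab-separated field). The sketch below scans SAM-formatted lines for such reads. The function name `reads_missing_qualities` and the example records are hypothetical and are not part of StrainGE.

```python
from typing import Iterable, List


def reads_missing_qualities(sam_lines: Iterable[str]) -> List[str]:
    """Return the names of reads whose QUAL column (field 11) is '*'.

    Hypothetical helper for illustration; it only inspects SAM text and
    does not use or modify anything in StrainGE.
    """
    missing = []
    for line in sam_lines:
        line = line.rstrip("\n")
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        # A SAM alignment record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        # Field 1 is the read name (QNAME), field 11 is QUAL.
        if fields[10] == "*":
            missing.append(fields[0])
    return missing


# Made-up records for demonstration only (not taken from the real sample).
example = [
    "@SQ\tSN:NZ_CP077805.1\tLN:100",
    "read1\t0\tNZ_CP077805.1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tNZ_CP077805.1\t5\t60\t4M\t*\t0\t0\tACGT\t*",
]
print(reads_missing_qualities(example))  # prints ['read2']
```

To run this on real data, the lines could come from the text output of `samtools view` on the BAM file passed to StrainGR call. If every read is reported, one possible explanation (an assumption) is that the reads were aligned from FASTA rather than FASTQ input, since FASTA carries no quality scores.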