Closed dwilkin799 closed 5 years ago
Sorry for the double-post. It occurred to me that the problem might also occur for classification by the 7-gene MLST profiles. And indeed this is the case, which rules out it being an issue with the formatting of my cgMLST files...
I re-downloaded one of the offending genomes from NCBI: GCF_000007685.1_ASM768v1_genomic.fna (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/685/GCF_000007685.1_ASM768v1)
I tried to type it using "--scheme leptospira" and it returns "-" for every locus. When I look for the genes manually, they are there...
thanks again, David
If I use the default mode, it works:
$ mlst --version
mlst 2.16.2
$ mlst --quiet GCF_000007685.1_ASM768v1_genomic.fna.gz
GCF_000007685.1_ASM768v1_genomic.fna.gz leptospira_3 2 adk_3(1) icdA_3(1) lipL32_3(2) lipL41_3(2) rrs2_3(1) secY_3(1)
If I use the --scheme mode, it also works:
$ mlst --scheme leptospira --quiet GCF_000007685.1_ASM768v1_genomic.fna.gz
GCF_000007685.1_ASM768v1_genomic.fna.gz leptospira 17 glmU_1(1) pntA_1(1) sucA_1(2) tpiA_1(2) pfkB_1(10) mreA_1(4) caiB_1(8)
It sounds like mlst has not been installed properly, or you are running an old version.
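One way to check which copy of a tool the shell actually runs, and what version it reports, is sketched below. The helper name `report_tool` is hypothetical and only for illustration; it relies on the POSIX `command -v` builtin and on the tool accepting `--version`, as mlst does in the transcript above.

```shell
# report_tool NAME: show where NAME resolves on PATH and what version it reports.
# Hypothetical helper, not part of mlst. Assumes NAME understands --version.
report_tool() {
  tool="$1"
  if path=$(command -v "$tool"); then
    printf '%s found at %s\n' "$tool" "$path"
    "$tool" --version
  else
    printf '%s not found on PATH\n' "$tool"
    return 1
  fi
}
```

Running `report_tool mlst` on the affected machine and comparing the result with the `mlst 2.16.2` shown above may reveal whether an older or stray copy is being picked up.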
Following the last couple of updates of mlst, and your kind response to cgMLST? #67... I have been using mlst to do cgMLST assignments for a number of schemes.
It works very well in most cases... but I noticed recently that for some genomes, seemingly at random, mlst returns lots of missing genes. It's very odd, because it will return hits for some genes, but around 90% of the loci come back as "-".
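To find the affected genomes quickly, the missing calls can be counted per input file. This is a rough sketch, not part of mlst: the function name `count_missing_loci` is made up, and it assumes each result line is whitespace-separated as FILE SCHEME ST followed by one LOCUS(ALLELE) field per locus, with a missing locus shown as `locus(-)` or a bare `-`, and that file names contain no spaces.

```shell
# count_missing_loci: read mlst-style result lines on stdin and print
# FILE <tab> MISSING <tab> TOTAL for each line.
# Assumption: fields 1-3 are file, scheme and ST; loci start at field 4.
count_missing_loci() {
  awk '{
    missing = 0
    total = NF - 3
    if (total < 0) total = 0
    for (i = 4; i <= NF; i++) {
      # A locus counts as missing if it is "-" or ends in "(-)"
      if ($i == "-" || $i ~ /\(-\)$/) missing++
    }
    printf "%s\t%d\t%d\n", $1, missing, total
  }'
}
```

For example, `mlst --quiet --scheme leptospira *.fna | count_missing_loci` would list each genome with its number of missing loci, which makes the problem files easy to pick out.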
Obviously, I double check that the genes are actually there in another piece of software (Geneious). And I find exact matches to existing alleles.
The problem is reproducible, doesn't seem to have anything to do with line endings or character misuse in the genome assembly file (I remade the genome fasta files to check this)... and I cannot pin down why it is happening.
Any ideas? Many thanks, David