FOI-Bioinformatics / CanSNPer2

CanSNPer2: A toolkit for SNP-typing bacterial genomes.
GNU General Public License v3.0

No valid start SNP found #29

Closed CarolineOhrman closed 2 years ago

CarolineOhrman commented 2 years ago

In this case CanSNPer2 finds SNPs but no valid start SNP. It returns NA as the called SNP and an empty *_snps.txt file. Attached: NE061598_GCA0000233051_not_called.txt, NE061598_GCA0000233051_snps.txt

I used the Francisella database downloaded from CanSNPer2-data. The genome is attached: NE061598_GCA0000233051.fa.gz

My command:

CanSNPer2 --database francisella_tularensis.db fasta_all/NE061598_GCA0000233051.fa --summary --keep_temp --verbose --min_required_hits 2 --tmpdir tmp/

Output:

2022-08-10 10:35:29,629 CanSNPer2 [INFO ]  Running CanSNPer2 version-2.0.6
2022-08-10 10:35:29,629 DatabaseConnection [INFO ]  francisella_tularensis.db opened successfully...
2022-08-10 10:35:29,629 DatabaseConnection [INFO ]  cursor created
2022-08-10 10:35:29,629 DatabaseConnection [INFO ]  Load XMFAFunctions
Run 1 alignments to references using progressiveMauve
2022-08-10 10:35:29,629 CanSNPer2 [INFO ]  Running CanSNPer2 on NE061598_GCA0000233051.fa
2022-08-10 10:35:29,629 CanSNPer2 [INFO ]  Run mauve alignments
2022-08-10 10:35:29,629 CanSNPer2 [INFO ]  Starting progressiveMauve on 5 references
2022-08-10 10:35:42,113 CanSNPer2 [INFO ]  Alignments for fasta_all/NE061598_GCA0000233051.fa complete!
2022-08-10 10:35:42,113 CanSNPer2 [INFO ]  Find SNPs
2022-08-10 10:35:42,488 CanSNPer2 [INFO ]  Printing SNP info of non called SNPs to results/NE061598_GCA0000233051_not_called.txt
2022-08-10 10:35:42,490 DatabaseConnection [INFO ]  francisella_tularensis.db opened successfully...
2022-08-10 10:35:42,490 DatabaseConnection [INFO ]  cursor created
2022-08-10 10:35:42,490 DatabaseConnection [INFO ]  Load CanSNPdbFunctions
2022-08-10 10:35:42,493 NewickTree [WARNI]  #Node has no snp_annotation root
2022-08-10 10:35:42,508 NewickTree [INFO ]  Called snps: ['A.I.1', 'T/N.1', 'A.I.12', 'B.83']
2022-08-10 10:35:42,508 NewickTree [INFO ]  No valid start SNP found.
2022-08-10 10:35:42,508 CanSNPer2 [INFO ]  results/NE061598_GCA0000233051_tree.pdf
2022-08-10 10:35:42,508 CanSNPer2 [INFO ]  No valid start SNP found.
NE061598_GCA0000233051: NA
2022-08-10 10:35:42,511 DatabaseConnection [INFO ]  francisella_tularensis.db opened successfully...
2022-08-10 10:35:42,511 DatabaseConnection [INFO ]  cursor created
2022-08-10 10:35:42,511 DatabaseConnection [INFO ]  Load CanSNPdbFunctions
2022-08-10 10:35:42,512 NewickTree [WARNI]  #Node has no snp_annotation root
2022-08-10 10:35:44,253 CanSNPer2 [INFO ]  results/summary_tree.pdf
2022-08-10 10:35:44,253 CanSNPer2 [INFO ]  CanSNPer2 finished successfully, files can be found in results/
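
When many genomes are run, samples hit by this problem can be picked out of captured output by their final "<sample>: NA" line. A minimal sketch, assuming the run's output was saved to a file and that result lines have the same form as the one above; the function name list_uncalled is hypothetical:

```shell
# Hypothetical helper: print the names of samples whose final call is NA.
# Assumes result lines look like "<sample>: NA" with no spaces in the sample
# name, so timestamped log lines (which contain spaces) are never matched.
list_uncalled() {
    grep -E '^[^ ]+: NA$' "$1" | sed 's/: NA$//'
}
```

For example, `list_uncalled saved_output.txt` (file name illustrative) would print one sample name per line for every genome that needs a rerun.
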
CarolineOhrman commented 2 years ago

After some more testing, I have a few tricks that can help if someone experiences this problem.

  1. Rerun the genome with --min_required_hits set low (0 works); you will then get output in your SNP file.
  2. Add --save_tree to the command and check the PDF for problems in the alignment.
  3. Redo the alignment with progressiveMauve outside of CanSNPer2 (both the desktop and Linux versions work). The output files need to be named {referencename}{sample}.xmfa, {referencename}{sample}.bbcols, and {referencename}{sample}.backbone. Then rerun CanSNPer2 with --tmpdir /where_you_have_xmfa --skip_mauve. This worked for me for the genome above. I do not know why Mauve failed when run through CanSNPer2.
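
The three workarounds above can be sketched as variations on the original command. This is only an illustration: the flags are the ones mentioned in this thread, while the retry_cmd helper and the XMFA_DIR variable are hypothetical names introduced here. The external progressiveMauve run for workaround 3 is not shown. The helper prints each command rather than running it, so it can be checked first.

```shell
# Values taken from the command in the original report.
DB="francisella_tularensis.db"
SAMPLE="fasta_all/NE061598_GCA0000233051.fa"

# Hypothetical helper: print the retry command for workaround 1, 2 or 3.
retry_cmd() {
    base="CanSNPer2 --database $DB $SAMPLE --summary --keep_temp --verbose"
    case "$1" in
        # 1: lower the hit threshold so the SNP file is populated
        1) echo "$base --min_required_hits 0 --tmpdir tmp/" ;;
        # 2: also save the tree PDF to inspect the alignment
        2) echo "$base --min_required_hits 2 --save_tree --tmpdir tmp/" ;;
        # 3: reuse alignments made outside CanSNPer2 (XMFA_DIR is an assumption)
        3) echo "$base --min_required_hits 2 --skip_mauve --tmpdir ${XMFA_DIR:-/where_you_have_xmfa}" ;;
        *) echo "usage: retry_cmd 1|2|3" >&2; return 1 ;;
    esac
}
```

Calling `retry_cmd 3` prints the command for the third workaround; set XMFA_DIR to the directory holding the externally produced .xmfa, .bbcols and .backbone files before using it.
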
CarolineOhrman commented 2 years ago

This was a Mauve problem, not a CanSNPer2 problem.