zjshi / Maast

Microbial agile accurate SNP Typer
MIT License

Error on Database Building #17

Closed jamesPet closed 1 year ago

jamesPet commented 1 year ago

I was working with the same 10 assemblies as in issue #16 and got the following error:

$  maast end_to_end --min-prev 0.9 --out-dir test_out --in-dir a_few_asms/
[Warning] Total number of genomes (9) < min. number of genomes required for effective SNP calling with MAF 0.01 (100)
[Warning] Skip tag genome selection, all genomes will be used
reference genome path: a_few_asms/DRR090820_contigs_skesa.fasta
[building mash sketch]: start
[calculating mash distance]: start
[clustering] start
[clustering] done
a_few_asms/DRR090793_contigs_skesa.fasta
Running mummer4; start
reference genome path: a_few_asms/DRR090793_contigs_skesa.fasta
[paired alignment]: start
[paired alignment]: done
        DRR090793_contigs_skesa - DRR090809_contigs_skesa
        DRR090793_contigs_skesa - DRR090807_contigs_skesa
        DRR090793_contigs_skesa - DRR090793_contigs_skesa
        DRR090793_contigs_skesa - DRR090820_contigs_skesa
        DRR090793_contigs_skesa - DRR090805_contigs_skesa
        DRR090793_contigs_skesa - DRR090795_contigs_skesa
        DRR090793_contigs_skesa - DRR090797_contigs_skesa
        DRR090793_contigs_skesa - DRR090801_contigs_skesa
        DRR090793_contigs_skesa - DRR090810_contigs_skesa
Reading reference genome
   count contigs: 43
   count sites: 4529549
Initializing alignments
   count genomes: 9
Reading alignment blocks
Reading SNPs
Writing fasta
   path: test_out/temp/mummer4/a_few_asms/msa.fa

Done!
Time (s): 14.14
Running mummer4; done!
Elapsed time: 21.279797554016113
Fetching file-type-specific parser; start
total length of alignments: 4529549
Fetching file-type-specific parser; done
Elapsed time: 5.482044696807861
Identifying core-snps; start
max sites: inf
min prevalence: 0.9
min MAF: 0.01
total number of sites: 4529549
min. prevalence: 0.9
min. alt. frequency: 0.01
masked by prev_mask: 4397715
masked by snp_mask: 838
masked by wildcard_mask: 4529549
Identifying core-snps; done
Elapsed time: 2.1907529830932617
Writing snps to VCF; start
Writing snps to VCF; done!
Elapsed time: 0.04240226745605469
Database building; start
[load] loading core-genome consensus sequence from a_few_asms/DRR090793_contigs_skesa.fasta
        the loaded core-genome has a consensus sequence of 4529549 bases

[load] loading key coordinates on core-genome from test_out/coords.tsv
        a total of 116 divisions was found

[load] loading core snps from test_out/core_snps.vcf
        a total of 789 core snps was found

[searching] start to search 31-mers
        a total of 23764 kmer records was found

[validating kmer set]: start

Error: the following returned non-zero status: 'callm_db_val -d test_out/nr_kmer_set.tsv -n 100000 -t 20 -L test_out/check_fna_paths.list -o test_out/kmer_prof.tsv ':

b"callm_db_val\ttest_out/nr_kmer_set.tsv\t20\nprogram reads a list of kmer pools for checking kmer uniqueness: test_out/check_fna_paths.list\nDB loading OK!\nterminate called recursively\nterminate called after throwing an instance of 'std::invalid_argument'\n"

zjshi commented 1 year ago

This is a similar bug, and it should be fixed now. Please reopen this thread if the issue persists. Thanks again for using Maast and reporting the bug!
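
The error in the report above is printed as an escaped Python byte string, which makes the output of `callm_db_val` hard to read. A minimal sketch for decoding it into plain lines follows. The helper name `readable_lines` is hypothetical and not part of Maast; the byte string is copied from the report.

```python
# Output of the failing callm_db_val step, copied verbatim from the report.
raw = (
    b"callm_db_val\ttest_out/nr_kmer_set.tsv\t20\n"
    b"program reads a list of kmer pools for checking kmer uniqueness: "
    b"test_out/check_fna_paths.list\n"
    b"DB loading OK!\n"
    b"terminate called recursively\n"
    b"terminate called after throwing an instance of 'std::invalid_argument'\n"
)


def readable_lines(captured: bytes) -> list[str]:
    """Decode captured process output and split it into lines.

    Hypothetical helper for illustration only; undecodable bytes are
    replaced rather than raising, so a damaged log can still be read.
    """
    text = captured.decode("utf-8", errors="replace")
    return text.splitlines()


lines = readable_lines(raw)
for line in lines:
    print(line)
```

Decoded this way, the message shows that the k-mer database loaded ("DB loading OK!") and that the process then terminated on an uncaught C++ `std::invalid_argument` exception. The log alone does not say which input triggered it.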