Closed by furqan915 3 years ago
As I understand the output, you have outliers with a >= 0.9.
I set --max-a-dist 1 and it worked.
Are these bacteria from different species? That may be why you are getting such high accessory distances, which can make later parts of the clustering process fail.
If you want to proceed anyway, you can do as @SilasK says and change --max-a-dist to 1, or use --qc-filter continue to ignore all errors.
To check my understanding: does a = 0.9 mean that 90% of the genome is classified as accessory and the core represents only 10%?
I tried with only --qc-filter continue and it raises the same error, now with v2.2.
My mistake: --qc-filter only looks at the individual genomes and their sketches, not the distances.
a = 0.9 means that 90% of the accessory sequence (changes larger than the smallest k-mer size) is different, but it doesn't tell you about the proportion of core to accessory. Decreasing the lowest k-mer size can therefore give extra resolution on the accessory distance (we still need to update the docs on k-mer length choice, sorry this isn't all out there).
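To illustrate why a is read off separately from the core distance, here is a minimal sketch. It assumes the model described in the PopPUNK paper, where the probability that a k-mer of length k matches between two genomes is (1 - a)(1 - c)^k, so that on a log scale the slope gives the core distance c and the intercept gives the accessory distance a. The function name and the synthetic numbers are hypothetical, and the k range 17 to 29 in steps of 2 is only an assumption based on the --min-k 17 --k-step 2 options in the reported command; this is not PopPUNK's own code.

```python
import math


def fit_core_accessory(k_values, match_probs):
    """Least-squares fit of ln(p) = ln(1 - a) + k * ln(1 - c).

    Hypothetical helper for illustration only; returns (core, accessory).
    """
    ys = [math.log(p) for p in match_probs]
    n = len(k_values)
    mean_k = sum(k_values) / n
    mean_y = sum(ys) / n
    sxx = sum((k - mean_k) ** 2 for k in k_values)
    sxy = sum((k - mean_k) * (y - mean_y) for k, y in zip(k_values, ys))
    slope = sxy / sxx                      # estimates ln(1 - c)
    intercept = mean_y - slope * mean_k    # estimates ln(1 - a)
    core = 1.0 - math.exp(slope)
    accessory = 1.0 - math.exp(intercept)
    return core, accessory


# Synthetic pair of genomes: 1% core divergence, accessory distance 0.9.
true_core, true_acc = 0.01, 0.9
ks = list(range(17, 30, 2))  # assumed k-mer lengths: 17, 19, ..., 29
probs = [(1 - true_acc) * (1 - true_core) ** k for k in ks]

core, acc = fit_core_accessory(ks, probs)
print(round(core, 4), round(acc, 4))
```

Because a comes from the intercept at k = 0, the fit has to extrapolate from the smallest k-mer length down to zero, which is consistent with the point above that a lower minimum k can give extra resolution on the accessory distance.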
Hi, thanks for such a quick response. No, these genomes are strains of the same species. I downloaded these complete genomes from NCBI to compare the various parameters.
I have tried --max-a-dist 1 but it could not resolve the issue. Is there any other solution you can suggest?
What output did you get when using --max-a-dist 1? Can you post the full command you used too?
Closing as --max-a-dist 1 should fix this, but please reopen (with the error message) if problems still arise.
Hi, I am using PopPUNK to study 27 genomes of a bacterium downloaded from NCBI. But whenever I run this command,
root@hon-pc:/home/fuan/monas/fna_combine# poppunk --easy-run --r-files reference_list.txt --output lm_example --threads 4 --plot-fit 5 --min-k 17 --k-step 2 --max-a-dist 0.8 --full-db
I have already tried the default --min-k 13 and --k-step 4.
I always get this error.
Please help me with this issue.