Closed dineshkumarsrk closed 3 years ago
Densification is needed when not all of the sketch bins have observations. You'll see this when the genome length is on a similar order of magnitude to the sketch size. The algorithm is described here: https://dl.acm.org/doi/10.5555/3305890.3306007
It's not a problem, and expected with viral genomes.
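To make the idea concrete, here is a rough Python sketch of a binned sketch with empty bins and one simple way to fill them. It is only an illustration, not the code poppunk_sketch runs: the hash function, the modulo bin assignment, the donor-selection rule and the names hash64, bin_sketch and densify are all assumptions made for this example.

```python
import hashlib

def hash64(data: bytes) -> int:
    """Deterministic 64-bit hash (a stand-in for a real k-mer hash)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def bin_sketch(seq: str, k: int, n_bins: int):
    """Assign every k-mer hash to one bin and keep the minimum hash per bin."""
    bins = [None] * n_bins
    for i in range(len(seq) - k + 1):
        h = hash64(seq[i:i + k].encode())
        b = h % n_bins  # assumed bin assignment, for illustration only
        if bins[b] is None or h < bins[b]:
            bins[b] = h
    return bins

def densify(bins):
    """Fill each empty bin by copying from an originally non-empty bin.

    The donor bin is chosen by hashing (bin index, attempt), retrying
    until the donor is one that had an observation.
    """
    if all(v is None for v in bins):
        raise ValueError("cannot densify a sketch with no observations")
    n = len(bins)
    dense = list(bins)
    for j in range(n):
        attempt = 0
        while dense[j] is None:
            donor = hash64(f"{j}:{attempt}".encode()) % n
            dense[j] = bins[donor]  # stays None if the donor was empty too
            attempt += 1
    return dense

# A 40-base toy sequence has only 36 5-mers, so most of 100 bins stay empty.
seq = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGGATCCGATA"
bins = bin_sketch(seq, k=5, n_bins=100)
empty = sum(v is None for v in bins)
dense = densify(bins)
print(f"{empty} of {len(bins)} bins were empty before densification")
```

The toy sequence plays the role of a short viral genome: with far fewer k-mers than bins, many bins never receive an observation, and those are the bins that get filled.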
For the plots, I would recommend adding:
--plot-fit 10
to your command.
Closing due to no updates
PopPUNK 2.0.2, poppunk_sketch 1.7.0
I tried to create a database for viral genomes using the following command:
poppunk --create-db --threads 10 --output database20 --r-files list.txt --max-a-dist 1 --min-k 16 --sketch-size 10000
While running, it shows an unusual warning, as shown below:
NOTE: NTP898 required densification
Even so, it created database20 without the line fit plot and other files. I never faced this issue while working on bacterial genomes. I tried to google "required densification" but ended up with nothing. Kindly let me know the details of the above warning, and also help me to get the line fit plot.