JLC2141 closed this issue 9 months ago
Here's some additional information: pp-sketchlib v2.1.1
Installations:
PopPUNK install:
conda create --name poppunk
conda activate poppunk
python3 -m pip install poppunk
pp-sketchlib install:
sudo apt install cmake gfortran libarmadillo-dev libeigen3-dev libopenblas-dev
pip3 install pp-sketchlib
Have you tried omitting the -o output option from the query step and redirecting standard output to a file instead?
sketchlib sketch -l files.txt -o database -s 1000 -k 15,30,3 --cpus 40
sketchlib query jaccard database --cpus 40 > distances.tab
Thank you
PopPUNK version: 2.6.0
I am attempting to re-create the poppunk_sketch Jaccard distance table shown in this previous issue: https://github.com/bacpop/PopPUNK/issues/167#issuecomment-843873526
However, poppunk_sketch is not available in my current version of PopPUNK. My current workaround is as follows:
sketchlib sketch -l files.txt -o database -s 1000 -k 15,30,3 --cpus 40
sketchlib query jaccard database -o dists --cpus 40
poppunk_extract_distances.py --distances dists --output distances.tab
In the output of poppunk_extract_distances.py, the "Core" and "Accessory" columns appear to hold the Jaccard distances for the first two k-mer lengths of the k-mer sequence (-k) passed to "sketchlib sketch".
Is there a simpler way to output a table of Jaccard distances per k-mer length?
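For readers unsure what "Jaccard distances per k-mer length" refers to, here is a minimal Python sketch. It assumes that -k 15,30,3 means k-mer lengths from 15 to 30 in steps of 3, and it computes the exact Jaccard index on full k-mer sets of two toy sequences. This is not how pp-sketchlib works internally (it estimates the value from sketches), it ignores reverse complements, and the function names and sequences are hypothetical.

```python
def kmer_set(seq, k):
    """Return the set of all substrings of length k in seq (forward strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_per_k(seq_a, seq_b, k_min=15, k_max=30, k_step=3):
    """Exact Jaccard index between the k-mer sets of two sequences, for each k.

    The defaults mirror the assumed meaning of -k 15,30,3 (min, max, step).
    """
    result = {}
    for k in range(k_min, k_max + 1, k_step):
        a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
        union = a | b
        # |A ∩ B| / |A ∪ B|; defined as 0.0 if both sets are empty
        result[k] = len(a & b) / len(union) if union else 0.0
    return result


if __name__ == "__main__":
    # Two toy sequences (made up for illustration) differing by one substitution
    ref = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCCGATTACAGGCTTAACGGATCGTACGATCG"
    qry = ref[:30] + "T" + ref[31:]
    print("k\tjaccard")
    for k, value in jaccard_per_k(ref, qry).items():
        print(f"{k}\t{value:.3f}")
```

One value is produced per k-mer length, which corresponds to one column per k in a per-k-mer table; 1 minus the index gives the corresponding distance.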