Open ctb opened 5 years ago
% python cocluster.py --first podar-ref/63.fa.sig --second podar-ref/63.fa.sig podar-ref/2.fa.sig -k 31 --cut-point=1.0
first list contains 1 files; second list contains 2 files.
... loading file 0 of 1 for first list
... loading file 1 of 2 for second list
ksize: 31 / moltype: DNA
downsampling to scaled value of 1000
first list contains 1 signatures; second list contains 2 signatures.
...comparing 3 signatures, all by all
0-NC_011663.1 She... [1. 1. 0.]
1-NC_011663.1 She... [1. 1. 0.]
2-CP001071.1 Akke... [0. 0. 1.]
min similarity in matrix: 0.000
** wrote coclust dendrogram to sourmash.coclust.dendro.pdf
cluster 2 is 1 in size
CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome
cluster 1 is 2 in size
NC_011663.1 Shewanella baltica OS223, complete genome
NC_011663.1 Shewanella baltica OS223, complete genome
** wrote coclust assignments spreadsheet to sourmash.coclust.csv
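For anyone reading along, here is a rough sketch of how a similarity matrix like the one printed above could be turned into cluster assignments. This is not the code in cocluster.py. It is a plain single-linkage grouping in pure Python, and the function name `cluster_by_similarity` and the `min_similarity` threshold are hypothetical; the threshold is not the same thing as `--cut-point`. The names and matrix values are copied from the output above.

```python
def cluster_by_similarity(names, matrix, min_similarity):
    """Group items linked by pairwise similarity >= min_similarity.

    Hypothetical helper for illustration only (single linkage via
    union-find); not the cocluster.py implementation.
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair whose similarity meets the threshold.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] >= min_similarity:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Collect members under their root, smallest cluster first.
    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return sorted(clusters.values(), key=len)


# Names and similarity values taken from the run shown above.
names = [
    "NC_011663.1 Shewanella baltica OS223, complete genome",
    "NC_011663.1 Shewanella baltica OS223, complete genome",
    "CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome",
]
matrix = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# 0.5 is an arbitrary illustrative threshold, not the --cut-point value.
clusters = cluster_by_similarity(names, matrix, min_similarity=0.5)
for members in clusters:
    print(f"cluster is {len(members)} in size")
    for member in members:
        print(f"  {member}")
```

With this input the two identical Shewanella signatures end up together and Akkermansia is on its own, which matches the grouping in the output above.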
See also #1265 (the uniqify script), which I think is nice and simple.
This might make a good test case for a plugin (#1353).
The cocluster script may be useful for people comparing the output of binning.
See also #459.