Greblica closed this issue 5 years ago
OK, I realized this does exactly what I think it should :). I was just confused by the use of the words "target" and "query". If the target is a cluster member and the query is the cluster representative, it makes perfect sense.
Yes, this is exactly the definition. I should write this more clearly in the documentation.
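To make the definition above concrete, here is a minimal sketch of a target-coverage check, assuming coverage is computed as the aligned span on the target divided by the target length. The function names and the 1-based inclusive coordinates are hypothetical and chosen for illustration; this is not the MMseqs2 implementation.

```python
def target_coverage(t_start: int, t_end: int, target_len: int) -> float:
    """Fraction of the target (cluster member) covered by the alignment.

    Coordinates are 1-based and inclusive (an assumption for this sketch).
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    return (t_end - t_start + 1) / target_len


def passes_target_coverage(t_start: int, t_end: int, target_len: int,
                           min_cov: float = 0.9) -> bool:
    """Return True if the alignment covers at least min_cov of the target."""
    return target_coverage(t_start, t_end, target_len) >= min_cov


# A short member (100 residues) aligned over residues 1..95 of its own length:
# it is almost fully covered, so it can join the cluster of a longer representative.
short_member_ok = passes_target_coverage(1, 95, 100)

# A long member (500 residues) aligned only over residues 1..100:
# most of it is unaligned, so it fails the coverage criterion.
long_member_ok = passes_target_coverage(1, 100, 500)
```

Under this reading, a much shorter but similar sequence still lands in the cluster as long as it is the member (target), while a long sequence is not absorbed by a representative that matches only a small part of it.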
I have a question regarding the choice of linclust parameters in the Steinegger et al. 2019 paper.
These are the parameters:
--kmer-per-seq 80 --cluster-mode 2 **--cov-mode 1 -c 0.9** --min-seq-id from 50% to 90%
My question concerns the part in bold:
If I understand correctly, it means that the query must cover at least 90% of the target for the target to be listed in the cluster. Is that so?
If that is the case, could you explain the rationale behind this? It seems more intuitive to me that it is the query that should fit into the target. Even if the query is much shorter but still similar, I would expect it to be in the same cluster (or should it not be?).
Thanks a lot for your clarification,
G