Closed AroneyS closed 1 year ago
A few steps closer. Still need to add it to the CLI. It may also be faster to let skani handle the file I/O in bulk, like in https://github.com/bluenote-1577/skani-lib-example/blob/main/src/main.rs. But then we would need independent logic for skani/fastani, since skani would calculate all pairs up-front. Could add it to the initialise method for skani and then reference a look-up table in calculate_ani...
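A minimal sketch of the look-up table idea, for discussion only. It assumes genomes are referred to by index and that the bulk step hands back (index, index, ANI) triples. `PrecomputedAni` and its signatures are hypothetical and are not skani's or this repository's API; only the names initialise and calculate_ani come from the comment above.

```rust
use std::collections::HashMap;

/// Hypothetical cache of pairwise ANI values, filled once up front.
pub struct PrecomputedAni {
    // Keyed by (smaller index, larger index) so lookups ignore argument order.
    table: HashMap<(usize, usize), f32>,
}

impl PrecomputedAni {
    /// `all_pairs` stands in for whatever a bulk all-vs-all step would return
    /// (assumption: one `(genome_a, genome_b, ani)` triple per reported pair).
    pub fn initialise(all_pairs: impl IntoIterator<Item = (usize, usize, f32)>) -> Self {
        let table = all_pairs
            .into_iter()
            .map(|(a, b, ani)| (Self::key(a, b), ani))
            .collect();
        Self { table }
    }

    // Normalise a pair so that (a, b) and (b, a) share one entry.
    fn key(a: usize, b: usize) -> (usize, usize) {
        if a <= b { (a, b) } else { (b, a) }
    }

    /// Returns `None` when no ANI was recorded for the pair.
    pub fn calculate_ani(&self, a: usize, b: usize) -> Option<f32> {
        self.table.get(&Self::key(a, b)).copied()
    }
}
```

With this shape, calculate_ani stays a cheap map lookup, and a pair that produced no output simply comes back as `None`.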
Are you thinking per-disconnected component after the preclusterer? Or just in total? If the latter then no point in preclustering, I think.
Not sure. Both are possibilities, though skani doesn't recommend comparing genomes with <82% ANI, so we would have to deal with that if we skip preclustering, right? Though it says "If the resulting aligned fraction for the two genomes is < 15%, no output is given.", so maybe <82% just doesn't give an answer, rather than giving an unreliable answer.
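One way to handle the "no output" case, sketched under assumptions: a comparison below the 15% aligned fraction cut-off is treated as having no ANI at all, and a missing ANI never places two genomes in the same cluster. `Comparison`, `reported_ani` and `same_cluster` are hypothetical names for illustration, and the sketch says nothing about whether skani's values below 82% ANI are reliable.

```rust
/// Hypothetical result of one pairwise comparison (both values in percent).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Comparison {
    pub ani: f32,
    pub aligned_fraction: f32,
}

/// Cut-off taken from the quoted skani documentation sentence.
pub const MIN_ALIGNED_FRACTION: f32 = 15.0;

/// Mirror the "no output is given" behaviour: below the cut-off there is no ANI.
pub fn reported_ani(c: Comparison) -> Option<f32> {
    if c.aligned_fraction < MIN_ALIGNED_FRACTION {
        None
    } else {
        Some(c.ani)
    }
}

/// A missing ANI is treated as "not in the same cluster" (an assumption).
pub fn same_cluster(ani: Option<f32>, threshold: f32) -> bool {
    ani.map_or(false, |value| value >= threshold)
}
```

Treating a missing value as "not similar" would make distant pairs fall out naturally if preclustering were skipped, though whether that is the right policy is exactly the open question above.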
Also, I get this warning on compile: warning: the following packages contain code that will be rejected by a future version of Rust: buf_redux v0.8.4, partitions v0.2.4
--cluster-method isn't in the --full-help. Is there somewhere that I missed? Or do I have to rebuild docs?
I didn't go through every line, but seems about good. I think you need to add skani to the conda yml, and can you enable runs on PR using
on: [push, pull_request]
in the actions yml please?
--cluster-method isn't in the --full-help. Is there somewhere that I missed? Or do I have to rebuild docs?
You added that argument, so it won't show up until the docs are redeployed from main/release.
Add skani as fastani alternative
new method?)
find_representatives and find_memberships back into clusterer.rs
Clusterer through above functions so it needs only implement calculate_ani
get_threshold method?
calculate_skani
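The refactor described in the task list could look roughly like the following sketch: a trait where a backend supplies only calculate_ani and get_threshold, and find_representatives and find_memberships are shared default methods. The signatures, the index-based genome handles, the greedy selection and the `MatrixClusterer` toy backend are all assumptions for illustration, not the code in clusterer.rs.

```rust
/// Sketch of a shared clustering trait; backends implement two methods only.
pub trait Clusterer {
    /// ANI in percent for a pair of genome indices, or `None` if unavailable.
    fn calculate_ani(&self, a: usize, b: usize) -> Option<f32>;
    /// ANI threshold in percent at or above which genomes cluster together.
    fn get_threshold(&self) -> f32;

    /// Greedy pass (assumed algorithm): a genome becomes a representative
    /// when no earlier representative reaches the threshold against it.
    fn find_representatives(&self, n: usize) -> Vec<usize> {
        let mut reps: Vec<usize> = Vec::new();
        for g in 0..n {
            let covered = reps.iter().any(|&r| {
                self.calculate_ani(r, g)
                    .map_or(false, |ani| ani >= self.get_threshold())
            });
            if !covered {
                reps.push(g);
            }
        }
        reps
    }

    /// Assign each genome to the representative with the highest ANI that
    /// meets the threshold. Returns one member list per representative.
    fn find_memberships(&self, n: usize, reps: &[usize]) -> Vec<Vec<usize>> {
        let mut clusters = vec![Vec::new(); reps.len()];
        for g in 0..n {
            // A representative always belongs to its own cluster.
            if let Some(pos) = reps.iter().position(|&r| r == g) {
                clusters[pos].push(g);
                continue;
            }
            let best = reps
                .iter()
                .enumerate()
                .filter_map(|(i, &r)| self.calculate_ani(r, g).map(|ani| (i, ani)))
                .filter(|&(_, ani)| ani >= self.get_threshold())
                // Panics on NaN; acceptable for a sketch.
                .max_by(|x, y| x.1.partial_cmp(&y.1).unwrap());
            if let Some((i, _)) = best {
                clusters[i].push(g);
            }
        }
        clusters
    }
}

/// Toy backend for illustration: ANI values come from a fixed matrix.
pub struct MatrixClusterer {
    pub ani: Vec<Vec<f32>>,
    pub threshold: f32,
}

impl Clusterer for MatrixClusterer {
    fn calculate_ani(&self, a: usize, b: usize) -> Option<f32> {
        Some(self.ani[a][b])
    }
    fn get_threshold(&self) -> f32 {
        self.threshold
    }
}
```

A skani or fastani backend would then differ only in how calculate_ani obtains its value, which keeps the clustering logic in one place.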