kno10 / rust-kmedoids

k-Medoids clustering in Rust with the FasterPAM algorithm
GNU General Public License v3.0
21 stars 4 forks source link

Feature Gate the Algorithms #11

Closed abstractqqq closed 4 days ago

abstractqqq commented 6 days ago

Hi, it is great that the crate provides a lot of algorithms, but I am wondering if you can feature-gate the various algorithms?

E.g. kmedoids = {version = "0.5.2", features = ["FastPAM1", "PAM"]} to also include FastPAM1 and PAM algorithms.

And maybe keep FasterPAM, FasterMSC, DynMSC as defaults.

This will help reduce the size of crate inside other projects, and will help users find the right algorithms to use in most cases. In my experience, too many choices often lead to confusion.

For some info on Rust crate features, you can see here

I know that probably a lot of algorithms share some code, but still, I think it is better in terms of organization to feature-gate some algos. Please let me know if you plan to do so.. If not, I can try and help you do it.

kno10 commented 6 days ago

Are you sure that this makes a difference? They probably cannot be compiled without the type traits fixed, so it should not have any effect.

kno10 commented 5 days ago

But does this actually yield any improvement. The unused algorithm variants will not end up in a binary when they are unused. I assume the Rust compiler already is smart enough to recognize this, otherwise the linker will. If you compile your code with and without these feature gates, is there any size difference? How much does compilation time change? These feature gates may be "premature optimization" if you do not have measurable benefits.

abstractqqq commented 4 days ago

This is an old problem I encountered and people discussed it at various places and I find this most insightful: https://internals.rust-lang.org/t/unused-dependency-code-can-compiler-performance-be-improved-by-doing-less-work/16201

In short if I depend on kmedoids crate then without feature flags the entire crate will be compiled. Unused functions inside my lib will not be compiled.

In terms of actual improvements, this is small since the entire package is already small. Just sharing some of my experience and preferences here and it is totally up to you to decide

kno10 commented 4 days ago

As you mentioned already all these functions are really small. Plus, they cannot be compiled until the traits are determined, so I'm pretty sure that they are not fully compiled ahead of usage.