frewsxcv / rust-crates-index

Rust library for retrieving and interacting with the crates.io index
https://docs.rs/crates-index/
Apache License 2.0

minor Names optimizations #145

Closed ToBinio closed 1 year ago

ToBinio commented 1 year ago

This PR adds some small improvements to Names

Byron commented 1 year ago

Thanks a lot, great work!

Especially after improving the docs once more it's clear that there should be an example with Names::new(name).take(3) because it would be the shortest sequence of names with the highest probability of a hit, while all other permutations are likely to fail while incurring quite high costs.

You could even check the dataset and see how many crates there actually are that have inconsistent hyphens and underscores. If it's near none, I think the default-implementation of Names should change to list only three variants by default, while offering a flag to 'unlock' all other permutations - after all, those would be more of a footgun for performance.

ToBinio commented 1 year ago

> You could even check the dataset and see how many crates there actually are that have inconsistent hyphens and underscores. If it's near none, I think the default-implementation of Names should change to list only three variants by default, while offering a flag to 'unlock' all other permutations - after all, those would be more of a footgun for performance.

With a simple check, I found 669, so there are some...

But almost 100_000 of the 120_000 crates have either 0 or 1 separators, and only around 5_000 have more than 2, so I am not sure if it's worth it.

Random fun fact I discovered: almost 200 of these are google-* ones (which I believe are mostly owned by you 😂).
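
For readers who want to reproduce this kind of tally, the sketch below shows one way to classify a crate name. The exact check used for the 669 figure is not stated in the thread, so `has_mixed_separators` is only an assumed reading of "inconsistent" (the name contains both a hyphen and an underscore), and all three function names are hypothetical. The sketch also shows why separator count matters: a name with n separators has 2^n possible spellings.

```rust
/// Number of '-' and '_' characters in a crate name.
fn separator_count(name: &str) -> u32 {
    name.chars().filter(|c| *c == '-' || *c == '_').count() as u32
}

/// Number of distinct hyphen/underscore spellings: 2^n for n separators.
/// Saturates instead of overflowing for absurdly long inputs.
fn spelling_count(name: &str) -> u64 {
    2u64.saturating_pow(separator_count(name))
}

/// Assumed meaning of "inconsistent": the name mixes both separator kinds.
fn has_mixed_separators(name: &str) -> bool {
    name.contains('-') && name.contains('_')
}
```

A name with 0 or 1 separators has at most 2 spellings, so the first three variants already cover it completely; the exponential tail only shows up for names with several separators.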

Byron commented 1 year ago

> almost 200 of these are google-* ones. (which I believe are mostly owned by you 😂)

I own a lot of hyphens I guess ;).

Fair enough! What about an example that just happens to use take(3) to give users some ideas? I know, I know, I am soliciting contributions, apologies 😅.

ToBinio commented 1 year ago

Would you still prefer a doc-test? Or what do you think about adding a small tool to the examples that asks for user input and checks whether it has the crate locally (sparse, then git)? If not, it fetches via sparse (with .take(3)) and prints the crate and its versions, or something like that.
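
The flow proposed here could be sketched roughly as follows. Everything in this block is hypothetical: `resolve` and its three closure parameters stand in for real index lookups and are not crates-index APIs, and checking the local indices with the exact name only is an assumption, since the comment does not say which spellings the local check would try.

```rust
/// Hypothetical lookup flow: local sparse index, then local git index,
/// then a remote sparse fetch limited to the three likeliest spellings.
/// Returns the spelling that matched together with the looked-up value.
fn resolve<T>(
    name: &str,
    local_sparse: impl Fn(&str) -> Option<T>,
    local_git: impl Fn(&str) -> Option<T>,
    fetch_sparse: impl Fn(&str) -> Option<T>,
) -> Option<(String, T)> {
    // Assumption: local indices are checked with the name exactly as typed.
    if let Some(found) = local_sparse(name).or_else(|| local_git(name)) {
        return Some((name.to_string(), found));
    }

    // Same idea as `.take(3)`: as given, all hyphens, all underscores.
    let candidates = [
        name.to_string(),
        name.replace('_', "-"),
        name.replace('-', "_"),
    ];
    for (i, candidate) in candidates.iter().enumerate() {
        // Skip spellings that were already tried to avoid repeated requests.
        if candidates[..i].contains(candidate) {
            continue;
        }
        if let Some(found) = fetch_sparse(candidate) {
            return Some((candidate.clone(), found));
        }
    }
    None
}
```

In a real example program the closures would wrap the actual index types, and the returned value would carry the crate's versions for printing.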

Byron commented 1 year ago

Doc-tests are great for people wanting to use the, for now totally detached, Names type, but a more 'real-world' application-like example is certainly neat as well.

Maybe from the example, it's easy to extract a few bits and turn them into a doc-test later as well. In the end, it's about what you prefer, too, more docs and examples are better either way :). Thank you.