RickyDaMa closed this issue 10 months ago
Okay, the crate is now a lot more polished and ready for use. However, I've come across an issue with integrating with norad specifically, mainly due to limitations in how you can specify features in `Cargo.toml`s
Background: `close_already` now supports multiple different backends, to avoid conflicting threadpools where you're already using something like `rayon` or an async runtime. These backends are chosen between with mutually exclusive `backend-*` features
I presume the `rayon` dependency/feature isn't enabled in this crate because `rayon` is relatively heavy as dependencies go, which is fair enough. Per `close_already`'s backend selection, when we are using `rayon`, I want to activate the `backend-rayon` feature, so it doesn't have a separate threadpool 'fighting' with `rayon`'s
That's all well and good, until we want to make `close_already` the default when `rayon` isn't enabled (given that's where the biggest gains are), but using a lighter backend (the SLoC quote from before was using the now-named `backend-threadpool`)
An attempt at this setup looks something like this:
```toml
[dependencies]
# -- snip --
rayon = { version = "1.3.0", optional = true }
# Note: no backend enabled here
close_already = { version = "0.3", default-features = false }

[features]
# By default, enable the backend-threadpool feature of close_already
default = ["close_already/backend-threadpool"]
# The "rayon" feature enables the dependency rayon and activates the backend-rayon feature
rayon = ["dep:rayon", "close_already/backend-rayon"]
```
The problem now is that building with the `rayon` feature has two backends enabled, resulting in `close_already` failing to compile. There is no way in the `Cargo.toml` to specify "deactivate `backend-threadpool` when the `rayon` feature is enabled". (Tangent: mutually exclusive features don't even exist at a `Cargo.toml` level, you have to detect them at compile time in source code and `compile_error!`, which is ugly and awkward)
Potential fixes:

- Have users set `no-default-features = true` when they want to use norad with the `rayon` feature
- Only offer `close_already` in either no-rayon or rayon mode, but not both. This however then requires `#[cfg]` flags throughout the code, as `close_already` isn't always present
- Always enable `backend-rayon` (at that point you could just remove the `rayon` feature and make it default, since `rayon` will be compiled in anyway)
- Change `close_already` somehow to resolve which backend you actually want if you enable multiple (how??)

Interested in thoughts, feelings, and emotions from people. If we want more reproducible benchmarks to make a decision based on `close_already`'s value proposition, I'm happy to get something put together for that and share it here
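For reference, the first fix (users opting out of default features themselves) would look something like this downstream; the version number is a placeholder, not a real release:

```toml
# Hypothetical downstream Cargo.toml: the user disables norad's default
# features (dropping backend-threadpool) and enables rayon explicitly.
[dependencies]
norad = { version = "*", default-features = false, features = ["rayon"] }
```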
Ran the numbers:
```
default (baseline):
    read & parse Roboto-Regular.ufo
        time: [223.82 ms 224.33 ms 224.83 ms]
    write Roboto-Regular.ufo
        time: [2.0756 s 2.0973 s 2.1211 s]

close_already (rayon):
    read & parse Roboto-Regular.ufo
        time: [231.55 ms 232.02 ms 232.50 ms]
    write Roboto-Regular.ufo
        time: [987.44 ms 1.0257 s 1.0691 s]

close_already (threadpool):
    read & parse Roboto-Regular.ufo
        time: [223.36 ms 224.00 ms 224.68 ms]
    write Roboto-Regular.ufo
        time: [975.15 ms 1.0152 s 1.0596 s]

---

rayon (baseline):
    read & parse Roboto-Regular.ufo
        time: [28.253 ms 28.368 ms 28.492 ms]
    write Roboto-Regular.ufo
        time: [867.16 ms 922.49 ms 985.35 ms]

close_already (rayon) + rayon:
    read & parse Roboto-Regular.ufo
        time: [28.643 ms 28.820 ms 29.010 ms]
    write Roboto-Regular.ufo
        time: [831.17 ms 871.48 ms 915.87 ms]

close_already (threadpool) + rayon:
    read & parse Roboto-Regular.ufo
        time: [28.395 ms 28.495 ms 28.603 ms]
    write Roboto-Regular.ufo
        time: [830.45 ms 870.99 ms 914.95 ms]
```
Takeaways: `close_already` roughly halves write times without `rayon`, still shaves a little off alongside it, and leaves read times unaffected 😎

Conclusion: I'll PR always-on `close_already` using `backend-threadpool` (the lightweight default), for writes only
So this is tricky, and it's a problem I've run into before.
Basically: cargo features are intended to be additive. That is to say, a feature is not intended to be used to switch between impls, and enabling a feature should never 'turn off' any existing code, only 'turn on' something.
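Feature unification is why this bites: Cargo takes the union of all features requested for a dependency anywhere in the graph, so no crate can "opt out" of a feature another crate turned on. A hypothetical sketch (crate names invented for illustration):

```toml
# Suppose crate-a depends on:
#   close_already = { version = "0.3", default-features = false, features = ["backend-threadpool"] }
# and crate-b depends on:
#   close_already = { version = "0.3", default-features = false, features = ["backend-rayon"] }
# A binary depending on both gets the UNION, as if it had written:
[dependencies]
close_already = { version = "0.3", default-features = false, features = [
    "backend-threadpool", # from crate-a
    "backend-rayon",      # from crate-b — both backends on, compilation fails
] }
```

Neither crate did anything wrong individually, which is why mutually exclusive features are discouraged.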
One possible solution would be to allow your crate to compile two backends, but have one of them just be unused, if rayon is available? We'll still compile it, but the generated code should at least be pruned during dead code analysis. (I believe that this is the conclusion you've reached anyway?)
There's also definitely an argument for just always enabling rayon in norad. I have historically tried to be judicious in my use of dependencies, and that is the rationale for having rayon behind a feature; there are patterns of using norad where you aren't relying on us to load everything, in which case rayon would be unnecessary, but maybe I could also just relax and leave it on, if that would be a good simplification.
> One possible solution would be to allow your crate to compile two backends, but have one of them just be unused, if rayon is available?
This would require a fairly intensive refactor of `close_already`, and I think would open up a new question: if multiple backends are selected, how do you know which one is the one to use?
> I believe that this is the conclusion you've reached anyway?
The conclusion I reached was that `backend-threadpool` didn't see a meaningful performance difference compared to `backend-rayon`, so I'm just using `backend-threadpool` everywhere since it is lightweight (see f2478e3f3d7654f8fe20bd4f2100bf8411b37b41)
> I have historically tried to be judicious in my use of dependencies, and that is the rationale for having rayon behind a feature; there are patterns of using norad where you aren't relying on us to load everything
Yeah, this is 100% valid and I approach my own Rust projects the same way: aggressively minimise dependencies and their features wherever possible. I think whether `rayon` should be default is up for debate (but outside the scope of this issue); having it be optional is a good thing for more streamlined use cases
Full disclosure, this is basically an advert for a little crate I wrote over the weekend: `close_already`. In short, by using my crate (under 1000 SLoC including the two transitive dependencies\*) I was able to speed up norad (without rayon) writing a big UFO to disk by 67%, and a further ~10% when the rayon feature was enabled (despite a suboptimal implementation with conflicting threadpools). You can read more about how this works in the crate's README
I still need to benchmark read performance for exact numbers, but there should be performance improvements there as well
I've made the library work as a no-op on other compile targets and tried to keep it simple & uninvasive to integrate, so I was wondering if norad would be interested in a PR to show what this looks like?

\* EDIT: when using the default `backend-threadpool`