tokio-rs / bytes

Utilities for working with bytes
MIT License
1.91k stars 288 forks source link

Support `atomic-polyfill` for targets without native atomics #573

Open iostat opened 2 years ago

iostat commented 2 years ago

Hi!

It's awesome that this crate is no-std (it's actually being used in some firmware!) -- but we've found that when compiling for architectures without native atomics support such as thumbv6m-none-eabi (i.e., Cortex-M0 such as in the Atmel ATSAMD21 and Raspberry Pi RP2040), well, we can't compile...

While the core::sync::atomic modules/types exist in those targets, they don't actually support compare_exchange/fetch_add/fetch_sub as are used by this crate.

The awesome folks working on Embassy actually have an atomic-polyfill crate for this exact scenario, this PR adds a feature gate to use it, allowing one to build this crate for even more no-std targets.

notgull commented 2 years ago

There is also a portable-atomic crate that implements atomic operations without using an external critical-section at all. IIRC either approach would work.

taiki-e commented 2 years ago

See https://github.com/tokio-rs/bytes/issues/461 for previous discussions.

iostat commented 2 years ago

Don't know how I missed that discussion! I see #467 uses portable-atomics -- honestly I see it as preferable to this PR since it seems to have more extensive test coverage too. Is there anything really blocking #467 from being merged?

Darksonn commented 2 years ago

Not sure if there are any blockers, but that PR is marked as a draft, so I haven't reviewed it.

iostat commented 2 years ago

@taiki-e is there any way I can help make #467 for review?

taiki-e commented 2 years ago

Is there anything really blocking https://github.com/tokio-rs/bytes/pull/467 from being merged?

The situation has changed since that PR was opened and I'm wondering about how to update it.

The main issue is how targets without CAS are handled with default features.

Perhaps it would be preferable to start with the third and then select the first or second in the future if necessary?

Dirbaio commented 1 year ago

On atomic-polyfill vs portable-atomic: The reason atomic-polyfill depends on critical-section is so that it can work in more targets.

portable-atomic can polyfill CAS only for single-core chips running in privileged mode, by disabling interrupts. If your chip is multicore (example: the Raspberry Pi Pico RP2040 is), or you're running on unprivileged mode, or due to real-time requirements you can't disable interrupts (example) then you can't use portable-atomic.

atomic-polyfill calls out to critical-section instead, which allows any crate to supply a correct implementation for the current target. The cortex-m crate provides a "disable interrupts" implementation (equivalent to what portable-atomic's, also sound only on single-core privileged mode), but other crates in the ecosystem have implementations that work for more targets. For example rp2040-hal provides one for the Raspberry Pi Pico RP2040, or nrf-softdevice provides one that disables only application interrupts in order to not interfere with the radio's hard realtime requirements.

It's not feasible to add support for every single piece of hardware out there to atomic-polyfill itself. critical-section was designed to solve this problem.

taiki-e commented 1 year ago

Yeah, the ability to support different platforms and environments is a big advantage of critical-section, and there was some discussion about supporting critical-section in portable-atomic before (https://github.com/taiki-e/portable-atomic/issues/26), which was postponed because critical-section 1 was not released at the time and there is an alternative. But now that critical-section 1 has been released, I think it would make sense to support it.

As for comparing atomic-polyfill and portable-atomic, we discussed it recently in https://github.com/taiki-e/portable-atomic/issues/39, but I don't think it is preferable for common libraries to be dependent on atomic-polyfill yet. (https://github.com/taiki-e/portable-atomic/issues/39#issuecomment-1249986810, https://github.com/taiki-e/portable-atomic/issues/39#issuecomment-1250016715).

Dirbaio commented 1 year ago

I don't agree with your argument on the use of Cargo features. The argument is: with Cargo features any library could incorrectly enable it, leading to unsoundness in the final binary, while --cfg's enforces it can only be enabled by the end user. However:

Also note that neither atomic-polyfill or critical-section mandate the use of Cargo features over --cfgs for this. The choice is up to the crates providing the impls. For example cortex-m could have opted to require a --cfg instead of a Cargo feature to set the "disable all interrupts" impl, it's just that it opted not to.

taiki-e commented 1 year ago
  • Cargo features are the standard way to configure crates at compile-time. --cfg's are very nonstandard.

If the cargo feature is enabled somewhere in the dependency tree, it will be enabled in the entire dependency tree.[^1] This property is not a problem for safe, stable, and additive features, and since most features are safe and stable, it makes sense that cargo features are used as the standard way.

However, due to this property, in my opinion, features that may affect soundness and stability should be cfg, not cargo features. This is also the approach actually adopted by some of the very popular crates in the ecosystem. For example:

https://docs.rs/time/0.3.4/time/#feature-flags:

One pseudo-feature flag that is only available to end users is the unsound_local_offset cfg. As the name indicates, using the feature is unsound, and may cause unexpected segmentation faults. Unlike other flags, this is deliberately only available to end users; this is to ensure that a user doesn’t have unsound behavior without knowing it. To enable this behavior, you must use RUSTFLAGS="--cfg unsound_local_offset" cargo build or similar.

https://docs.rs/proc-macro2/1.0.45/proc_macro2/index.html#unstable-features:

Note that this must not only be done for your crate, but for any crate that depends on your crate. This infectious nature is intentional, as it serves as a reminder that you are outside of the normal semver guarantees.


  • There are use cases where it is indeed correct and desirable for a lib crate to enable such a feature. For example, Board Support Crates (BSPs), or RTOS support crates, that do have complete information on what the target environment is. For example the Raspberry Pi Pico BSP enables the rp2040-specific critical-section impl by default.

If a crate is platform-specific, I think it is reasonable to enable the feature by default, since it is clear that the user intends the code to be platform-specific when depending on that crate.


The critical-section readme is very explicit on what libs are/aren't supposed to do

That section says:

Do not add any dependency supplying a critical section implementation. Do not enable any critical-section-* Cargo feature. This has to be done by the end user, enabling the correct implementation for their target.

I often see such caveats in libraries with mutually exclusive features, but I have seen many cases where the library has enabled one of them...

In critical-section's case, I think direct users are aware of this, but I question whether users who indirectly depend on critical-section are aware of it.

For example, cortex-m provides the critical-section implementation as an optional feature, but it does not seem to inherit the caveat that the library should not enable it, and users may not be aware of that caveat unless they go read the critical-section documentation. (And it would be unrealistic to expect all libraries that provide critical-section implementations to inherit that caveat. And when using cfg, you can force that only the end user can enable the feature, regardless of whether or not the user has read and followed the indirect dependency documentation.)

[^1]: Understanding which features are finally enabled needs additional work. Also, this property is especially troublesome with v1 resolvers, where features can be integrated with other kind of dependencies, other targets, etc.

Dirbaio commented 1 year ago

IIUC, the time case is not quite the same, it's for enabling a feature that's always unsound and there's no possible fix available. And even then, the cfg solution seems odd to me, they could've moved the affected API to unsafe fn s instead. The standard Rust way to opt into unsoundness is unsafe{}.

The proc_macro2 case is completely different, it's for opting out of semver guarantees, not about soundness. That one makes sense IMO.

If a crate is platform-specific, I think it is reasonable to enable the feature by default, since it is clear that the user intends the code to be platform-specific when depending on that crate.

My point is, if enabling the critical-section impl in rp2040-hal required a cfg, the rpi-pico crate couldn't do it on behalf of the user. therefore using features leads to better usability than cfgs.

For example, cortex-m provides the critical-section implementation as an optional feature, but it does not seem to inherit the caveat that the library should not enable it,

It's implied by "you should only enable this if the target is single-core". If the lib doesn't know the target is single-core, it is a logic error in the lib's part to enable it. I agree the docs could be improved though.

Also, in practice, it's not likely lib authors end up enabling a CS impl without fully knowing the consequences to workaround build failures. This is because CS impls don't add any public API. A lib using critical-section (even indirectly) builds just fine with cargo check or cargo build without any CS impl. It's only when linking the final binary that you get a linker error.

Compare with portable_atomic_unsafe_assume_single_core in portable-atomic, which does add public API (CAS operations), so a lib author might be tempted to enable it to get cargo check --target thumbv6m-none-eabi to pass.

taiki-e commented 1 year ago

IIUC, the time case is not quite the same, it's for enabling a feature that's always unsound and there's no possible fix available.

The proc_macro2 case is completely different, it's for opting out of semver guarantees, not about soundness.

I gave one example of each of cfg affecting soundness and stability, and did not intend to compare them to ours. That said, I agree that both cases are different from ours.

And even then, the cfg solution seems odd to me, they could've moved the affected API to unsafe fn s instead. The standard Rust way to opt into unsoundness is unsafe{}.

I'm not sure if making the affected API unsafe would have solved the problem. unsafe API usually has a way to use the API in a sound way, but in local_offset's case, it seems that whether SIGSEGV occurs depends on linked c/c++ dependencies, etc., and it is doubtful that the direct caller of the API can guarantee safety.

This is because CS impls don't add any public API.

Even if the library's API did not have the CS API, I believe CS would always be considered a public dependency of the library, since updating the CS version or removing CS impls or dependency on CS could break downstream builds.[^1]

Also, in practice, it's not likely lib authors end up enabling a CS impl without fully knowing the consequences to workaround build failures. This is because CS impls don't add any public API. A lib using critical-section (even indirectly) builds just fine with cargo check or cargo build without any CS impl. It's only when linking the final binary that you get a linker error.

I ofen see cases where a feature used only in testing is declared in a dependency rather than a dev-dependency, so I would not be surprised if someone accidentally enables it in a library.

Compare with portable_atomic_unsafe_assume_single_core in portable-atomic, which does add public API (CAS operations), so a lib author might be tempted to enable it to get cargo check --target thumbv6m-none-eabi to pass.

When using portable-atomic, cfg is indeed required for builds on such targets, but there is no risk that how the cfg is set up will affect the downstream.

[^1]: In portable-atomic's case, removing dependency on the portable-atomic could break downstream builds, but the cfg interface is kept between versions, so updating the portable-atomic is designed to not break downstream builds unless the portable-atomic types are exposed in the library's API.

Dirbaio commented 1 year ago

Even if the library's API did not have the CS API, I believe CS would always be considered a public dependency of the library, since updating the CS version or removing CS impls or dependency on CS could break downstream builds.

This is orthogonal to the "feature vs cfg" decision.

I ofen see cases where a feature used only in testing is declared in a dependency rather than a dev-dependency, so I would not be surprised if someone accidentally enables it in a library.

The readme says pretty clearly what one should do in that case. Also, it's very unlikely a lib author could do that mistake. For example, enabling critical-section/std if you want your lib to work on nostd, or adding a dependency on cortex-m if the lib is supposed to be cross-target are very very obviously wrong.

When using portable-atomic, cfg is indeed required for builds on such targets, but there is no risk that how the cfg is set up will affect the downstream.

My argument was: if enabling the "dangerous" feature adds methods/structs to the public API it's more likely that lib authors might be tempted to enable it. Critical section impls add no methods/structs to the public API, so it's much less likely that lib authors will be tempted to.

...

anyway, my opinion overall is the argument in favor of cfg's is a very theoretical "lib authors might make a mistake", which a) the likelihood is very low IMO and b) lib authors can already make other mistakes that cause unsoundness anyway. while the argument against cfg's is they hurt usability A LOT (BSPs can't enable a CS impl, user has to set nonstandard flags, mess with .cargo/config.toml...), for EVERYONE, all the time, which is a much more practical and "real" concern. Therefore using cfgs is not worth it IMO.