Closed BurntSushi closed 1 year ago
My rough prediction is that the pattern of an `alloc`-like facade crate is not likely to leave the mainstream of Rust `no_std` + heap development for general-purpose libraries. Why? Because even though it presently requires conditional imports, it doesn't mandate threading allocator type signatures through APIs or otherwise significantly impacting API design.
The details of implementing a global allocator that can be used with the global facade pattern are definitely in flux, but I've not seen dramatic change in how general-purpose libraries consume them recently.
As @BurntSushi rightly pointed out, the primary hassle here is maintaining all of those conditional-compilation flags.
I'll explore some more to see if there are any other tricks available for reducing their proliferation, and tear out `cfg-if` while I'm at it, if adding that macro is indeed too burdensome. Perhaps making heavier use of `extern crate std as core` could help. I'm definitely open to more ideas.
Some good news is that the approach speculated about in my prior comment seems to have paid off, as seen by comparing the first and second commits of the no_std PR to regex-syntax. The original approach added ~230 lines; the latest approach adds only ~100.
Removing the `cfg-if` dependency and applying `extern crate std as core;` allowed unconditional imports for items found in both std and core (e.g. `use core::mem`). Thus, the number of duplicate imports differing only in crate prefix went way down.
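A minimal sketch of that import pattern, assuming a Cargo feature named `std`. The helper `take_text` is hypothetical and is not taken from the PR; the point is only that `core::` paths resolve the same way whether or not `std` is enabled.

```rust
// Sketch of a crate root. A real library would also start with:
//     #![cfg_attr(not(feature = "std"), no_std)]
// The feature name "std" is an assumption for illustration.

// With "std" enabled, alias std under the name `core`, so that paths such
// as `core::mem` resolve identically in both configurations.
#[cfg(feature = "std")]
extern crate std as core;

// Heap-backed types come from the `alloc` facade in every configuration.
extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;

/// Hypothetical helper: moves the characters out of `buf` into an owned
/// `String`, leaving `buf` empty.
pub fn take_text(buf: &mut Vec<char>) -> String {
    // No cfg-gated duplicate is needed here: the same `core::` path is
    // valid with and without the "std" feature.
    let chars = core::mem::replace(buf, Vec::new());
    chars.into_iter().collect()
}
```

The only conditional line left is the `extern crate` alias itself; everything below it is written once.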
Nice. :) To clarify, I could probably stomach cfg-if itself. It is widely used and I trust its caretaker. But I try hard to be conservative here, especially with regex since it is so widely used. In my experience maintaining things, dependencies have generally become a liability. The only reason regex has as many dependencies as it does is because most of them would need to exist internally anyway, and it makes sense to expose them for others to benefit from.
For completeness, I asked around the portability working group about the state of viable alternatives, and was pointed in the direction of the following documents:
The described end-state of thorough and flexible capability-aware portability is extremely appealing. It would encourage a consistent approach to configuration across libraries and probably reduce the redundantly-retargeted-`use` clauses to near zero.
That said, the overall approach of using human-applied `cfg` flags to tag and track the suitability of various portions of the codebase for targeted use cases and manage imports seems like it would be largely similar.
The working group seems to be in "design and ground-work" phase, clearing out various obstacles but not yet tackling the primary implementation. Thus, at present, I find it difficult to estimate the timeline involved before the vision approaches realization.
After looking around a bit more, it seems like in some ways the community is voting with its commits.
Two other relatively high-profile projects in the ecosystem -- rand and nom -- appear to be moving forward with the "std"-as-default-feature, "alloc"-as-optional-feature approach.
This gives me some confidence that either the strategy has momentum, or the projected maintenance burden for that pattern is tolerable.
Perhaps @Geal or @pitdicker or @dhardy wouldn't mind commenting?
Not sure what the exact issue is to reply to, but I'll try writing something.
I think supporting `no_std` certainly added quite some trouble for Rand. And it is not great yet in my opinion, as important functionality is simply not available.
One thing that does help is that Rand doesn't really require allocations. Adding the `alloc` feature was easy, compared to error handling, having no thread-local storage (still unsolved), no easy OS interface, and no floating point math functions.
std
.std
feature depends on the alloc
feature, Cargo.toml
.std
and with/without alloc
.no_std
in Contributing.md
Thanks for all the references @ZackPierce; Aaron's "portability vision" article is why I suggested importing from `std` where possible in #477.
The size of #477 could probably be smaller still in my opinion, except some more changes may be needed in tests.
As @pitdicker mentions, as a result of this you end up with two test suites to run, i.e. `cargo test` and `cargo test --no-default-features` (plus `cargo test --benches` and maybe more).
CI doesn't need to be as complex as we have in Rand, though if you care about testing several different platforms you may want to take notes from Rand's Travis configuration.
@ZackPierce recently added `no_std` + `alloc` support in https://github.com/AltSysrq/proptest/pull/48.
However, the crate depends on `regex-syntax` for some of its nice features. But since `regex-syntax` doesn't work without `std`, those features can't be used with `no_std + alloc` for proptest.
I'd like to replicate Zack's work in `regex-syntax` specifically. A preliminary analysis tells me that most of that crate's imports only use `core`, and so the changes won't need to be extensive.
I'll start working on a PR to this end. :)
@Centril Thanks for the interest. There is already a PR doing this for regex-syntax. https://github.com/rust-lang/regex/pull/477
I've been hoping to get the chance to incorporate the latest recommendations for improvement of that PR. With luck, that'll happen tonight.
Thank you for chiming in with your experiences and suggestions, @dhardy, @pitdicker, and @Centril.
As of the latest updates based on those ideas, the exemplar PR to the regex-syntax crate now changes roughly 25% as much of the existing, operational code as the first attempt did. That seems like a pretty respectable reduction in the cognitive load of maintenance to me.
One other thing I thought about here is the regex crate's use of the `thread_local` crate for cheap, synchronized, dynamic thread locals for caching regex match data. Making `thread_local` support a `no_std` mode (that is perhaps slower) would probably be the best path, but that looks non-trivial to me, and also fairly subtle. Another approach might be to figure out a simple alternative in the regex crate itself, even if it is not as optimal as what `thread_local` does.
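To make the "simple alternative" idea concrete, here is a rough sketch of a single-slot cache that uses only `core` and `alloc`. It is not what regex or `thread_local` does; the type `Slot` and its methods are hypothetical, and the sketch assumes a target that provides pointer-sized atomics. A caller that finds the slot empty simply builds a fresh value, so the design trades cache hits under contention for simplicity.

```rust
extern crate alloc;

use alloc::boxed::Box;

/// Hypothetical single-slot cache: holds at most one spare boxed value.
pub struct Slot<T> {
    ptr: core::sync::atomic::AtomicPtr<T>,
    // Opt out of the automatic Send/Sync that AtomicPtr would grant;
    // the bounded impls below restore them only when T: Send.
    _marker: core::marker::PhantomData<*mut T>,
}

// Values move between threads through the slot, so T must be Send.
unsafe impl<T: Send> Send for Slot<T> {}
unsafe impl<T: Send> Sync for Slot<T> {}

impl<T> Slot<T> {
    pub fn new() -> Self {
        Slot {
            ptr: core::sync::atomic::AtomicPtr::new(core::ptr::null_mut()),
            _marker: core::marker::PhantomData,
        }
    }

    /// Takes the cached value if one is present, otherwise builds a new one.
    pub fn get_or(&self, create: impl FnOnce() -> T) -> Box<T> {
        let p = self
            .ptr
            .swap(core::ptr::null_mut(), core::sync::atomic::Ordering::AcqRel);
        if p.is_null() {
            Box::new(create())
        } else {
            // SAFETY: non-null pointers in the slot always come from
            // Box::into_raw in `put`, and the swap gave us sole ownership.
            unsafe { Box::from_raw(p) }
        }
    }

    /// Returns a value to the slot, dropping any value already stored.
    pub fn put(&self, value: Box<T>) {
        let old = self
            .ptr
            .swap(Box::into_raw(value), core::sync::atomic::Ordering::AcqRel);
        if !old.is_null() {
            // SAFETY: same provenance argument as in `get_or`.
            unsafe { drop(Box::from_raw(old)) }
        }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        let p = *self.ptr.get_mut();
        if !p.is_null() {
            // SAFETY: we have exclusive access and the pointer came from `put`.
            unsafe { drop(Box::from_raw(p)) }
        }
    }
}
```

A matcher would call `get_or` before a search and `put` afterwards. Unlike per-thread storage, concurrent searches share one slot, so most of them would allocate fresh scratch space.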
If anyone has specific application oriented use cases for `regex` in `no_std` w/ `alloc`, now would be a good time to elaborate on them here: https://github.com/rust-lang/rfcs/pull/2480#issuecomment-401930667
(I don't have any application oriented use cases myself. My goal here was to service others.)
(Idk if it counts as an application oriented use case or not, but `proptest` could benefit from making `regex-syntax` dependent functionality available to `no_std` + `alloc` users)
@Centril I think I would just ask you to push the question forward, since `proptest` is itself a library. Who are the people specifically benefiting from `proptest` in `no_std` + `alloc` environments? What are their use cases? Constraints?
@BurntSushi I redirect to @ZackPierce since they introduced the `no_std` + `alloc` support to `proptest` :)
Some relevant discussion: https://github.com/AltSysrq/proptest/issues/47
@BurntSushi
We use Rust to write Windows kernel drivers in our product. We want to use `regex` to construct our string match rules. `RegexSet` is what we want to use.
Sorry, but that seems off topic for this issue? I don't understand what question, if any, you're asking me. If you need help with something, then please open a new issue and provide as much detail as possible about the problem you're trying to solve and your constraints.
Sorry, I ought to have been clearer.
I just described our use case for using `regex` in `no_std`. :smile:
@harryfei Sorry, but I'm going to need a lot more details than that. I'm not a Windows programmer, and certainly have no experience with Windows kernel driver development, so I don't understand what your constraints are.
Presumably the only relevant constraint is whether or not you have `alloc`?
In kernel driver development, we must use the `no_std` feature. Because there are no OS syscalls as in user mode, many `std` functions can't be used (just like on embedded systems).
We can use the `regex` crate only if it supports the `no_std` feature.
@harryfei Thanks for elaborating. I think you'll want to monitor https://github.com/rust-lang/rfcs/pull/2480. Once it stabilizes, then this is something I'd be willing to more aggressively pursue.
@harryfei @BurntSushi the alloc crate got stabilized in the last release of rustc, so it might be worth pursuing this further as you mentioned before.
Yes I know. It will likely be a while before I look into it. regex has a conservative MSRV.
Would it not be possible to use `build.rs` to conditionally depend on `extern crate alloc;`?
Yes. `regex` already does that for things like SIMD. The key concern there is how complex it will make the code. If it's not crazy, then I'd definitely be up for conditionally enabling it.
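As a rough illustration of the `build.rs` idea, the sketch below probes the compiler version and emits a custom cfg when `alloc` is usable on stable (Rust 1.36 and later). The cfg name `has_alloc` and both function names are hypothetical, and this is not regex's actual build script.

```rust
use std::env;
use std::process::Command;

/// Extracts the minor version from `rustc --version` output,
/// e.g. "rustc 1.36.0 (<hash> <date>)" gives Some(36).
fn rustc_minor(version_output: &str) -> Option<u32> {
    let version = version_output.split_whitespace().nth(1)?;
    let mut parts = version.split('.');
    if parts.next()? != "1" {
        return None;
    }
    parts.next()?.parse().ok()
}

/// Intended to be called from `fn main` of a build.rs.
/// Emits `--cfg has_alloc` (a hypothetical name) on new enough compilers.
fn emit_alloc_cfg() {
    // Cargo sets RUSTC for build scripts; fall back to plain "rustc".
    let rustc = env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    let output = match Command::new(rustc).arg("--version").output() {
        Ok(output) => output,
        Err(_) => return, // Unknown compiler: leave the cfg unset.
    };
    let text = String::from_utf8_lossy(&output.stdout);
    // The alloc crate has been stable since Rust 1.36.
    if rustc_minor(&text).map_or(false, |minor| minor >= 36) {
        println!("cargo:rustc-cfg=has_alloc");
    }
}
```

The library would then write `#[cfg(has_alloc)] extern crate alloc;` and gate the alloc-only code paths on the same cfg, which is where the complexity concern raised above comes in.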
There has been some interest in putting out a version of regex that doesn't depend on std itself, but instead depends just on alloc. This is within reach because regex already doesn't rely too much on platform specific details, and mostly just depends on dynamic memory allocation. There are however some parts of the regex API that will need to be tweaked. For example, regex uses `std::io::Error`, which isn't available in `alloc`. This is why regex 1.0 got a `use_std`/`std` feature. Namely, compiling regex without that feature fails today. This will allow us to change the semantics of that compilation mode without breaking backwards compatibility.
@ZackPierce has been diligently adding support for this by starting with regex's dependencies. So far:
I thought it would be good to track this issue at a higher level so we can discuss a game plan. I'd also like to share some of my thoughts/constraints on the process.
I basically think that we should do this. What I am unsure of is the timeline. For the most part, my own personal maintenance bandwidth is very limited, and to this end, I've generally avoided nightly-only support of things. (I've made some exceptions. Support for things like `Pattern` happened before I knew better, and support for SIMD happened because I am very excited about it and got involved with the SIMD stabilization effort.) Namely, I cannot and will not be beholden to nightly breakages because I simply can't keep up. To that end, I would like to know more about what the plan is for `no_std` environments. Is the `alloc` crate setup generally where we think we're going? If so, and if it remains relatively stable, I think I could get on board with this relatively soon.
It's also worth saying that some changes are simpler than others. For example, making `utf8-ranges` compatible with `no_std` is pretty reasonable, but the `aho-corasick` changes are quite a bit broader. I have concerns over peppering conditional compilation everywhere, and I think those sorts of things are very hard to maintain. I'm hopeful we can find a better way. This gets worse when the public API is impacted. The complexity and maintenance burden goes way up. This is apparently so bad that it was worth adding a new dependency (`cfg-if`) that must be paid for by everyone just to support the `no_std` users. I'm not especially excited about that, particularly if the trend continues.
For the most part, I really wasn't intending on tackling this feature until more stuff stabilized. But I wanted to get my thoughts out there so that there are no surprises.
I welcome other thoughts on the matter!
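For readers wondering what the std-only error problem mentioned in this description looks like in practice, here is a minimal sketch. It assumes the std-only item is the `std::error::Error` trait (which lived only in std at the time of this thread) and a Cargo feature named `std`; the type `ParseError` is hypothetical and is not regex's error type.

```rust
/// Hypothetical error type that is usable without std.
#[derive(Debug, Clone, PartialEq)]
pub struct ParseError {
    pub offset: usize,
}

// Display lives in core, so this impl is available in every configuration.
impl core::fmt::Display for ParseError {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        write!(f, "parse error at offset {}", self.offset)
    }
}

// Only the std-dependent piece is gated, so the rest of the API keeps the
// same shape with and without the (assumed) "std" feature.
#[cfg(feature = "std")]
impl std::error::Error for ParseError {}
```

Keeping the gate on a single trait impl, rather than on the type itself, is one way to limit how far conditional compilation leaks into the public API.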