rust-lang / regex

An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
https://docs.rs/regex
Apache License 2.0

regex-lite: make the std feature optional #1122

Open d3zd3z opened 11 months ago

d3zd3z commented 11 months ago

regex-lite has a std feature, but it doesn't compile

I was considering regex-lite for an embedded system. However, trying to compile with std not enabled triggers a compile_error!("'std' is currently a required feature, please file an issue");

BurntSushi commented 11 months ago

Yes... That was intentional. That way, it can be made optional in a semver compatible release. If it didn't have a std feature, then someone could use default-features = false prior to its addition, and then get broken after its addition (since missing std usually corresponds to fewer APIs, such as std::error::Error impls).
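
To illustrate why dropping std usually means fewer APIs, here is a minimal sketch of an error type whose std::error::Error impl is gated on a Cargo feature. The type name PatternError and its constructor are hypothetical and not taken from regex-lite; the sketch only assumes the crate declares a feature named std.

```rust
use core::fmt;

/// Hypothetical error type for illustration; not regex-lite's actual type.
#[derive(Debug)]
pub struct PatternError {
    msg: &'static str,
}

impl PatternError {
    pub fn new(msg: &'static str) -> Self {
        PatternError { msg }
    }
}

// `Display` lives in `core`, so this impl exists with or without `std`.
impl fmt::Display for PatternError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.msg)
    }
}

// Assumption: the crate has a Cargo feature called `std`. This impl is
// compiled only when that feature is enabled, so callers who disable
// default features lose it.
#[cfg(feature = "std")]
impl std::error::Error for PatternError {}
```

A caller who relied on the Error impl would stop compiling once std is disabled, which is why making the feature optional only later could have broken existing default-features = false users.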

I don't think I personally will work on this any time soon, but patches are welcome.

> I was considering regex-lite in an embedded system.

Note that regex-lite will still require alloc. It can't be core-only.

d3zd3z commented 11 months ago

I did get it to build alloc-only, but it wasn't particularly clean. Basically, I replaced Arc with Rc, removed the Send+Sync constraint on CachePoolFn, and instead of the Mutex in pool.rs, I made a fake one out of a RefCell. So, it does work, but ultimately, I think it is just going to be too big for this kind of environment. (The heap was using about half of my available RAM.)
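
The RefCell workaround described above might look roughly like the following. This is a sketch under assumptions, not the actual patch: FakeMutex and this toy Pool are hypothetical names, and the real pool.rs in regex-lite is structured differently. Because RefCell is not Sync, this only makes sense on a single-threaded target.

```rust
use core::cell::{RefCell, RefMut};

/// Hypothetical single-threaded stand-in for a mutex. It is not `Sync`,
/// so it cannot be shared across threads.
pub struct FakeMutex<T> {
    inner: RefCell<T>,
}

impl<T> FakeMutex<T> {
    pub const fn new(value: T) -> Self {
        FakeMutex { inner: RefCell::new(value) }
    }

    /// Mirrors the shape of `Mutex::lock`, but panics on a re-entrant
    /// borrow instead of blocking.
    pub fn lock(&self) -> RefMut<'_, T> {
        self.inner.borrow_mut()
    }
}

/// Toy pool of reusable values guarded by the fake mutex. In a real
/// no_std build, `Vec` would come from `alloc::vec::Vec`.
pub struct Pool<T> {
    stack: FakeMutex<Vec<T>>,
    create: fn() -> T,
}

impl<T> Pool<T> {
    pub fn new(create: fn() -> T) -> Self {
        Pool { stack: FakeMutex::new(Vec::new()), create }
    }

    /// Reuse a pooled value if one exists, otherwise create a fresh one.
    pub fn get(&self) -> T {
        // Release the borrow before calling `create`.
        let popped = self.stack.lock().pop();
        popped.unwrap_or_else(self.create)
    }

    /// Return a value to the pool for later reuse.
    pub fn put(&self, value: T) {
        self.stack.lock().push(value);
    }
}
```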

It's amazing how small a few hundred KB of RAM seems these days.

BurntSushi commented 11 months ago

You shouldn't need to do anything with Arc since it's available in alloc: https://doc.rust-lang.org/alloc/sync/struct.Arc.html

The pool is indeed probably the trickiest part. We'll probably need to introduce unsafe and create a spin-lock or something. We'll likely want to copy the alloc-only implementation of a pool from regex-automata.
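
For a sense of what "introduce unsafe and create a spin-lock" could involve, here is a minimal sketch built only on core. It is not the regex-automata implementation; SpinMutex and SpinGuard are hypothetical names. It assumes the target supports compare-and-swap on AtomicBool, and it ignores concerns such as interrupt handlers or priority inversion that matter on real embedded systems.

```rust
use core::cell::UnsafeCell;
use core::hint;
use core::ops::{Deref, DerefMut};
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical spin lock that needs neither `std` nor an OS.
pub struct SpinMutex<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: all access to `value` goes through a guard, and only one guard
// can exist at a time because `locked` serializes acquisition.
unsafe impl<T: Send> Sync for SpinMutex<T> {}

/// Guard that releases the lock when dropped.
pub struct SpinGuard<'a, T> {
    lock: &'a SpinMutex<T>,
}

impl<T> SpinMutex<T> {
    pub const fn new(value: T) -> Self {
        SpinMutex { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Busy-wait until the flag flips from `false` to `true`.
    pub fn lock(&self) -> SpinGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            hint::spin_loop();
        }
        SpinGuard { lock: self }
    }
}

impl<T> Deref for SpinGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: holding the guard means we hold the lock.
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for SpinGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: holding the guard means we hold the lock exclusively.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for SpinGuard<'_, T> {
    fn drop(&mut self) {
        // Release the lock so the next `lock` call can succeed.
        self.lock.locked.store(false, Ordering::Release);
    }
}
```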

With that said, if you're really in that constrained of an environment, have you tried using regex-automata to build a DFA and serialize it to disk? Then you can deserialize it with regex-automata even in no-std no-alloc environments.

jorolf commented 8 months ago

Arc isn't always available in alloc if the platform doesn't support atomics natively: https://doc.rust-lang.org/alloc/sync/index.html

I'm currently working on a project for a RISC-V processor and I'm unable to build regex-automata exactly because of that. I think the portable-atomic crate could be used as a drop-in replacement for targets without atomics. Could that be added as an optional feature to this library?

BurntSushi commented 8 months ago

This is about regex-lite which has nothing to do with regex-automata. Please open a new issue.

BurntSushi commented 8 months ago

@jorolf And also, portable-atomic doesn't have an Arc. IMO, probably the right path is for someone to get Arc available on the target you care about. Otherwise, I don't see any of the crates in this repo working for you. One possible alternative is to use #[cfg(target_has_atomic = "ptr")] to detect when Arc isn't available, and when it isn't, use a Box<T> instead. You might end up paying more costs for cloning and what not, but it should work. However, I don't have any plans to work on that any time soon.
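
The cfg-based fallback mentioned above could be sketched as follows. This is only an illustration of the idea, not code from this repository: the alias Shared and the helper clone_shared are hypothetical. The T: Clone bound is there because cloning a Box<T> copies the whole value, which is the extra cloning cost referred to above.

```rust
extern crate alloc;

// Use `Arc` where the target has pointer-sized atomics...
#[cfg(target_has_atomic = "ptr")]
pub type Shared<T> = alloc::sync::Arc<T>;

// ...and fall back to `Box` where it does not.
#[cfg(not(target_has_atomic = "ptr"))]
pub type Shared<T> = alloc::boxed::Box<T>;

/// Clone a shared handle. With `Arc` this bumps a reference count; with
/// `Box` it deep-copies the value, hence the `T: Clone` bound.
pub fn clone_shared<T: Clone>(handle: &Shared<T>) -> Shared<T> {
    handle.clone()
}
```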

BurntSushi commented 8 months ago

And are you doing this on an embedded system? The regex-automata crate is a giant beast. You might have better luck with its no-std/no-alloc mode and compiling DFAs offline.

jorolf commented 8 months ago

Hmm nvm then. I was working on reusing parts of a library in a no-std environment but I didn't really check which dependencies are needed for my use case. Maybe I can make it work some other way.

Fwiw, there is a sub-crate called portable-atomic-util which adds some support for Arcs.

PS: I felt this issue was appropriate because regex-lite also depends on Arcs?

BurntSushi commented 8 months ago

> PS: I felt this issue was appropriate because regex-lite also depends on Arcs?

Possibly, but your initial comment was talking about regex-automata, which made it overall very confusing in an issue about regex-lite.

This section might help you: https://github.com/rust-lang/regex/blob/master/regex-cli/README.md#example-serialize-a-dfa

It shows how to use a DFA regex in no-std/no-alloc core-only mode.