cranelift-codegen no longer builds with no_std

IcyDefiance commented 4 years ago

I've tracked down a few things that are preventing this from happening.

packed_struct requires std by default
There are a few uses of std in cranelift-codegen and cranelift-codegen-shared that can be replaced with core or alloc.
thiserror requires std, because the Error trait doesn't exist in core or alloc.

The first two are trivial to fix, and I'll create a PR with those changes in a second.

However, there are 11 uses of the Display and Into<result::CodegenError> traits that prevent me from placing thiserror behind a feature gate, and I'm not sure what to do about them.

Here are the 11 things that rely on those traits, as reported by rustc (click to expand)

$ cargo build --no-default-features --features core Compiling cranelift-codegen v0.51.0 (/home/icydefiance/Documents/code/cranelift/cranelift-codegen) error[E0599]: no method named `to_string` found for type `verifier::VerifierError` in the current scope --> cranelift-codegen/src/print_errors.rs:218:36 | 218 | writeln!(w, "; error: {}", err.to_string())?; | ^^^^^^^^^ method not found in `verifier::VerifierError` | ::: cranelift-codegen/src/verifier/mod.rs:138:1 | 138 | pub struct VerifierError { | ------------------------ method `to_string` not found for this | = help: items from traits can only be used if the trait is implemented and in scope = note: the following trait defines an item `to_string`, perhaps you need to implement it: candidate bytecodealliance/cranelift#1: `alloc::string::ToString` error[E0599]: no method named `to_string` found for type `result::CodegenError` in the current scope --> cranelift-codegen/src/print_errors.rs:227:13 | 227 | err.to_string() | ^^^^^^^^^ method not found in `result::CodegenError` | ::: cranelift-codegen/src/result.rs:12:1 | 12 | pub enum CodegenError { | --------------------- method `to_string` not found for this | = help: items from traits can only be used if the trait is implemented and in scope = note: the following trait defines an item `to_string`, perhaps you need to implement it: candidate bytecodealliance/cranelift#1: `alloc::string::ToString` error[E0277]: `verifier::VerifierError` doesn't implement `core::fmt::Display` --> cranelift-codegen/src/verifier/mod.rs:235:33 | 235 | writeln!(f, "- {}", err)?; | ^^^ `verifier::VerifierError` cannot be formatted with the default formatter | = help: the trait `core::fmt::Display` is not implemented for `verifier::VerifierError` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead = note: required because of the requirements on the impl of `core::fmt::Display` for `&verifier::VerifierError` = note: required by `core::fmt::Display::fmt` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/context.rs:224:30 | 224 | self.verify(fisa)?; | ^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required by `core::convert::From::from` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/context.rs:244:39 | 244 | self.verify_locations(isa)?; | ^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required by `core::convert::From::from` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:111:28 | 111 | return Err(errors.into()); | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:139:28 | 139 | return Err(errors.into()); | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:168:28 | 168 | return Err(errors.into()); | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:196:28 | 196 | return Err(errors.into()); | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:239:28 | 239 | return Err(errors.into()); | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error[E0277]: the trait bound `result::CodegenError: core::convert::From` is not satisfied --> cranelift-codegen/src/regalloc/context.rs:248:17 | 248 | Err(errors.into()) | ^^^^^^^^^^^^^ the trait `core::convert::From` is not implemented for `result::CodegenError` | = note: required because of the requirements on the impl of `core::convert::Into` for `verifier::VerifierErrors` error: aborting due to 11 previous errors Some errors have detailed explanations: E0277, E0599. For more information about an error, try `rustc --explain E0277`. error: could not compile `cranelift-codegen`. To learn more, run the command again with --verbose.

Additional info:

The author of thiserror doesn't want to add a feature-gate for just the Error trait, but it's possible to provide another implementation of Display (and presumably Into). dtolnay/thiserror#43
There was some discussion of moving Error to alloc, and a PR was even created to do so, but it was closed because RFC 2504 will add backtrace functionality that requires std.

IcyDefiance commented 4 years ago

In addition to that PR, I started addressing point 3 on the no_std-thiserror branch on my fork. (diff) Building that branch will give you the same errors that I pasted in the above post.

IcyDefiance commented 4 years ago

I got cranelift-wasm and its dependencies (including cranelift-codegen) to compile with no_std in my fork. To do so, I had to fork thiserror and v0.48.2 of wasmparser (there have been breaking changes since then).

I also copied the extern crate alloc as std; line from cranelift-wasm to cranelift-codegen, and eliminated all direct use of alloc there. (It also reduced the changes I had to make to thiserror.) Unfortunately I don't think there's a way to do the same thing with core.

Considering the current lack of interest in supporting no_std, as expressed in my PR, I won't bother making any new PRs. However, they're good enough for the project I'm playing with, so they may also be useful to someone else, and they may be informative if there is any future interest in supporting no_std.

iximeow commented 4 years ago

I happen to know of at least one project where building cranelift-wasm as no_std is really helpful, so I just want to say I appreciate this work.

bnjbvr commented 4 years ago

Sorry we didn't get back to you earlier, things have been a bit hectic lately, and no-std support is not checked in automation, so it's definitely something we may break a lot. The main issue here was finding a proper core reviewer for your work; if there's someone who could chime in and review such changes, it would be definitely helpful. And then, we could reopen your PR here and in other places and take your changes, if you were so inclined.

bnjbvr commented 4 years ago

(Of course, that also depends on the thiserror changes being properly integrated upstream, which might be another hill to climb.)

IcyDefiance commented 4 years ago

By "lack of interest" I was referring to Alex's comment in my PR, but if that's not the case then I can make some new PRs later today. I'll start with thiserror and wasmparser and see how it goes.

bnjbvr commented 4 years ago

It's nice to be able to experiment on a smaller crate like wasmparser (since we're the same authors), to have a precise idea of what the code would look like in the end, what's the best strategy to implement this (use no-std-compat or something else). Once that's done, we agreed we could also port the whole Cranelift crate to no-std support, using the methods learned during the port of wasmparser. Hope this makes sense!

yurydelendik commented 4 years ago

It's nice to be able to experiment on a smaller crate like wasmparser

Opened https://github.com/bytecodealliance/wasmparser/issues/198 . I'm proposing not depend on structures that require memory allocations (this means no hashbrown )

... using the methods learned during the port of wasmparser

These ideas might not be applied here

Lucky4Luuk commented 5 months ago

Is this still being worked on? What are the current issues blocking no_std support?

alexcrichton commented 5 months ago

I'm at least not personally aware of this being worked on, but no blocker other than "just needs some elbow grease". If you'd be up for it PRs would be most welcome!

marceline-cramer commented 4 months ago

I made a pretty half-assed attempt at this but managed to dig up some useful info: a major blocker to getting this to work is getting several dependencies involved in std::io::Write-ing to objects to work on no_std. Here's a non-exhaustive list (i've been doing this for hours and i am exhausted enough):

the leb128 crate used by wasm-encoder has a hard dependency on std::io
the termcolor crate has a hard dependency on std::io
indexmap/std is enabled by wasmparser/std is enabled by wasmprinter (which has no features that could be used to disable the dep feature)
gimli/write has a hard dependency on std::io
object/write has a hard dependency on std::io

All of those issues come from the wasmtime-environ crate with the compile feature, which as far as I can tell, is pretty necessary to the functioning of wasmtime-cranelift. I don't know how necessary that is, but if it is absolutely crucial to have compile working, those crates need to be either factored out, guarded behind std, or in the worst case, modified upstream to not use std::io. Much bureaucracy ensues.

The really obnoxious part about removing std::io from those crates is that there is no equivalent io API in core or alloc, making ergonomic no_std a great endeavor per project.

Aside from wasmtime-environ, wasmtime-codegen/unwind has a similar breaking dependency on gimli/write, which uses std::io to write objects. Disabling the unwind feature also causes a cascade of missing API items in the wasmtime-codegen crate AND wasmtime-cranelift. Today, I did manage to get wasmtime-codegen compiling locally in no_std but because I hadn't gotten the unwind feature to work I had to give up.

thread_local also breaks in wasmtime-codegen because that's std-only. I managed to take care of that with nightly #[thread_local] for now but obviously that's not optimal. More research is needed.

OnceLock can be replaced with once_cell::OnceCell trivially, although that uses the scary-sounding critical-section feature. More research is needed.

I haven't tried to get thiserror working yet but I will.

I would really appreciate some guidance for what to work towards next. This is clearly going to take a LOT of work to get wasmtime-cranelift working on no_std consistently on all platforms but I'm pretty invested in it. Also, if I've made any mistakes in my probing (I'm a newcomer to the codebase), please let me know.

bjorn3 commented 4 months ago

#[thread_local] is generally unavailable on no_std targets too.

alexcrichton commented 4 months ago

My recommendation would be to first start off by gating "easy" things behind on-by-default Cargo features. For example wasmprinter doesn't need to be included in no_std builds (nor thread_local usage or termcolor). Basically anything that's optional functionality can be hard-disabled at compile time and that can greatly help to prune the dependency tree of crates that don't support no_std.

After that next bit that I would recommend is to do this incrementally. The main crate in question I believe is cranelift-codegen but there are other crates as well that it depends on. I'd recommend starting at the leaves (e.g. cranelift-entity) and adding a "ratchet" to add support for no_std to ensure they don't regress. For example on CI you'd add lines around here where that ensures that crates build on a no_std target. Eventually as you work your way up to cranelift-codegen you'll already have added support to a number of other crates and handled a number of these issues.

At that point I think it might be worth reevaluating taking stock of what's remaining. I think it would be useful to classify dependencies as you've done, but with a refreshed list at that point. Some of these are going to be somewhat hard like gimli/write and object/write but we might be able to work with upstream to have new crate versions that support no_std needs, but that also sort of depends on what exactly Cranelift needs. That's why I'd recommend doing everything except cranelift-codegen first if possible so we can get the most precise picture of what Cranelift needs from its dependencies with no_std.

marceline-cramer commented 3 months ago

I'm having a hard time with removing thread_local usage from cranelift/codegen/src/timing.rs. What alternative to thread_local statics should I use with no_std, if any? As far as I can tell, the timing module and those statics can't be configured out. One idea I have is to just configure out all the timing usage with API-compatible no-op stub functions on no_std, but I don't know enough about the codebase to understand the consequences of something like that.

I should also point out that the set_thread_profiler() function is not used anywhere in the codebase.

marceline-cramer commented 3 months ago

For now, I just put no-op stubs in the timing API when the timing feature is not enabled and made the timing feature enable the std feature automatically. thread_local! {} is still in use as long as the timing feature is enabled.

Right now, my development approach is to hack on all of these intersecting API changes and changes to the dependency trees on a fork, which I have set up to be running the full CI PR test suite: https://github.com/marceline-cramer/wasmtime/pull/1 Once that branch on my fork is to a place I'm happy with and all the pieces fit together practically as a rough sketch, I can start decomposing it into finer-grained, more polished things to implement as upstream-facing PRs. I did implement the "ratcheting" approach to get down to cranelift-codegen as quickly as I could and discovered all but one of its workspace dependencies already work fine in no_std. The one exception was cranelift-control, which to compile to no_std I slapped in a Git dep for a pending PR on arbitrary with no_std support: https://github.com/rust-fuzz/arbitrary/pull/177

I added a field to CodegenOptions in cranelift-isle to emit no_std-compatible code by adding a short prelude to generated source files, because most CI compiles ISLE with std but cranelift-codegen does not. If that's an acceptable change for upstream, I'd be happy to make a little PR to get that moving along.

The reason I had gotten as far with cranelift-codegen as to be hacking on thread_local is because getting it to compile to a no_std target turned out to be pretty easy. I mainly just swapped out all of the uses of rustc-hash with a combination of hashbrown and ahash and replaced any lingering uses of std with core and alloc. I also replaced any use of floating-point arithmetic in the instrinics code with the core_math crate, which just implements std-compatible arithmetic operators on core floats using libm. There may be some security concerns with regards to the use of ahash in a no_std environment but I have decided that taking a deeper look into that is Tomorrow Marceline's problem.

The main kicker with cranelift-codegen right now is that so far, I've only gotten all-arch and trace-log features, with no others, to compile, and only in nightly. The reason it has to be in nightly for now is because there are some cases of impl std::error::Error that I'm too afraid to cfg out, so for now I'm using nightly and core::error::Error until I can figure out what to do with potential no_std API users of cranelift-codegen. I figure that the already-discussed use of thiserror on no_std would force us to use nightly for now anyways.

That's about where I am now. The CI mostly passes, although Cranelift's unit tests fail with slightly different code output compared to the expected code. I can chalk that up to either an impatient use of cargo update that may have minorly changed some semantics of dependencies involved in codegen, or something related to the switching of rustc-hash to hashbrown. Both are easy to test.

The main issue now is that, by default, cranelift-codegen is configured to have std,unwind,trace-log enabled. trace-log works in no_std independently of std, but unwind is still blocked by that pesky gimli/write dep and its use of std::io. I can take a deeper look into how exactly that is used and what an alternative to that either in Cranelift or upstream Gimli might look like when I have more time and can afford to lock in on specific issues like that. I'm becoming a lot more comfortable navigating the codebase but there's still a lot of cognitive overhead and even swapping out dependencies takes a lot of attention.

bjorn3 commented 3 months ago

I should also point out that the set_thread_profiler() function is not used anywhere in the codebase.

It is used by cg_clif to include performance timings collected by Cranelift in rustc self profile profiles. Having thr timing module be a nop for no_std builds is fine.

bjorn3 commented 3 months ago

I added a field to CodegenOptions in cranelift-isle to emit no_std-compatible code by adding a short prelude to generated source files

Unconditionally using core instead of std should work fine. In std builds, core is still available.

alexcrichton commented 3 months ago

For now, I just put no-op stubs in the timing API when the timing feature is not enabled and made the timing feature enable the std feature automatically

Sounds good! I'd recommend gating set_thread_profiler behind this Cargo feature entirely (and having the feature enabled by default probably too). That would avoid the confusion of calling set_thread_profiler and having it do nothing.

The one exception was cranelift-control, which to compile to no_std I slapped in a Git dep for a pending PR on arbitrary with no_std support: https://github.com/rust-fuzz/arbitrary/pull/177

I'd probably recommend adding a fuzz feature or similar to avoid needing to update the arbitrary crate (although updating that is definitely ok too). Basically gating this functionality on a Cargo feature is fine. (this Cargo feature may even be able to be off-by-default too)

I also replaced any use of floating-point arithmetic in the instrinics code with the core_math crate, which just implements std-compatible arithmetic operators on core floats using libm. There may be some security concerns with regards to the use of ahash in a no_std environment but I have decided that taking a deeper look into that is Tomorrow Marceline's problem.

This sounds generally fine and possible for us to land. When making PRs if you wouldn't mind leaving cranelift-codegen's no_std support for a single standalone PR that'll make this easiest to review and discuss.

The main kicker with cranelift-codegen right now is that so far, I've only gotten all-arch and trace-log features

FWIW I think it's definitely ok to not get the entire crate and all its features compiling with no_std. Only getting a subset working (albeit we still need a useful subset, but the one you have is useful) is more than ok. Can always work to improve the subset supported later on.

The reason it has to be in nightly for now is because there are some cases of impl std::error::Error

I'm assuming that you're probably going to add a std feature to the cranelift-codegen crate no matter what, and would it be possible to gate these impls on the std feature? That is, "just" omit them for a no_std build?

The main issue now is that, by default, cranelift-codegen is configured to have std,unwind,trace-log enabled

In the spirit of ratcheting and progress over time I'll reiterate it's completely ok to get only some features working and not others. In this workspace most dependencies disable default features and then individual crates will enable features in addition to specifying workspace = true. That would enable a state where cranelift-codegen itself builds with no_std but wasmtime-cranelift doesn't work just yet. That'd help figure out all the dependencies as we go along and get to a working state at least for one crate while the blockers of the next crate are tackled.

One example here is that the unwind feature is only used in Wasmtime for Config::native_unwind_info which would be easy to gate behind a Cargo feature. In that sense there's no need to unconditionally enable unwind in Cranelift given possible future changes.

If your target is wasmtime-cranelift the bigger problem is probably going to be object::write as that's pretty intrinsically depended on. It's probably best to cross that bridge when we get there though and it's the "only" remaining issue. If your target is just cranelift-codegen on no_std then you can disregard this last part from me :)

marceline-cramer commented 3 months ago

I added a field to CodegenOptions in cranelift-isle to emit no_std-compatible code by adding a short prelude to generated source files

Unconditionally using core instead of std should work fine. In std builds, core is still available.

It would work fine if not for the impl Length for Vec in the codegen. In no_std, that would have to be replaced with the alloc crate, and no_std or not, the alloc crate has to be manually declared extern crate alloc;.

marceline-cramer commented 3 months ago

Sounds good! I'd recommend gating set_thread_profiler behind this Cargo feature entirely (and having the feature enabled by default probably too). That would avoid the confusion of calling set_thread_profiler and having it do nothing.

Sounds great, I'll remember that when I open the cranelift-codegen no_std PR.

I'd probably recommend adding a fuzz feature or similar to avoid needing to update the arbitrary crate (although updating that is definitely ok too). Basically gating this functionality on a Cargo feature is fine. (this Cargo feature may even be able to be off-by-default too)

Not including the arbitrary crate breaks ControlPlane::arbitrary() and ControlPlane::get_arbitrary(), so I've made it on by default so that none of the dependencies need to manually enable it. We can discuss that more in the cranelift-control PR I'm about to open. :)

This sounds generally fine and possible for us to land. When making PRs if you wouldn't mind leaving cranelift-codegen's no_std support for a single standalone PR that'll make this easiest to review and discuss.

FWIW I think it's definitely ok to not get the entire crate and all its features compiling with no_std. Only getting a subset working (albeit we still need a useful subset, but the one you have is useful) is more than ok. Can always work to improve the subset supported later on.

I'm assuming that you're probably going to add a std feature to the cranelift-codegen crate no matter what, and would it be possible to gate these impls on the std feature? That is, "just" omit them for a no_std build?

Yeah, I suppose that is definitely the easiest way of going about it. I would definitely be glad to have just stripped-down cranelift-codegen working in no_std upstream for now.

One example here is that the unwind feature is only used in Wasmtime for Config::native_unwind_info which would be easy to gate behind a Cargo feature. In that sense there's no need to unconditionally enable unwind in Cranelift given possible future changes.

If your target is wasmtime-cranelift the bigger problem is probably going to be object::write as that's pretty intrinsically depended on. It's probably best to cross that bridge when we get there though and it's the "only" remaining issue. If your target is just cranelift-codegen on no_std then you can disregard this last part from me :)

I was not aware that the dependency on unwind was so light, glad to hear! Yes, I am most interested in using the entirety of Wasmtime JIT functionality itself in a no_std environment. Once I can get the lower-level features working though I'll see what I can get done with the object crate and I'll come back here to make the rest of the push. :)

bytecodealliance / wasmtime

cranelift-codegen no longer builds with no_std #1158