w3f / schnorrkel

Schnorr VRFs and signatures on the Ristretto group
BSD 3-Clause "New" or "Revised" License
310 stars 93 forks source link

no_std is broken, doesn't play with sgx #31

Open brenzi opened 5 years ago

brenzi commented 5 years ago

I'm trying to use this lib within sgx using

https://github.com/baidu/rust-sgx-sdk

It seems that #[no_std] is broken as std gets pulled in in places (like context.rs). Also alloc is not imported

@elichai commented the following: https://github.com/baidu/rust-sgx-sdk/issues/93#issuecomment-487576736

I'll need schnorrkel to support sgx for substraTEE

brenzi commented 5 years ago

When I build this crate with

cargo build --release --no-default-features --features "u64_backend alloc"

errors explode.

the low hanging fruit:

diff --git a/src/context.rs b/src/context.rs
index 44d9ab3..ec56bc1 100644
--- a/src/context.rs
+++ b/src/context.rs
@@ -9,7 +9,7 @@

 //! ### Schnorr signature contexts and configuration, adaptable to most Schnorr signature schemes.

-use std::cell::RefCell;
+use core::cell::RefCell;

 use rand::prelude::*;  // {RngCore,thread_rng};

But that doesn't do the trick by far

burdges commented 5 years ago

I've pushed a fix that passes your build command. It suffices? Thanks!

I'll publish a new version soon, but we've several compatibility breaks already in master and..

I'd love to resolve https://github.com/w3f/schnorrkel/issues/9 along with the other compatibility breaks, but so far nothing conclusive arose. I'll make a decision there asap. :)

burdges commented 5 years ago

Appears ed25519-dalek lacks this extern crate alloc; that I needed, so not sure if I fixed those errors correctly or not.

brenzi commented 5 years ago

Cheers for your attempt to fix this issue.

I now get

error[E0658]: use of unstable library feature 'alloc': this library is unlikely to be stabilized in its current form or name (see issue #27783)
   --> /home/abrenzikofer/.cargo/git/checkouts/schnorrkel-0d699dd19a5d19bc/bc1cf8e/src/lib.rs:262:1
    |
262 | extern crate alloc;
    | ^^^^^^^^^^^^^^^^^^^
    |
    = help: add #![feature(alloc)] to the crate attributes to enable

error: aborting due to previous error

So I must guess this issue isn't fixed just yet - not only because of upstream crate ed25519-dalek

brenzi commented 5 years ago

edit: sorry, this issue should be fixed by updating rust nightly

There are further issues when building without alloc feature:

> cargo build --release --no-default-features --features "u64_backend"
error[E0432]: unresolved import `crate::sign::verify_batch`
   --> src/lib.rs:308:50
    |
308 | pub use crate::sign::{Signature,SIGNATURE_LENGTH,verify_batch};
    |                                                  ^^^^^^^^^^^^ no `verify_batch` in `sign`

error[E0433]: failed to resolve: use of undeclared type or module `BTreeMap`
   --> src/musig.rs:373:22
    |
373 |         let mut Rs = BTreeMap::new();
    |                      ^^^^^^^^ use of undeclared type or module `BTreeMap`

...
brenzi commented 5 years ago

Doesn't build with rustsgx-sdk. Compiler still complains:

error: duplicate lang item in crate `std`: `f32_runtime`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `f64_runtime`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `panic_impl`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `oom`.
  |
  = note: first defined in crate `sgx_trts`.

It looks like one or more of your dependencies don't support no_std: merlin ?

burdges commented 5 years ago

If I understand, the #![feature(alloc)] error comes form your rust being outdated, yes?

I've fixed the issues I understood and pushed the changes, in part by omitting musig without alloc or std. I even made cargo test .. work with various combinations, which eve dalek does not do.

I still do not understand why ed25519-dalek does not require extern crate alloc; like schnorrkel seemingly does. An edition setting in Cargo.toml maybe?

I also do not understand the no_std: merlin issue. I know merlin's tests use strobe-rs, which requires Vec, but outside tests I'm unsure about the problem. Can you pin down the actual problem?

burdges commented 5 years ago

There is a serious dependency flaw caused by the rand crate being chopped up into too many poinless subcrates: https://github.com/rust-random/rand/issues/738 Is that the problem? I'm afraid merlin does create an Rng, but some workaround exist I think.

burdges commented 5 years ago

Or maybe my Cargo.toml does not pass through some feature flags correctly?

brenzi commented 5 years ago

right now we're trying to use ed25519 signatures with our node as a workaround. I'll revert when I find time for further debugging

burdges commented 5 years ago

I've found the problem maybe.. merlin lacks any support for no_std, like you said.

I believe https://github.com/dalek-cryptography/merlin/blob/master/src/strobe.rs#L4 might be the only problematic line in merlin. Can you tell if using this branch of merlin works? It merely uses use core::ops::{Deref, DerefMut}; and passes through the std feature.

burdges commented 5 years ago

It appears my second commit there to Cargo.toml breaks their CI, but the first seemingly passed fine. It's maybe not required to pass through the std dependency unless you really use functionality only present with std I guess?

brenzi commented 5 years ago

I just tried to use your merlin branch within sgx. no success. Same duplicate lang item error as above

toml:

merlin = { git ="https://github.com/burdges/merlin.git", branch="no_std", default-features = false}

code:

extern crate merlin;
use merlin::{Transcript};
burdges commented 5 years ago

It's likely the clear_on_drop crate https://github.com/dalek-cryptography/curve25519-dalek/pull/236#issuecomment-475907224 https://github.com/cesarb/clear_on_drop/issues/20 but that'll likely impact ed25519-dalek too.

We should switch from clear_on_drop to zeroize whenever curve25519-dalek merges https://github.com/dalek-cryptography/curve25519-dalek/pull/236

We probably could not publish a schnorrkel version in which no std required a curve25519-dalek branch, but your build can work fine from branches I suppose? It's also possible to temporarily disable zeroing when building for WASM by defining dummy ClearOnDrop trait when required.

We could investigate the rand micro-crate bug further I mentioned above https://github.com/w3f/schnorrkel/issues/31#issuecomment-488137199 this does not fix it.

burdges commented 5 years ago

I pushed a temporary fix in d5a48350f6916d4fc766b7190b3e616ec84f50c2 but not sure if it helps you really. We'll switch to the Zeroize crate whenever it works, so I'll leave this open.

brenzi commented 5 years ago

Doesn't work for me: error: duplicate lang item in crate std We're working with an ed25519 chain now, so no hurry. We're fine for now.

burdges commented 5 years ago

I cannot reproduce that error myself, but it sounds like something is importing std and core separately or something.

isislovecruft commented 5 years ago

If you're okay with using nightly compilers only, I have a merlin branch that might work for you: https://github.com/dalek-cryptography/merlin/pull/42

burdges commented 5 years ago

Appears we lacked many default_features = false lines here too. Any idea if d5a4835 helps @brenzi ? I want to revert it if possible.

brenzi commented 5 years ago

@burdges thanks for the update. I won't be able to test it until next week.

burdges commented 5 years ago

We moved to the zeroize crate's trait with a manual impl, due to proc macros not working on both nightly and stable. I'm afraid clear_on_drop remains a curve25519-dalek dependency, but maybe this reduces conflicts if another crate imported clear_on_drop differently.

burdges commented 5 years ago

Right now cargo build --release --no-default-features --features "u64_backend alloc" works, but..

Afaik, there is no way ::rand::thread_rng() should compile without std due to https://docs.rs/rand/0.6.5/src/rand/rngs/mod.rs.html#156

Any ideas why this works for me @dhardy ? I should actually supply some build environment without an OsRng perhaps?

burdges commented 5 years ago

I reduced the dependence on rand to just rand_core in https://github.com/w3f/schnorrkel/commit/21a039e38d91744ae67845e4fe0e8f91a07c4df3 but with a rand feature, which seemingly reduces dependence on std dramatically. It'll use rand_os without the rand feature, which gets slower, but works.

I still do not understood how thread_rng exists without std, but currently the default randomness source without std is PanicRng which simply panics: https://github.com/w3f/schnorrkel/commit/21a039e38d91744ae67845e4fe0e8f91a07c4df3#diff-b4aea3e418ccdb71239b96952d9cddb6R244

It's possible the rand_os dependency should be optional too, but we should understand how rand really works in no_std first.

chesterliliang commented 5 years ago

Hi Burdges.

When we try to compile schnorrkel to target thumbv7m-none-eabi which is for ARM cortex M3, we found there are some issues generated by 'rand'. Most of them are caused by usage of thread_rng(), but some of them are hard to tell because errors changing on different version of rand.

We have a suggestion that considering schnorrkel may porting to different platform including MCU for IOT devices or hardware wallet, a branch which let users choose the random source by themselves and transfer it into our code. Here are benefits if we make this branch:

  1. Make the master branch need not to take care this kind of issues. The difference could be separated by features if branch merged like
  #[cfg(feature = "ext_rng")]
    pub fn sign_trng<T: SigningTranscript>(&self, mut t: T, trng: &[u8], public_key: &PublicKey) -> Signature 
    {
        t.proto_name(b"Schnorr-sig");
        t.commit_point(b"pk\x00",public_key.as_compressed());

        let r = t.witness_scalar_trng(b"signing\x00",&[&self.nonce],trng);  // context, message, A/public_key
        let R = (&r * &constants::RISTRETTO_BASEPOINT_TABLE).compress();

         t.commit_point(b"no\x00",&R);
        let k: Scalar = t.challenge_scalar(b"sign\x00");  // context, message, A/public_key, R=rG
        let s: Scalar = &(&k * &self.key) + &r;

        Signature{ R, s }
    }
  1. Generate the pseudo-random number will cost much for little MCU system, not only the time, but also need to consider if the entropy source is good enough or even exsit. Since many chips don't run any OS at all in IOT world and they don't have good RTC schematic or even dot not have anything about time which is main entropy source of pseudo-random number. So it is not just the no -std problems in RUST. However, the hardware system may have other good entropy source such as noise. And they could use it by accessing registers in serval lines of code:
void GetTRNG(uint16_t len, uint8_t * dataOut)
{
...

 RCC_SEC1Periph_ClockCmd(RCC_SEC1Periph_TRNG,ENABLE);

 while(wordLen--)
 {
  RNGCLR = 0x01;

  EVINT_MCUEV_Cmd(TRNG_IRQn,ENABLE);
  RNGCON = 0x01;  //start TRNG
  while(!(RNGSTS & 0x01))
  {     
   __WFE();
  }
  EVINT_MCUEV_Cmd(TRNG_IRQn,DISABLE);

  *(uint32_t *)(dataOut) = RNGDAT;
  dataOut += 4;
 }

 ...
 RNGSTOP = 0x01; //stop RNG

 RCC_SEC1Periph_ClockCmd(RCC_SEC1Periph_TRNG,DISABLE);
}
  1. So it is convenient for the embedded developer to choose their way to generate RNG if we have this branch. And moreover, if we fix the RNG generate methond in schnorrkel but our depends works not well when porting to other platform, it make rise the security risks for the users which may beyond our contrl.

We did some test in our fork wookong-dot-schnorrkel. An embedded folder added with cortex-rust project which could run QEMU, ARM simulator. You could try it with the steps in README.md

burdges commented 5 years ago

I fixed 1 in https://github.com/w3f/schnorrkel/commit/21a039e38d91744ae67845e4fe0e8f91a07c4df3 I think. Infact, yYou could always replace thread_rng by calling attach_rng, assuming your system has good randomness, but I removed the dependency in that commit.

In principle, you could create a wrapper type pub struct TranscriptWithMyRng(::merlin::Transcript);, delegate required methods of SigningTranscript, and make witness_bytes call your good system CSPRNG. attach_rng is easier if you only do this occasionally though.

I think 2 sounds out of scope, not sure I understand 3 yet. There is considerable work on tools like jitter_rng, etc. from the rand crate developers, but mostly they gave up on creating their own randomness. I think the getrandom crate provide an OS interface used by rand_os.

chesterliliang commented 5 years ago

21a039e works with cargo build --release --no-default-features --features "u64_backend alloc" but fails with cargo build --release --target thumbv7m-none-eabi --no-default-features
or cargo build --release --target thumbv7m-none-eabi --no-default-feature --features "u64_backend alloc"

ChestersdeMacBook-Pro:schnorrkel chester$ cargo build --release --target thumbv7m-none-eabi --no-default-features
   Compiling typenum v1.10.0
   Compiling cc v1.0.38
   Compiling byteorder v1.3.2
   Compiling rand_core v0.4.0
   Compiling byte-tools v0.3.1
   Compiling subtle v2.1.1
   Compiling fake-simd v0.1.2
   Compiling opaque-debug v0.2.3
error[E0463]: can't find crate for `std`
  |
  = note: the `thumbv7m-none-eabi` target may not be installed

error: aborting due to previous error

For more information about this error, try `rustc --explain E0463`.
error: Could not compile `rand_core`.
warning: build failed, waiting for other jobs to finish...
error: build failed
dhardy commented 5 years ago

Sorry @burdges I don't know the answer to that question.

burdges commented 5 years ago

Thanks anyways @dhardy :)

It's opaque-debug that fails due to some no_std between it and that platform? It's seemingly used by sha2 and sha3. Can you build sha2 all by itself? If not ask @newpavlov

I sadly cannot purge sha2 from this crate, due to kusama using the ExpansionMode::Ed25519 option. I could however feature gate sha2 so that you can avoid using it if you never transfer secrets with kusama's subkey, etc. And we'll might dump ExpansionMode::Ed25519 in polkadot. Also, you could fork schnorrkel using some sha2 crate without the complex dependencies.

Just fyi, we cannot purge the digest crate because curve25519-dalek depends upon it, but we do not depend upon digest for anything important, so again digest could be avoided in some fork of both curve25519-dalek and schnorrkel. We'd then have dependencies down to exclusive dalek ecosystem and lower level rand crates.

newpavlov commented 5 years ago

Attempting to remove sha2 just because you think it breaks no_std build looks like an overkill to me. If problem indeed originates in RustCrypto crates, please, report it and I will try to fix it ASAP.

Anyway, I've just re-run no_std CI test for RustCrypto/hashes and it has passed without any issues. I highly doubt problem is in opaque_debug, as you can see here, code is dead simple and does not use std anywhere.

Also, please read the compilation error message more carefully:

error: Could not compile rand_core.

I guess, std feature gets enabled for rand_core, so probably one of the dependencies has forgot to use default-features = false and correctly cascade std feature.

burdges commented 5 years ago

Right, thank you @newpavlov ! :) I jumped on sha2, etc. because they should serve no purpose here, but I made some stupid mistakes early that resulted in their inclusion.

I tweaked the rand handling in https://github.com/w3f/schnorrkel/commit/2d4c2ef4a1ab2516ed392dc41bcbb51dfcaae86e @chesterliliang by copying the feature propagation from https://docs.rs/crate/rand/0.6.5/source/Cargo.toml but..

I do not understand the relationship between rand_os/getrandom and std, and my current scheme looks incorrect, so no promises.

Can you take a look at that commit next week? @jacogr Are the wasm, etc. features doing the right thing?

It appears merlin and curve25519-dalek require different rand_core versions, likely stubbed together, but even the stubs need their default-features = false lines.

jacogr commented 5 years ago

@burdges The WASM compilation (via the Rust wasmpack), creates externs for the random functions and then you need to provide these via the Node.js crypto module, which is effectively only getRandomBytes and randomFillsync functions.

(So it has not been an issue at all in the past with the 0.1.x up to 0.8.4 - well, different environments such as React Native do have some issues, but that is solved via polyfills for the extern crypto module, doesn't work out of the box, but with some project tweaks and filling in the gaps, it does)

burdges commented 5 years ago

Any idea how rand::thread_rng exists when substrate compiles schnorrkel no_std? Or maybe we're not no_std? If not, then I'm not worried here.

chesterliliang commented 5 years ago

Rand use prelude to re-export and I see some discussion about no_std and crate:prelude in std-using paths work just fine in 2018 edition #![no_std] and Add extern crate items to extern prelude. Not sure if it is related to our problem.

burdges commented 5 years ago

We no longer depend upon rand, only rand_core, unless you pass the rand feature.

burdges commented 5 years ago

I read https://github.com/rust-random/rand/blob/master/rand_os/Cargo.toml#L25 as activating std for rand_core whenever rand_os gets used, likely causing https://github.com/w3f/schnorrkel/issues/31#issuecomment-518519897

burdges commented 5 years ago

I think rand_os now avoids activating std in rand_core as of https://github.com/rust-random/rand/pull/859 so that's fixed.

We cannot expect merlin to update to rand_core 0.6 anytime soon because adding RngCore::from_entropy proved to be the breaking change that broke the camels back, ala https://github.com/dalek-cryptography/curve25519-dalek/pull/262 https://github.com/dalek-cryptography/curve25519-dalek/pull/263 etc.

As I said in https://github.com/dalek-cryptography/merlin/pull/47#issuecomment-520211698 we can either

Any thoughts on substrate's upgrade path for rand 0.7 @jacogr ? No rush right?

burdges commented 5 years ago

I stopped the *_backend features from activating ed25519 in 33083a9debd2a3474f4ffda430091405bc7e114e which likely contributed here, including to both @brenzi original problem and @chesterliliang more recent issue.

I explained how a wrapper shim could support merlin in https://github.com/rust-random/rand/issues/865#issuecomment-520218014 but it seemingly either needs an extra crate or else nightly for https://github.com/rust-lang/cargo/pull/4953

burdges commented 5 years ago

I've a "schizomerlin" branch https://github.com/w3f/schnorrkel/tree/schizomerlin that avoids invoking the std feature of rand_core, so test that one too.

I doubt cargo feature work on stable, so if we need schizomerlin then we might factor RngCore5As4 out into a separate crate: https://github.com/w3f/schnorrkel/commit/8150ef6333df6238d8d861aec74bf7ccd87119aa

brenzi commented 4 years ago

Re-visiting this after a while of absence.

Neither schizomerlin branch nor master work for me.

brenzi commented 4 years ago

as schnorrkel depends on ed25519-dalek, the following might be an issue: https://github.com/dalek-cryptography/ed25519-dalek/issues/95

but strange enough, my minimal example actually works fine with no_std: https://github.com/scs/test-no-std/tree/schnorrkel If you build this with

cargo build --no-default-features

it actually works. But if I use schnorrkel within my real crate, it doesn't:

Reproduce:

  1. get https://github.com/scs/substraTEE-worker/tree/schnorrkel-no-std-debug
  2. cd enclave
  3. cargo build --no-default-features should fail.

but if you remove this use statement: https://github.com/scs/substraTEE-worker/blob/57b34bfda28dcac07749e86944d1ee17b639fbc8/primitives/src/sr25519.rs#L26

it will build

burdges commented 4 years ago

I think master should avoid ed25519-dalek thanks to https://github.com/w3f/schnorrkel/commit/33083a9debd2a3474f4ffda430091405bc7e114e although I never published that version. We should try both that and the schizomerlin commit https://github.com/w3f/schnorrkel/commit/8150ef6333df6238d8d861aec74bf7ccd87119aa together.

I've now forgotten if the schizomerlin commit should suffice. If rand_os activates std then we must avoid that too. You might avoid rand_os when you invoke without default features, but another options is upgrading merlin's rand_core version.

brenzi commented 4 years ago

Schnorrkel actually works if patching ed25519-dalek to my fork:

[patch.crates-io]
ed25519-dalek = { git = "https://github.com/scs/ed25519-dalek.git", branch = "no_std_sgx"}
burdges commented 4 years ago

I'd expect merlin might break rand_core without rand_core being upgrade, or at least using the schizomerlin commit 8150ef6, but maybe not.

burdges commented 4 years ago

Is there no trouble from the failure crate here? We could drop support for it easily I think.

burdges commented 4 years ago

Is this fixed? I'll publish some new version over Xmas hopefully, well more like January, so if not then maybe then.

brenzi commented 4 years ago

We haven't tried with the unpatched ed25519-dalek. If it's fixed there, its fixed for us