project-oak / oak

Meaningful control of data in distributed systems.
Apache License 2.0
1.32k stars 113 forks source link

Add `tonic` as a crate in `cargo-raze` #890

Closed ipetr0v closed 4 years ago

ipetr0v commented 4 years ago

In order to finish https://github.com/project-oak/oak/issues/806 we need to add tonic as a dependency using cargo-raze.

cc @tiziano88 @daviddrysdale

ipetr0v commented 4 years ago

First, there were the following errors:

error: couldn't read external/raze__ring__0_16_12/src/ec/curve25519/ed25519/ed25519_pkcs8_v2_template.der: No such file or directory (os error 2)
   --> external/raze__ring__0_16_12/src/ec/curve25519/ed25519/signing.rs:266:12
    |
266 |     bytes: include_bytes!("ed25519_pkcs8_v2_template.der"),
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: couldn't read external/raze__webpki__0_21_2/src/data/alg-ecdsa-p256.der: No such file or directory (os error 2)
   --> external/raze__webpki__0_21_2/src/signed_data.rs:292:43
    |
292 |     asn1_id_value: untrusted::Input::from(include_bytes!("data/alg-ecdsa-p256.der")),
    |                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The reason was that cargo-raze does not automatically add non-*.rs files to the Bazel workspace.

The problem was solved by adding the following lines to cargo/Cargo.toml:

[raze.crates.ring.'0.16.12']
data_attr = "glob([\"**/*.der\"])"

[raze.crates.webpki.'0.21.2']
data_attr = "glob([\"**/*.der\"])"

They add data attributes to rust_library rules in the generated BUILD files.

ipetr0v commented 4 years ago

Currently there are 2 types of failures:

Use --sandbox_debug to see verbose messages from the sandbox
error[E0432]: unresolved import `prost1`
 --> external/raze__tonic__0_2_0/src/codec/prost.rs:4:5
  |
4 | use prost1::Message;
  |     ^^^^^^ use of undeclared type or module `prost1`
error[E0599]: no method named `next` found for struct `std::pin::Pin<&mut _>` in the current scope
  --> external/raze__tonic__0_2_0/src/codec/encode.rs:47:5
   |
47 | /     async_stream::stream! {
48 | |         let mut buf = BytesMut::with_capacity(BUFFER_SIZE);
49 | |         futures_util::pin_mut!(source);
50 | |
...  |
74 | |         }
75 | |     }
   | |_____^ method not found in `std::pin::Pin<&mut _>`
   |
   = note: `$ crate :: AsyncStreamHack` is a function, perhaps you wish to call it
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
ipetr0v commented 4 years ago

The first error is caused by an alias for prost library, defined in tonic:

[features]
prost = ["prost1", "prost-derive"]

[dependencies]
prost1 = { package = "prost", version = "0.6", optional = true }
prost-derive = { version = "0.6", optional = true }

https://github.com/hyperium/tonic/blob/6f378e2bd0cdf3a1a3df87e1feff842a8a599142/tonic/Cargo.toml#L38 https://github.com/hyperium/tonic/blob/6f378e2bd0cdf3a1a3df87e1feff842a8a599142/tonic/Cargo.toml#L60

I have updated Bazel rules_rust library to the latest version (that should support crate aliases: https://github.com/bazelbuild/rules_rust/pull/285):

http_archive(
    name = "io_bazel_rules_rust",
    sha256 = "275f0166e61e6cad3e29b0e37c21ecbb66880c049dbeea6e574d74a8ec4775c5",
    strip_prefix = "rules_rust-e285f2bd8be77712e6b80ccb52918b727d10d70e",
    urls = [
        # Master branch as of 2020-04-21.
        "https://github.com/bazelbuild/rules_rust/archive/e285f2bd8be77712e6b80ccb52918b727d10d70e.tar.gz",
    ],
)

But there is no way to automatically generate an alias attribute to rust_library, because the current version of cargo-raze does not support crate aliases yet: https://github.com/google/cargo-raze/pull/123.

Even if I manually add an alias to the generated BUILD file:

aliases = {
    ":prost1": "prost",
}

But it leads to the following error:

(10:55:31) ERROR: /opt/my-project/bazel-cache/clang/external/raze__tonic__0_2_0/BUILD.bazel:28:1: in aliases attribute of rust_library rule @raze__tonic__0_2_0//:tonic: rule '@raze__tonic__0_2_0//:prost1' does not exist
ipetr0v commented 4 years ago

Looks like we only can use an old version of tonic:

tonic = { version = "0.1.1", features = ["tls"] }

List of supported versions is here: https://github.com/google/cargo-raze/issues/41#issuecomment-592274128

tiziano88 commented 4 years ago

That's annoying but I guess we can live with it. @ipetr0v does it require any changes to our own code, to use that old version? Also, is there a path forward at some point, or are we stuck with it until someone else figures out what the problem is and how to solve it?

ipetr0v commented 4 years ago

The old version of tonic compiled with the new version of gRPC pseudo-Node (without cargo-raze), so I think we can use it until we will completely move to Rust and will stop using cargo-raze.

ipetr0v commented 4 years ago

Currently there are different errors generated by cargo-raze:

error[E0425]: cannot find value `server` in this scope
  --> external/raze__tonic__0_1_1/src/transport/server/incoming.rs:28:5
   |
28 | /     async_stream::try_stream! {
29 | |         futures_util::pin_mut!(incoming);
30 | |
31 | |         while let Some(stream) = incoming.try_next().await? {
...  |
48 | |         }
49 | |     }
   | |_____^ not found in this scope
   |
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)

error[E0599]: no method named `next` found for struct `std::pin::Pin<&mut _>` in the current scope
  --> external/raze__tonic__0_1_1/src/codec/encode.rs:47:5
   |
47 | /     async_stream::stream! {
48 | |         let mut buf = BytesMut::with_capacity(BUFFER_SIZE);
49 | |         futures_util::pin_mut!(source);
50 | |
...  |
74 | |         }
75 | |     }
   | |_____^ method not found in `std::pin::Pin<&mut _>`
   |
   = note: `$ crate :: AsyncStreamHack` is a function, perhaps you wish to call it
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
tiziano88 commented 4 years ago

I'm not sure if / when we will be able to stop using cargo-raze entirely, given we have some pseudo-nodes implemented in C++ (e.g. roughtime) and we may have more in the future.

Currently there are different errors generated by cargo-raze:

Do you mean there are still errors even when using the old version?

ipetr0v commented 4 years ago

Do you mean there are still errors even when using the old version?

Yes, gRPC pseudo-node compiles successfully with an old tonic version using cargo build and without Bazel, but (cd cargo; cargo raze); ./scripts/build_server -s base gives these errors.

Looks like we need to downgrade other packages too, but it's not obvious which ones.

ipetr0v commented 4 years ago

So after adding the following flag to the generated tonic-0.1.1.BUILD file:

rustc_flags = [
    "-Zexternal-macro-backtrace",
],

It showed the following error:

1  |     / ($ ($ body : tt) *) =>
2  |     | {
3  |     |     {
4  |     |         let (mut __yield_tx, __yield_rx) = $ crate :: yielder :: pair () ; $
...      |
8  |     |              # [derive ($ crate :: AsyncStreamHack)] enum Dummy
   |     |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
   |     |                         |
   |     |                         not found in this scope
   |     |                         in this expansion of `stream_2!` (#59)
9  |     |              { Value = $ crate :: scrub ! { $ ($ body) * } } $ crate ::
   |  ___|______________________________________________________________-
10 | |   |              dispatch ! (($ ($ body) *))
   | |___|________________________________________- in this macro invocation (#2)
11 |     |          })
12 |     |     }
13 |     | }
   |     |_- in this expansion of `async_stream::stream!` (#1)
   | 
  ::: <::async_stream::dispatch macros>:1:2

And this error is originated from the following macro: https://github.com/tokio-rs/async-stream/blob/91f6a380cbc1181f2de5bcd1e4a369c03f1c8277/async-stream/src/lib.rs#L266

Plus it also showed a couple of very strange errors:

1  |  /  () => { stream_0 ! () } ; (!) => { stream_1 ! () } ; (! !) =>
2  |  |  { stream_2 ! () } ; (! ! !) => { stream_3 ! () } ; (! ! ! !) =>
   |  |    ------------- in this macro invocation (#59)
3  |  |  { stream_4 ! () } ; (! ! ! ! !) => { stream_5 ! () } ; (! ! ! ! ! !) =>
4  |  |  { stream_6 ! () } ; (! ! ! ! ! ! !) => { stream_7 ! () } ; (! ! ! ! ! ! ! !)
...   |
95 |  |  (! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
96 |  |   ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !) => { stream_64 ! () } ;
   |  |____________________________________________________________________________________________________________________________________- in this expansion of `$crate::count!` (#58)
   | 
  ::: external/raze__tonic__0_1_1/src/codec/encode.rs:47:

The problem was solved by adding "--cfg=wrap_proc_macro" to the cargo/Cargo.toml file:

[raze.crates.proc-macro2.'1.0.10']
additional_flags = [
    "--cfg=use_proc_macro",
    "--cfg=wrap_proc_macro",
]
ipetr0v commented 4 years ago

A new problem appears during the link stage (after compiling ring library, that is a dependency for rustls):

ERROR: /opt/my-project/oak/server/loader/BUILD:46:1: Linking of rule '//oak/server/loader:oak_runner' failed (Exit 1) clang.sh failed: error executing command toolchain/clang.sh -o bazel-out/k8-fastbuild/bin/oak/server/loader/oak_runner bazel-out/k8-fastbuild/bin/oak/server/loader/_objs/oak_runner/oak_runner_main.o ... (remaining 209 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
ld.lld: error: undefined symbol: GFp_cpuid_setup
>>> referenced by ring.4gykf9i3-cgu.1
>>>               ring-964434662.ring.4gykf9i3-cgu.1.rcgu.o:(ring::cpu::features::_$u7b$$u7b$closure$u7d$$u7d$::hd913c5363db0de84) in archive bazel-out/k8-fastbuild-ST-5e74b77704d3a70b08875590eb0f067cbb9a6e09f41f090f307cf0d79d4b2461/bin/oak/server/rust/oak_glue/liboak_glue-870793972.a

ld.lld: error: undefined symbol: GFp_bn_neg_inv_mod_r_u64
>>> referenced by ring.4gykf9i3-cgu.0
>>>               ring-964434662.ring.4gykf9i3-cgu.0.rcgu.o:(ring::arithmetic::bigint::Modulus$LT$M$GT$::from_boxed_limbs::h663043e078afe730) in archive bazel-out/k8-fastbuild-ST-5e74b77704d3a70b08875590eb0f067cbb9a6e09f41f090f307cf0d79d4b2461/bin/oak/server/rust/oak_glue/liboak_glue-870793972.a

Looks like cargo raze doesn't link C++ code with Rust that runs it via FFI: https://github.com/briansmith/ring/blob/4c392ad338f61ea166a29c83d4208e8edfecc6ca/src/cpu.rs#L43-L48

ipetr0v commented 4 years ago

The C code is being compiled by build.rs. cargo raze can generate a genrule for running build.rs files via gen_buildrs = trueflag:

[raze.crates.ring.'0.16.12']
gen_buildrs = true
data_attr = "glob([\"**/*.der\"])"
ipetr0v commented 4 years ago

After adding this flag a new error appeared:

ERROR: /opt/my-project/bazel-cache/clang/external/raze__ring__0_16_12/BUILD.bazel:47:1: Executing genrule @raze__ring__0_16_12//:ring_build_script_executor failed (Exit 101)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 17, kind: AlreadyExists, message: "File exists" }', external/raze__ring__0_16_12/build.rs:301:5
stack backtrace:

This problem is caused by the fact, that ring creates a directory for generated .asm files and fails, if it already exists (which is true in case of Bazel since everything is saves in bazel-cache): https://github.com/briansmith/ring/blob/92f936bc3b76163ef49baa6b9593811e7ddfc4c0/build.rs#L306

ipetr0v commented 4 years ago

After adding a manual removal of the pregenerated directory to ring-0.16.12.BUILD:

genrule(
    name = "ring_build_script_executor",
    ...
    cmd =
        ...
        + " cd $$(dirname $(location :Cargo.toml)) && rm -r pregenerated && $$BINARY_PATH && tar -czf $$OUT_TAR -C $$OUT_DIR .)"
)

The following problem showed up:

ERROR: /opt/my-project/bazel-cache/clang/external/raze__ring__0_16_12/BUILD.bazel:47:1: Executing genrule @raze__ring__0_16_12//:ring_build_script_executor failed (Exit 101)
thread 'main' panicked at 'failed to execute ["yasm.exe" "-X" "vc" "--dformat=cv8" "--oformat=win64" "--machine=amd64" "-o" "pregenerated/aes-x86_64-nasm.obj" "pregenerated/tmp/aes-x86_64-nasm.asm"]: No such file or directory (os error 2)', external/raze__ring__0_16_12/build.rs:638:9
...
running "yasm.exe" "-X" "vc" "--dformat=cv8" "--oformat=win64" "--machine=amd64" "-o" "pregenerated/aes-x86_64-nasm.obj" "pregenerated/tmp/aes-x86_64-nasm.asm"

The problem is that build.rs is trying to generate .asm files for all possible architectures: https://github.com/briansmith/ring/blob/92f936bc3b76163ef49baa6b9593811e7ddfc4c0/build.rs#L310 And ends up running yasm.exe on linux.

ipetr0v commented 4 years ago

So it looks like .asm files are not supposed to be generated via cargo build, since pregenerate_asm_main doesn't run if CARGO_PKG_NAME is set to ring: https://github.com/briansmith/ring/blob/92f936bc3b76163ef49baa6b9593811e7ddfc4c0/build.rs#L256 But cargo-raze doesn't initialize this environment variable (and a lot of other variables too). So this problem was solved by adding the following list of environment variables to the genrule:

genrule(
    name = "ring_build_script_executor",
    ...
    cmd =
        ...
        + " export CARGO_PKG_NAME=ring;"
        + " export CARGO_CFG_TARGET_ARCH=x86_64;"
        + " export CARGO_CFG_TARGET_OS=linux;"
        + " export CARGO_CFG_TARGET_ENV=musl;"
        + " export OPT_LEVEL=3;"
        + " export PROFILE=release;"
        + " export DEBUG=false;"
        + " export HOST=host;"
]
ipetr0v commented 4 years ago

After adding these environment variables, build.rs was successfully run, but didn't solve the initial link problem:

ld.lld: error: undefined symbol: GFp_cpuid_setup
>>> referenced by ring.4gykf9i3-cgu.1
>>>               ring-964434662.ring.4gykf9i3-cgu.1.rcgu.o:(ring::cpu::features::_$u7b$$u7b$closure$u7d$$u7d$::hd913c5363db0de84) in archive bazel-out/k8-fastbuild-ST-5e74b77704d3a70b08875590eb0f067cbb9a6e09f41f090f307cf0d79d4b2461/bin/oak/server/rust/oak_glue/liboak_glue-870793972.a
ipetr0v commented 4 years ago

Looks like ring requires a static ring-core library that was compiles from C sources. But cargo raze doesn't provide it with a directory containing an .a library file.

_: /opt/my-project/bazel-cache/clang/execroot/oak/bazel-out/host/bin/external/raze__ring__0_16_12/ring_build_script
cargo:rustc-link-lib=static=ring-core
cargo:rustc-link-lib=static=ring-test
cargo:rustc-link-search=native=/opt/my-project/bazel-cache/clang/execroot/oak/bazel-out/k8-fastbuild-ST-5e74b77704d3a70b08875590eb0f067cbb9a6e09f41f090f307cf0d79d4b2461/bin/external/raze__ring__0_16_12/ring_out_dir_outputs
ipetr0v commented 4 years ago

In order to fix this we need to add two additional flags ("-lstatic=ring-core" and "-Lnative=STATIC_LIB_DIR") to rustc_flags (https://doc.rust-lang.org/rustc/command-line-arguments.html#-l-add-a-directory-to-the-library-search-path).

The problem is that by default .a file is generated in the ./bazel-cache/clang/execroot/oak/bazel-out/k8-fastbuild-ST-5e74b77704d3a70b08875590eb0f067cbb9a6e09f41f090f307cf0d79d4b2461/bin/external/raze__ring__0_16_12/ring_out_dir_outputs, and this directory may change in different environments. So we need to make it use an execroot/oak directory, and thus change the following lines from this:

genrule(
    name = "ring_build_script_executor",
    cmd = "mkdir -p $$(dirname $@)/ring_out_dir_outputs/;"
        ...
        # + " export OUT_DIR=$$PWD/$$(dirname $@)/ring_out_dir_outputs;"

to this:

genrule(
    name = "ring_build_script_executor",
    cmd = "mkdir -p ring_out_dir_outputs/;"
        ...
        + " export OUT_DIR=$$PWD/ring_out_dir_outputs;"

In order to refer to this directory from a sandbox we need to add the following relative path as a native library path:

rust_library(
    name = "ring",
    ...
    rustc_flags = [
        ...
        "-lstatic=ring-core",
        "-Lnative=../../../../../execroot/oak/ring_out_dir_outputs/",
    ],
)
ppodolsky commented 4 years ago

Folks, I didn't even know what is project oak before. Just was googlin around how to build tonic with Bazel. And you know, I would like to give thanks @ipetr0v. These detailed explanations are awesome, it is a rare example of wellcrafted GitHub issue giving its fruits even after close.