bazelruby / rules_ruby

Formerly canonical rules for Ruby that are about 2-3 years behind current Bazel. If they work for you, great; if not, please try the new rules_ruby by Alex Radionov: https://github.com/bazel-contrib/rules_ruby
Apache License 2.0

Documentation question regarding native extensions #78

Open sayrer opened 4 years ago

sayrer commented 4 years ago

I was initially misled by README content that says: "Building native extensions in gems with Bazel" is not yet supported.

Is non-gem support for native extensions documented somewhere? I eventually found

https://github.com/bazelruby/rules_ruby/blob/master/ruby/tests/BUILD.bazel#L186

so it seems like it is possible to use native extensions in ruby_binary and ruby_library, but I only figured this out because the @org_ruby_lang_ruby_toolchain//:headers rule seems designed to support this use case.

kigster commented 4 years ago

This is a very good question that requires going deep into Bazel to fully answer.

Native extensions are supported in gem dependencies expressed via Gemfile and ruby_bundle; however, they will likely be compiled using the host compiler, not the toolchain.

AFAIK, we don't currently support building native extensions cross-platform.

So you could say that the current support within ruby_bundle is incidental.

As far as using it with ruby_library and ruby_binary goes, if you can put together a proper example, it would be great to add something under the examples folder.

sayrer commented 3 years ago

I got this to work here:

https://github.com/sayrer/twitter-text/blob/b018b2b935d1b9b21a1ae46af7d774e4ad0a6d0a/rust_bindings/ruby/BUILD

It's pretty much the same as the linked test file, though. Happy to make an example if you want.

kigster commented 3 years ago

Yes, please put together an example with a decent README.

I'm not entirely clear what your build file generates.

I think the comment about not fully supporting native extensions refers to not having any explicit way to build gems with native extensions across multiple platforms.

sayrer commented 3 years ago

My build generates a shared library (.bundle on macOS, .so elsewhere) native Ruby extension and then makes it a data dependency of a ruby_binary. Then, the ruby_binary loads the native module from its runfiles.
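As a rough illustration of the data-dependency arrangement described above, a minimal BUILD sketch might look like the following. It assumes the `@bazelruby_rules_ruby//ruby:defs.bzl` load path from these rules; the target name and script name are hypothetical and not taken from the linked BUILD file.

```starlark
# Sketch only: "example" and "example.rb" are hypothetical names.
load("@bazelruby_rules_ruby//ruby:defs.bzl", "ruby_binary")

ruby_binary(
    name = "example",
    srcs = ["example.rb"],
    main = "example.rb",
    # The Bazel-built shared library is listed as a data dependency, so it is
    # staged into the binary's runfiles and can be loaded from there at run time.
    data = [":twitter_text_shared_library"],
)
```

Because the shared library is itself a Bazel target, a change to its sources causes the binary's runfiles to be refreshed on the next build.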

kigster commented 3 years ago

Wow, this is a very good example! You probably know more about Bazel than I do 😀

Would love to have a small example like that with a genrule and a C++ library.

We could also just reference your BUILD file from our README as an example?

jrcasso commented 9 months ago

@sayrer I could really use your help. Can you explain this a bit further to someone who's new to Bazel? I'm also less familiar with platform architectures in relation to the set of problems bazel is trying to solve; it's my understanding that bazel is intended to make the artifact creation process (read: whether that's a compiled binary or whatever) idempotent.

I do understand that native extensions pose a challenge to bazel because ruby gems with native extensions rely on dynamically linked shared object libraries that have been compiled for a specific architecture. But what I'm lost on is how bazel necessitates that these native extensions are available for the toolchain to compile ruby idempotently in isolation, irrespective of the host architecture. Just to make any misconceptions I might have more obvious, I'll ask questions more formally:

  1. This usage of data dependencies, as I understand the intent from documentation, violates idempotency:

    A build target might need some data files to run correctly. These data files aren't source code: they don't affect how the target is built.

but if the artifact compiled by bazel makes references to a compiled library on the host system, the "sandbox" is leaking to host-compiled dependencies. My understanding is that a host that has compiled a native extension may have other dependencies that bazel is not aware of, thus breaking the idempotency of the overall build.

  2. Is it actually true that bazel can build artifacts for many architectures even if the build is running on a specific architecture?

Any help you can provide would be appreciated. Really, even if it's a one-sentence response, it'll help me dig.

sayrer commented 9 months ago

  1. This usage of data dependencies, as I understand the intent from [documentation]

If you mean this line: data = [":twitter_text_shared_library"],

then it's possible to make this pretty tight. That is an .so file built by Bazel. So if I change the Rust/C++ that make up that library from another directory, the Ruby binary will require a rebuild (probably just a file copy here). I don't think this project bothers to make the C/C++ toolchain really hermetic, but it certainly could. See for example:

https://github.com/buildbuddy-io/buildbuddy-toolchain

In general, practicing these concepts on a free BuildBuddy account is a good way to go. They provide a lot of dashboards and things so you can see how things are working. I think this project has a few references to host binaries because I was hitting bugs with the sandbox, but I think they'll probably be fixed now.

  2. Is it actually true that bazel can build artifacts for many architectures even if the build is running on a specific architecture?

Any help you can provide would be appreciated. Really, even if it's a one-sentence response, it'll help me dig.

Yeah, it's covered here: https://bazel.build/extending/platforms

You can build for Linux-ARM64 on Linux-x64, for example. But you can't build every platform from every platform (e.g. macOS and iOS require an Apple host).
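For instance, a target platform for the Linux-ARM64 case could be declared roughly like this. This is a sketch, not something from the linked project: the platform name is hypothetical, and it only helps if a C++ toolchain that can target this platform is registered in the workspace.

```starlark
# Sketch only: "linux_arm64" is a hypothetical name.
platform(
    name = "linux_arm64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)
```

A build would then select it with something like `bazel build --platforms=//:linux_arm64 //some:target`, where `//some:target` stands in for whatever you are building.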

But what I'm lost on is how bazel necessitates that these native extensions are available for the toolchain to compile ruby idempotently in isolation, irrespective of the host architecture.

Ah, so when you initialize the workspace, it will compile a copy of Ruby (or it can use the host) [edit: it looks like these rules always depend on a host version, but I bet you could make it work via rules_nix as used in this project for SWIG]. So, there will be a copy of Ruby in your bindir that depends only on what's in your workspace, and a copy of those headers. These directories are available for each platform in debug and release mode.

See here:

cc_library(
    name = "twitter_text_lib",
    srcs = [
        ":twitter-text-ruby.cpp",
        "//rust_bindings/cpp:twitter_text_h",
        "//rust_bindings/cpp:twitter_text_cpp",
    ],
    deps = [
        "@org_ruby_lang_ruby_toolchain//:headers",
        "//rust_bindings/cpp:twitter_text",
    ],
    tags = ["manual"],
    linkstatic = True,
    alwayslink = True,
)

See how it refers to Bazel's copy of the Ruby headers in deps?

Then, we decide how to build the shared library based on the target platform.

config_setting(
    name = "requires_bundle",
    constraint_values = ["@platforms//os:osx"],
)

filegroup(
    name = "twitter_text_shared_library",
    srcs = select({
        ":requires_bundle": ["twittertext.bundle"],
        "//conditions:default": ["twittertext.so"],
    }),
)

So, this select decides which file to build for the target platform, and that will require C++/Rust builds for the target architecture. Basically, it works backward from the target binary's requirements. You'd get the default condition there for all Linux architectures, but it would build it for both x64 and ARM if you were trying to do both.
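The snippets above don't show the rule that actually produces twittertext.so. One common way to do that in Bazel, offered here as an assumption rather than a copy of the linked BUILD file, is a cc_binary with linkshared set:

```starlark
# Sketch only: not taken from the linked BUILD file.
cc_binary(
    name = "twittertext.so",
    # alwayslink = True on :twitter_text_lib keeps all of its object files in
    # the final link, including the extension's entry point.
    deps = [":twitter_text_lib"],
    # Produce a shared library instead of an executable.
    linkshared = True,
)
```

The macOS twittertext.bundle file would need a similar rule (or a copy step) of its own.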