[Discuss] Could bindings generators live in separate crates?

rfk commented 3 years ago

For our current stage of development, I like the way that we currently bundle the bindings generators for all languages as part of the main uniffi_bindgen crate. It simplifies testing and it (mostly) ensures that we don't leave any languages behind. But I wonder if it's the right thing long term:

- As we've seen with the Gecko backend, some integrations are very hard to test in CI. If the Gecko backend was able to live in mozilla-central and be tested as part of Firefox CI then I expect that would be simpler.
- We currently include a Python backend, but it's falling behind, and I kind of get the impression it's hard to prioritize reviews for that code because it's not on the critical path for our consumers. We might be better off if I moved that into a separate repo and maintained it as a "passion project".
- If we get more contributors, they might be coming from a different language and have their own set of needs, and they might find it hard to add an additional backend and keep it up to date, similar to the Python one above.

Other projects in the Rust ecosystem appear to have had some success in externalizing specific backends into separate crates, e.g. serde. What would it look like for us to take a similar approach, and when (if ever) would it be worth the costs of doing so?

┆Issue is synchronized with this Jira Task ┆Issue Number: UNIFFI-26

jhugman commented 3 years ago

I agree with all of your above; especially around Gecko, and backends from contributors.

It might be worth writing down the necessary pre-conditions before we'd consider doing this:

- finalize the intermediate representation between parser and bindings that we can support in a backwards compatible manner, i.e. ComponentInterface and friends.
- finalize the shape of the generated Rust scaffolding that new languages will be calling, e.g. errors, strings, destructors, handles, etc.
- document the binary protocol for packing and unpacking of records, sequences, optionals, etc.
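To make the last point concrete, here is a minimal sketch of what "packing and unpacking" an optional string could look like. The layout used here (big-endian integers, an i32 length prefix for strings, a single tag byte for optionals) is an assumption chosen for illustration, and the function names are hypothetical; the real format is whatever the project ends up documenting.

```rust
// Illustrative only: the byte layout below is an assumption, not a spec.

/// Write a string as an i32 big-endian byte length followed by UTF-8 bytes.
fn write_string(buf: &mut Vec<u8>, value: &str) {
    buf.extend_from_slice(&(value.len() as i32).to_be_bytes());
    buf.extend_from_slice(value.as_bytes());
}

/// Write an optional string as a tag byte (0 = absent, 1 = present),
/// followed by the string itself when present.
fn write_optional_string(buf: &mut Vec<u8>, value: Option<&str>) {
    match value {
        None => buf.push(0),
        Some(s) => {
            buf.push(1);
            write_string(buf, s);
        }
    }
}

/// Split `n` bytes off the front of the cursor, or fail if too short.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    if buf.len() < n {
        return None;
    }
    let (head, tail) = buf.split_at(n);
    *buf = tail;
    Some(head)
}

/// Read a big-endian i32 from the cursor.
fn read_i32(buf: &mut &[u8]) -> Option<i32> {
    let b = take(buf, 4)?;
    Some(i32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read a length-prefixed UTF-8 string from the cursor.
fn read_string(buf: &mut &[u8]) -> Option<String> {
    let len = read_i32(buf)?;
    if len < 0 {
        return None;
    }
    let bytes = take(buf, len as usize)?;
    String::from_utf8(bytes.to_vec()).ok()
}

/// Read an optional string; the outer Option signals a malformed buffer.
fn read_optional_string(buf: &mut &[u8]) -> Option<Option<String>> {
    match take(buf, 1)?[0] {
        0 => Some(None),
        1 => Some(Some(read_string(buf)?)),
        _ => None,
    }
}
```

Every backend would need a matching pair of routines like these in its target language, which is why the format has to be written down before backends can live elsewhere.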

FWIW: I think as we get closer to maturity for the shape of how backends should be written, the Python bindings will be easier to review, test and keep up to date. I'm optimistic that, now that test_rondpoint.py is running, adding the third backend will be considerably faster than the first and second.

rfk commented 3 years ago

I've been thinking a bit more about this in the context of #435 and the resulting Ruby backend. I think I was actually over-complicating it in my head a bit, and we more-or-less have all the pieces that we'd need to start restructuring things in this way.

Currently the uniffi-bindgen executable has three main jobs:

- When run as uniffi-bindgen scaffolding, it reads the .udl file and generates the corresponding Rust scaffolding. This currently does not depend on any details of the intended foreign-language consumer(s).
- When run as uniffi-bindgen generate -l $LANG, it reads the .udl file and generates the corresponding bindings for the given foreign language. Importantly, this is based only on the contents of the .udl file and language-specific config.
- When run as uniffi-bindgen test, it figures out how to run a test script for a specific foreign-language binding. This is mostly intended for use by automated tests produced by macros from the uniffi::testing support crate and not for calling by hand.

We can leave the first of those jobs intact in the main uniffi_bindgen crate, and move the other two into language-specific crates that take a cargo dependency on uniffi_bindgen. Perhaps organized something like the following, although of course this is just a rough sketch.

The Consumer View

For running on the command-line, users would need to install both the uniffi_bindgen crate as well as e.g. uniffi_bindgen_kotlin and uniffi_bindgen_swift. These would install corresponding binaries named uniffi-bindgen, uniffi-bindgen-kotlin and uniffi-bindgen-swift, by analogy to how cargo sub-commands work.

When invoking uniffi-bindgen generate -l $LANG, the main executable would forward its command-line arguments on to uniffi-bindgen-$LANG, perhaps lightly normalized for simplicity. It's just a direct pass-through execution and otherwise works exactly like the current all-in-one executable. We could add a special command-line flag to assert compatibility of uniffi versions, so that a call like this:

uniffi-bindgen generate --language kotlin ./path/to/my.udl

Would in turn shell out to the kotlin-specific backend like:

uniffi-bindgen-kotlin generate --uniffi-version="v0.11.0" --out-dir=resolved/output/dir --config-path=resolved/config/path ./path/to/my.udl
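As a rough sketch of that pass-through, the main executable could build the backend's binary name and argument list and then spawn it with std::process::Command. The function names here are hypothetical, and the flags simply mirror the example invocation above; none of this is an existing CLI.

```rust
use std::ffi::OsString;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Name of the per-language executable, by analogy with cargo sub-commands.
pub fn backend_binary(language: &str) -> String {
    format!("uniffi-bindgen-{language}")
}

/// Build a `--name=value` argument without lossy path conversion.
fn path_flag(name: &str, value: &Path) -> OsString {
    let mut arg = OsString::from(format!("--{name}="));
    arg.push(value);
    arg
}

/// Arguments forwarded to the backend (flag names are assumptions
/// taken from the example command line in this comment).
pub fn backend_args(
    uniffi_version: &str,
    out_dir: &Path,
    config_path: &Path,
    udl_file: &Path,
) -> Vec<OsString> {
    vec![
        OsString::from("generate"),
        OsString::from(format!("--uniffi-version={uniffi_version}")),
        path_flag("out-dir", out_dir),
        path_flag("config-path", config_path),
        udl_file.as_os_str().to_os_string(),
    ]
}

/// Shell out to the language-specific executable and wait for it to finish.
pub fn forward_generate(
    language: &str,
    uniffi_version: &str,
    out_dir: &Path,
    config_path: &Path,
    udl_file: &Path,
) -> std::io::Result<ExitStatus> {
    Command::new(backend_binary(language))
        .args(backend_args(uniffi_version, out_dir, config_path, udl_file))
        .status()
}
```

If the backend binary is not installed, Command::status returns an io::Error, which the main executable could turn into a "please install uniffi_bindgen_$LANG" message.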

A similar setup could work for test scripts, although we'd have to think a bit about the command-line API surface for that one. When it sees the --uniffi-version flag, the language-specific bindgen program would be expected to check for compatibility with that version of the crate and error out if there's a mismatch, similar to what the Rust scaffolding already does.
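A minimal sketch of such a check on the backend side might look like the following. The compatibility rule is an assumption (same minor version while on 0.x, same major version afterwards), as are the function names; what actually counts as "compatible" would need to be decided by the project.

```rust
/// Parse "v0.11.0" or "0.11.0" into (major, minor, patch).
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let s = s.strip_prefix('v').unwrap_or(s);
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Compare the version passed via --uniffi-version with the version the
/// backend was built against, and produce an error message on mismatch.
fn check_compatible(requested: &str, own: &str) -> Result<(), String> {
    let req = parse_version(requested)
        .ok_or_else(|| format!("invalid --uniffi-version: {requested}"))?;
    let mine = parse_version(own)
        .ok_or_else(|| format!("invalid backend version: {own}"))?;
    // Assumed rule: 0.x releases must agree on the minor version,
    // later releases only on the major version.
    let compatible = if req.0 == 0 {
        mine.0 == 0 && req.1 == mine.1
    } else {
        req.0 == mine.0
    };
    if compatible {
        Ok(())
    } else {
        Err(format!(
            "uniffi version mismatch: caller is {requested}, backend supports {own}"
        ))
    }
}
```

A backend would call check_compatible with its own compiled-in version before reading the .udl file, and exit with a non-zero status if it returns an error.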

The user doesn't have to know about any of that, however - they just have to know to install the backends they want.

For users who want to integrate into a broader build system rather than installing the tools at the system-level, we could support an extension of the pattern used in application-services. The consumer would need to make a wrapper crate that depends on all three of uniffi_bindgen, uniffi_bindgen_kotlin and uniffi_bindgen_swift, and stitch together their exposed public APIs to make a combined binary. They might end up making a little mini wrapper crate that looks like:

fn main() -> anyhow::Result<()> {
    uniffi_bindgen::run_main((
        uniffi_bindgen_kotlin::KotlinBackend,
        uniffi_bindgen_swift::SwiftBackend,
    ))
}

And then executing that via cargo run as part of their build. It would behave just like running uniffi-bindgen on the command-line, but limited to the specified backends.
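For what it's worth, accepting a tuple of backends is doable with a small helper trait implemented for tuples. The sketch below is a toy version under several assumptions: the Backend and BackendSet traits, the run_main signature, and the string-returning generate method are all hypothetical stand-ins, and errors are plain strings rather than anyhow so that it only needs std.

```rust
/// Hypothetical minimal interface a backend exposes to the combined binary.
pub trait Backend {
    fn language(&self) -> &'static str;
    fn generate(&self, udl_path: &str) -> Result<String, String>;
}

/// A collection of backends that can be searched by language name.
pub trait BackendSet {
    fn find(&self, language: &str) -> Option<&dyn Backend>;
}

/// Two-element tuples; a real crate would likely generate impls for
/// other tuple sizes with a macro.
impl<A: Backend, B: Backend> BackendSet for (A, B) {
    fn find(&self, language: &str) -> Option<&dyn Backend> {
        if self.0.language() == language {
            Some(&self.0 as &dyn Backend)
        } else if self.1.language() == language {
            Some(&self.1 as &dyn Backend)
        } else {
            None
        }
    }
}

/// Dispatch a (simplified) command line to whichever backend matches.
pub fn run_main<S: BackendSet>(backends: S, args: &[&str]) -> Result<String, String> {
    match args {
        ["generate", "--language", language, udl_path] => {
            let backend = backends
                .find(language)
                .ok_or_else(|| format!("no backend compiled in for {language}"))?;
            backend.generate(udl_path)
        }
        _ => Err("usage: generate --language <LANG> <UDL>".to_string()),
    }
}

// Toy backends standing in for the real Kotlin and Swift generators.
pub struct KotlinBackend;
pub struct SwiftBackend;

impl Backend for KotlinBackend {
    fn language(&self) -> &'static str { "kotlin" }
    fn generate(&self, udl_path: &str) -> Result<String, String> {
        Ok(format!("// Kotlin bindings for {udl_path}"))
    }
}

impl Backend for SwiftBackend {
    fn language(&self) -> &'static str { "swift" }
    fn generate(&self, udl_path: &str) -> Result<String, String> {
        Ok(format!("// Swift bindings for {udl_path}"))
    }
}
```

Because the set of backends is fixed at compile time, asking this binary for a language it was not built with fails with a clear error instead of trying to shell out.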

This could actually be a small concrete win for the application-services build setup, because we wouldn't need to compile the Python or Ruby or other backends as part of the application-services build.

Notably absent here is the need for any sort of serialized internal representation to be passed between the executables - everything they need to know is already in the .udl file, and the --uniffi-version flag would ensure that they interpret it in the same way.

The Developer View

To make writing language backends as simple as possible, we would try to keep as much infrastructure as possible in the uniffi_bindgen crate, and have each uniffi_bindgen_$LANG crate depend on it for core functionality. Perhaps something like:

The main uniffi_bindgen crate provides:
- The ComponentInterface data structures as officially-supported public API.
- The implementation of the scaffolding command and its related templates.
- The main.rs implementing the uniffi-bindgen command as described above.
- A public trait ForeignLanguageBackend designed to be implemented by language-specific crates. This would broadly mirror the shape of the top-level modules in the current uniffi_bindgen::bindings module, with methods like:
  - ::new(config) for creating an instance of the backend from config data
  - ::generate(&self, ci: &ComponentInterface) -> Result<FileTreeThingy> for generating the bindings from a parsed component interface into some in-memory format
  - ::run_script(&self, current_dir: &Path, script_file: &Path) -> Result<()> for running a test script
- A function that takes a generic <B: ForeignLanguageBackend> and does all the plumbing to execute the command-line tool for a specific language backend.
- Some thorough docs on how you're supposed to translate a ComponentInterface into code.
Each uniffi_bindgen_$LANG crate provides:
- A concrete implementation of ForeignLanguageBackend targeting that language
  - All the templates etc necessary to implement it.
  - All the logic for running test scripts etc in that language.
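To give a feel for the split described above, here is a self-contained sketch of the trait and one toy implementation. Everything in it is an assumption made for illustration: ComponentInterface is a two-field stand-in for the real data structure, FileTree plays the role of "FileTreeThingy", errors are plain strings, and ToyPythonBackend only emits stub functions.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Stand-in for the real ComponentInterface (assumption: heavily simplified).
pub struct ComponentInterface {
    pub namespace: String,
    pub functions: Vec<String>,
}

/// In-memory output: relative file path mapped to file contents.
pub type FileTree = BTreeMap<PathBuf, String>;
pub type BackendResult<T> = Result<T, String>;

/// Would live in the main uniffi_bindgen crate.
pub trait ForeignLanguageBackend: Sized {
    type Config;
    fn new(config: Self::Config) -> Self;
    fn generate(&self, ci: &ComponentInterface) -> BackendResult<FileTree>;
    fn run_script(&self, current_dir: &Path, script_file: &Path) -> BackendResult<()>;
}

/// Would live in a language-specific crate such as uniffi_bindgen_python.
pub struct ToyPythonBackend {
    module_prefix: String,
}

impl ForeignLanguageBackend for ToyPythonBackend {
    type Config = String;

    fn new(config: String) -> Self {
        Self { module_prefix: config }
    }

    fn generate(&self, ci: &ComponentInterface) -> BackendResult<FileTree> {
        // Emit one stub per function; a real backend would render templates.
        let mut source = String::new();
        for name in &ci.functions {
            source.push_str(&format!("def {name}():\n    raise NotImplementedError\n\n"));
        }
        let file_name = format!("{}{}.py", self.module_prefix, ci.namespace);
        let mut files = FileTree::new();
        files.insert(PathBuf::from(file_name), source);
        Ok(files)
    }

    fn run_script(&self, current_dir: &Path, script_file: &Path) -> BackendResult<()> {
        // Assumes a python3 executable on PATH.
        let status = Command::new("python3")
            .arg(script_file)
            .current_dir(current_dir)
            .status()
            .map_err(|e| e.to_string())?;
        if status.success() {
            Ok(())
        } else {
            Err(format!("script exited with {status}"))
        }
    }
}

/// The generic plumbing the main crate could provide for any backend.
pub fn generate_with<B: ForeignLanguageBackend>(
    config: B::Config,
    ci: &ComponentInterface,
) -> BackendResult<FileTree> {
    B::new(config).generate(ci)
}
```

Returning an in-memory file tree rather than writing to disk keeps generate easy to test, and leaves decisions about output directories to the shared plumbing in the main crate.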

Perhaps we could also publish some of our testing crates in a way that the individual language backends could import and use them, spreading out e.g. test_coveralls.kts to live with the Kotlin backend, test_coveralls.swift to live with the Swift backend, etc.

One risk here is that we amplify the cost of breaking changes, and particularly of breaking changes in the "ABI" of how data gets lifted and lowered. My personal sense is that it will be worth the reduction in overall system complexity that we get from splitting things into more separable components.

mhammond commented 3 years ago

How would that plan fit with #416?

rfk commented 3 years ago

How would that plan fit with #416?

I think it would be OK, because the code for each backend doesn't need to actually parse the .udl file, it just needs to operate on a ComponentInterface. If we change uniffi_bindgen to be able to magic up a ComponentInterface directly from Rust code somehow, then each backend can get that for free by updating its dependency to the new version. That's a lot of hand-waving of course, but basically, I don't think this split would make #416 any harder than it already would be - uniffi-bindgen generate still needs to be able to slurp in a ComponentInterface definition from Rust code in either scenario.

tarikeshaq commented 2 years ago

Thinking of taking a shot at this for the hack week! I will drop updates here and in a possible draft PR / other repo for demo purposes...

artfuldev commented 2 years ago

@rfk @tarikeshaq I'm interested in the work around moving language binding generators into their own crates/repos. Is this something an external contributor (not from mozilla or uniffi-rs team) can pick up? Do we have to continue from #997 or can we start fresh? I'm super excited about uniffi-rs and the general idea of rust being a kind of a universal language and I have ideas for different flavors of languages etc (for example a functional bindings generator for kotlin which doesn't throw but returns Result<T, E>) and different languages. In the near term, I'll be needing a TypeScript bindings generator as well and for the pace of development, I'd like to ensure this work is completed first. If this is something you're open to, I'm willing to put in the time and energy towards this effort as this is super important to me. Please let me know how I can contribute.

Regarding testing, I think each binding generator can have its own test suite, and the community can decide whether a generator is well tested before using it. The current generators could also be moved out safely into repos owned by the same team, with links to those repos/crates listed in the README for reference, or they could be included by default if that is important (e.g. to guarantee test coverage).

Apologies if I'm talking too much, I'm just super excited about this.

artfuldev commented 2 years ago

I have a question - why do we call it 'backend'?

rfk commented 2 years ago

Hi @artfuldev, thanks for reaching out!

Is this something an external contributor (not from mozilla or uniffi-rs team) can pick up?

I should start out by saying, I'm no longer with Mozilla and am contributing much less to uniffi-rs these days, but I'm still pretty excited by the idea of allowing generators to live in separate crates and would be happy to help review things if you want to try implementing it. I expect there's still quite a bit of interest in this from the Mozilla folks.

Do we have to continue from #997 or can we start fresh?

I don't think it's necessary to start from #997 (especially as it seems a bit stale with merge conflicts) but it's probably worth taking a look at for some inspiration at least. A lot of #997 is about moving the existing Kotlin code-generator around, but if you ignore the Kotlin-specific parts, does what that PR is trying to do make sense (perhaps in conjunction with my comment above)?

I have ideas for different flavors of languages etc (for example a functional bindings generator for kotlin which doesn't throw but returns Result<T, E>) and different languages. In the near term, I'll be needing a TypeScript bindings generator as well

A different way to get started here may actually be to implement one of these backends directly in your own fork of UniFFI alongside its existing backends, as a way to just get more familiar with the codebase and what exactly is entailed in implementing a backend. Once you've got it working inline in the crate there, you'll probably have a good mental model of what would be required to pull the backends out into a separate crate.

bendk commented 2 years ago

I second what @rfk said: Mozilla folks are definitely interested in this and I think that plan to get started makes a lot of sense. If you're on element, feel free to reach out on the uniffi channel (I think it's #uniffi:mozilla.org).

"Backend" means either the bindings or scaffolding. I think it comes from a compiler metaphor: UniFFI is compiling the UDL code and the target output is the Rust/Kotlin/Python/... code.

artfuldev commented 2 years ago

@bendk , @rfk - thanks!

artfuldev commented 2 years ago

Just want to update here that the additional language bindings have fallen a little lower on the priority list for us. Right now we want to get the most out of the swift and kotlin bindings that already work so well (thanks for the amazing work on those). I will get back to this when the additional language support becomes a high priority item for us again - which can be in a few weeks.

tarikeshaq commented 2 years ago

I'll be re-picking this up, we have upcoming work that can benefit from the separation of the binding generators

SalvatoreT commented 5 months ago

@bendk, would it make sense to bring back https://github.com/mozilla/uniffi-rs/pull/1205 now that the versioning issue (https://github.com/mozilla/uniffi-rs/pull/1203) has been solved (assuming it has)?

bendk commented 5 months ago

Yes, I think something like that would be good, although I think we should start fresh rather than trying to resurrect 1205. I was just talking to @mhammond about this yesterday, we hope to do it at some point, but nothing is scheduled.

Is there a particular use case that splitting up the crates would accomplish?

SalvatoreT commented 5 months ago

I think separating out the language bindings generators would really help with 3rd party language support by making the official examples look more like 3rd party language crates. It'd also help make sure there aren't breaking API changes.

I'm trying to bring the Kotlin Multiplatform (https://gitlab.com/trixnity/uniffi-kotlin-multiplatform-bindings) UniFFI bindings up-to-date with the latest version. I'm still getting up-to-speed, so maybe I just don't understand everything fully yet.

mhammond commented 5 months ago

Thanks - that's exactly the context in which Ben and I were discussing it. So while we support this, as Ben said, it's not something we have on our short-term plate so would welcome contributions here!

mhammond commented 5 months ago

We'd also welcome other changes which make life easier for 3rd party bindings - eg, TargetLanguage is fairly hostile to them. It's not clear what this means in practice, but we are open to anything which makes sense here.

mhammond commented 5 months ago

I had another look at #1205, and I'm not quite sure I like that each binding is split into its own crate - I think I'd personally mildly prefer a single "uniffi_bindings" crate which held all the builtin languages. I think this would still offer most of the advantages mentioned above (ie, just not being in uniffi_bindgen seems more important than exactly how they are organized) and just seems that little bit cleaner in terms of what we need to publish etc. It wouldn't offer as fine-grained control over making dot-releases for a single binding, but I'm not sure how important that is (ie, a dot-release of a single crate with 4 bindings even if only 1 binding actually changed seems OK to me).

@bendk WDYT?

mozilla / uniffi-rs

[Discuss] Could bindings generators live in separate crates? #299
