google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library
https://flatbuffers.dev/
Apache License 2.0
22.52k stars 3.19k forks source link

[Rust] Add better support for "crate-per-schema" #8273

Open adsnaider opened 3 months ago

adsnaider commented 3 months ago

Currently, flatc allows using --rust-module-root-file and --gen-all to generate multiple schemas into a single crate with a top-level mod.rs. This is good but makes it really hard to use in many contexts since the best (only?) option to have inter-dependent schemas is to generate everything together into a single crate.

Ideally, we can generate each schema independently of the includes (as each include will be its own generated crate), and link them all at build time.

adsnaider commented 3 months ago

I don't particularly care about doing this through flatc directly. We use Bazel downstream and I have some patches to implement this, but we're somewhat behind master so it might take me a bit of time to send a PR. It is on my radar though.

The big thing that needs to change is the use per module need to be more specific, for instance:

// foo.fbs
namespace foo;
// ... some definitions
// foobar.fbs
include "foo.fbs"
namespace foo.bar;
// ... some definitions

The generated code should be like

// foo_generated.rs

use foo_generated::*;

pub mod foo {
  use foo_generated::foo::*; // As opposed to currently `foo_generated::*;

  pub mod bar {
    // No need to include foo_generated here since it doesn't live in `foo.bar`
  }
}

I think the correct approach is to include [some_generated_dep]::[some_module_tree]::* if and only if [some_module_tree] is a subset of [some_generated_dep]'s namespace. I'm hoping someone will tell me if that doesn't work on every case :)

adsnaider commented 2 months ago

Unfortunately, this doesn't work when the imports are ambiguous. For instance:

foo.fbs -> namespace a1.b1.c1 - includes bar and baz
bar.fbs -> namespace a1.b2.c1
baz.fbs -> namespace a1.b2.c1

This will result in bar::a1::* and baz::a1::* being imported into the same a1 module which result in an ambiguous include of b2.

We will probably have to change the code generation to use absolute paths instead for our use case. Would love to upstream our work once it's ready. Would this be a reasonable contribution or would this change to absolute paths break other use cases?