bufbuild / protobuf-es

Protocol Buffers for ECMAScript. The only JavaScript Protobuf library that is fully-compliant with Protobuf conformance tests.
Apache License 2.0
1.16k stars 70 forks source link

Provide support for top-level exports #455

Open smaye81 opened 1 year ago

smaye81 commented 1 year ago

This is to open discussion about the implementation of top-level exports when generating JS/TS code via the plugin framework.

Problem

Specifically, what we're exploring is that given the following:

package foo;

message Foo {
  string foo_field = 1;
}
package bar;

message Bar {
  int32 bar_field = 1;
}

We would provide the option of generating:

.
├── index.js
├── package.json
└── proto
    ├── foo
    │   ├── foo_pb.d.ts
    │   └── foo_pb.js
    └── bar
        ├── bar_pb.d.ts
        └── bar_pb.js

where index.js looks something like:

export { Foo } from "./proto/foo/foo_pb.js";
export { Bar } from "./proto/bar/bar_pb.js";

This functionality would ostensibly be opt-in through a plugin option of something like generateIndex or something similar.

Benefits

This would provide a few benefits (note that below benefits would also require associated changes to / creation of a package.json file.

Difficulties

There are difficulties associated with this however. The biggest being how we handle name clashes with exported symbols.

Since Protobuf has namespaces, it is perfectly acceptable to have two messages (or services or enums) with the same name in different namespaces. For example, consider:

package foo.v1;

message Foo {
  string foo_field = 1;
}
package foo.v2;

message Foo {
  int32 bar_field = 1;
}

Now the generation of the index.js file becomes much more complicated because this wouldn't work:

export { Foo } from "./proto/foo/v1/foo_pb.js";
export { Foo } from "./proto/foo/v2/foo_pb.js";

So we would have to somehow alias one of these exports to something like:

export { Foo } from "./proto/foo/v1/foo_pb.js";
export { Foo as v2_Foo } from "./proto/foo/v2/foo_pb.js";

Even this solution is not straightforward and is further complicated by the following:

Potential Solutions

Note that all of these solutions only apply when the option to generate top-level exports is true.

  1. Alias all the things all the time, using the fully-qualified package name as the alias.
  2. Only alias when a conflict occurs and further, only alias the conflict using the smallest possible package path (see example above with v2_Foo.
  3. Throw an error at generation time if a conflict is detected, alerting the user which types conflict. (this is probably not a great idea, but listed here for visibility).
  4. Only generate top-level exports within a package. This would eliminate the conflict issue altogether as exported symbols within a package are guaranteed not to clash. However, as one might have guessed, this also has its problems:

    • This would need subpath exports defined in package.json, not just a plain exports field since we would end up with multiple entrypoints per path. We would also most likely have to specify types fields as well in this subpath export map.
    • Subpath patterns may or may not be possible here, we would have to test different versions of TypeScript, Node.js, and popular bundlers to get some confidence that it works well enough.
    • This will only work for ESM generated code, not for CJS code that the protocolbuffers/js and grpc/web plugins generate.
    • Since npm artifacts are expected to be stable (i.e. not change without a version bump), we have to introduce this feature in remote packages by bumping the version number in the registry URL (https://buf.build/gen/npm/v1/). If we discover an incompatibility later, we have to bump the registry URL version again for a fix or rollback.
    • All this really provides is that users would no longer need the filename in the import path, which seems like a small ROI. So
      import { Foo } from "@buf/bufbuild_foo.bufbuild_es/proto/foo"

      instead of

      import { Foo } from "@buf/bufbuild_foo.bufbuild_es/proto/foo/foo_pb.js"
onyxraven commented 11 months ago

I've just been exploring this kind of issue, it doesn't look like it's received much attention.

We currently have a package generation process for both protobuf.js and the protoc-gen versions that, post generation, walks the directory tree and generates index files that look similar to:

.
├── index.js
├── package.json
└── proto
    ├── foo
    │   ├── foo_pb.d.ts
    │   ├── foo_pb.js
    │   └── index.js
    └── bar
        ├── bar_pb.d.ts
        ├── bar_pb.js
        └── index.js

the 'leaf' index.js look like

module.exports = {
  bar_pb: require("./bar_pb"),
};

and the folder index.js (for each level of package namespace)

module.exports = {
  foo: require("./foo"),
  bar: require("./bar"),
};

(or some mix inbetween, plus their corresponding d.ts files).

This results in being able to do

import { foo, bar } from 'package';
something(foo.foo_pb.Foo);
something(bar.bar_pb.Bar);

The addition of the 'filename+_pb' here always kind of bugged me (I thought it would be nice to be able to use foo.Foo here) and I started exploring using the bufbuild/protogen package, but the trick is that buf itself (and maybe protoc) executes the generator at a per-packagename/directory level, so backing into the common 'upper level' packages to create the indexes isn't real possible except by some post-processor. Perhaps theres something in between to be used.

In any case, so far, a post-process script that walks the structure seems to be the answer right now. For what its worth, I haven't found any other language generator that does produce the packaging artifacts or provide 'meta' like this, either.

SpareShade commented 10 months ago

Hi,

Without going into the detail and depth of the original proposal, and only referring to the issue we are having, my vote is to:

1. Alias all the things all the time, using the fully-qualified package name as the alias.

I'll give a simple example where we were forced to change message names, in order to avoid import clashes.

package valueobjects;

// Message representing postal address value object
message PostalAddress {
  string street = 1;
  string city = 2;
  ...
}
package resources;

import "valueobjects.proto";

// Message representing postal address entity
message PostalAddress {
  string id= 1;
  valueobjects.PostalAddress postal_address = 2;
  ...
}

when complied we get

import { PostalAddress } from "../../../../valueobjects_pb.js";

/**
 * Postal address message.
 *
 * @generated from message resources.v1.PostalAddress
 */
export const PostalAddress = proto3.makeMessageType(
  "resources.v1.PostalAddress",
  () => [
    { no: 1, name: "id", kind: "scalar", T: 9 /* ScalarType.STRING */ },
    { no: 2, name: "postal_address", kind: "message", T: PostalAddress },
  ],
);

which fails with error

Identifier 'PostalAddress' has already been declared

The above messages compile just fine in go .

Having to change message names to avoid import/export name clashes for JS feels penalising.

Thank you for opening this conversation.

jcready commented 10 months ago

The PostalAddress name clash error seems like an unrelated bug that should have its own GH issue and should be fixed. The import should be aliased in the compiled code.

timostamm commented 10 months ago

@SpareShade, echoing what jcready said: This sounds like a bug, regardless of top-level exports. Can you open an issue?

timostamm commented 10 months ago

@SpareShade, the collision is fixed by #688 in v1.7.2.

onyxraven commented 10 months ago

Thanks for the fix @timostamm !

Back to the original ask,

I have the start of a protoc plugin (using the bufbuild toolkit) that can generate the indexes with exports. I'm working through corporate approval to move it to our opensource org, and will follow up here. It does require adding strategy: all to the plugin in buf. Currently it only creates the .ts files (the autogenerate of js/dts was not working, so I still need to build the functions/templates to create those).

SpareShade commented 9 months ago

thank you @timostamm & @jcready ... my apologies for being slow/polluting the thread