frequenz-floss / frequenz-api-common

Shared protobuf definitions and Python bindings for Frequenz APIs
https://frequenz-floss.github.io/frequenz-api-common/
MIT License
1 stars 10 forks source link

Document rules on the protobuf directory/file structure #186

Open llucax opened 5 months ago

llucax commented 5 months ago

What's needed?

We need to make sure that certain rules are maintained when adding more files to the repository to make sure that the design of the different languages bindings/wrappers don't break.

Here is a (non exhaustive) list of rules we observed lately:

Proposed solution

Document these rules in the README or CONTRIBUTING guide.

tiyash-basu-frequenz commented 5 months ago

Looks good overall. I'd call these as general guidelines for structuring protobuf messages.

I have a few comments:

Every directory should contain a file named the same as the directory, which should be the main "entry point". For example, if there is only one file in x/y/z/ then it should be named x/y/z/z.proto (it would contain the message M)

This is not a rule we follow. The existing files following these patterns are not entry points (calling them such could mislead protobuf-beginners), but just messages that are not specific enough to get a file of their own. Also, there is no rule that these files should be present (example). It should rather depend on the composition of the protobuf messages. I do not think we should add this. If people really want a guideline like this, then I'll propose having one file per protobuf message/enum.

If there are more files in x/y/z/, then x/y/z/z.proto can import any other files in x/y/z/ but not the other way around (x/y/z/a.proto should never import x/y/z/z.proto).

Given my feedback about the previous point, I'd suggest rewording this: "If there is a file named x/y/z/z.proto, then x/y/z/z.proto is allowed to import any other files ..."

Not sure if they are allowed at all in protobuf, but cyclic files dependencies should be avoided.

Not sure if protolint catches those, but rust or python code-generation would certainly fail. We need to ensure that the CI has such tests.

llucax commented 5 months ago

This is not a rule we follow. The existing files following these patterns are not entry points (calling them such could mislead protobuf-beginners), but just messages that are not specific enough to get a file of their own.

I see, makes sense, feel free to update the description of the issue to put that instead.

I do not think we should add this. If people really want a guideline like this, then I'll propose having one file per protobuf message/enum.

I don't think this is necessary, we would end up in java :P I think it makes sense to leave it to the developer's discretion.

This approach still matches well Python module system, I misunderstood the structure a bit, I thought only the stuff in package/package.proto belonged to the protobuf package package, and all other files in package/ had their own sub-package, like package/blah.proto was in package package.blah. If inside a directory everything is flat, then in python it would be the equivalent of a module, and if we want to keep the same file separation structure we can always map package/file.proto to package/_file.py and then have package/__init__.py just exporting all the public symbols in package package.

So for me it makes perfect sense as it is an we can update the docs in the issue to reflect that.

Given my feedback about the previous point, I'd suggest rewording this: "If there is a file named x/y/z/z.proto, then x/y/z/z.proto is allowed to import any other files ..."

Makes sense. Even more, I wonder if it makes sense to have the restriction of only x/y/z/z.proto being the only file that can import stuff from other files in x/y/z/z, since it is not really special, maybe it's enough to mention there shouldn't be circular dependencies inside a package/directory.

Not sure if protolint catches those, but rust or python code-generation would certainly fail. We need to ensure that the CI has such tests.

I guess you mean actually somehow parsing the proto files and looking for circular dependencies? Because we want to remove the generation from the proto/spec repo, so we shouldn't get those in this repo anymore in a near future.

tiyash-basu-frequenz commented 5 months ago

I don't think this is necessary, we would end up in java :P

indeed :)

Even more, I wonder if it makes sense to have the restriction of only x/y/z/z.proto being the only file that can import stuff from other files in x/y/z/z, since it is not really special, maybe it's enough to mention there shouldn't be circular dependencies inside a package/directory.

Fair enough. So this point can be omitted as well.

I guess you mean actually somehow parsing the proto files and looking for circular dependencies? Because we want to remove the generation from the proto/spec repo, so we shouldn't get those in this repo anymore in a near future.

No, I just meant generating either rust or python files in tests to ensure nothing breaks. I do not feel confident relying on protolint for stuff like this. So we could just generate rust or python code in tests. This code would be throwaway, but it's inexpensive and it would provide better reliability.

llucax commented 5 months ago

No, I just meant generating either rust or python files in tests to ensure nothing breaks. I do not feel confident relying on protolint for stuff like this. So we could just generate rust or python code in tests. This code would be throwaway, but it's inexpensive and it would provide better reliability.

I see, I'm not sure how inexpensive it is, as for rust the toolchain would have to be installed and compilation might take some time, and for Python I'm not sure if we just import everything in one module and run python on it if it would be enough to detect cycles (probably yes, but I'm not 100% sure), but I agree it would be good to have some sanity checks.

What I don´t like much about this approach is that we'll be pulling in again a lot of unrelated lang-specific stuff (even if it is only for testing) into the repo that we wanted to actually remove.

Maybe we should check if there isn't any built-in check for this in protoc, maybe we only need to check protoc runs.

Anyway, I guess it is something to investigate in the future, for the scope of this issue I would just document that cyclic dependencies should be avoided.

tiyash-basu-frequenz commented 5 months ago

Maybe we should check if there isn't any built-in check for this in protoc, maybe we only need to check protoc runs.

Exactly. We do not need to set up any project for this. We can just run protoc to generate the code (it doesn't even have to be rust or python). This should be a very simple addition to the CI.

llucax commented 5 months ago

I updated the issue description.