bazel-contrib / SIG-rules-authors

Governance and admin for the rules authors Special Interest Group
https://bazel-contrib.github.io/SIG-rules-authors/
Apache License 2.0

WIP: add rules-keeper #78

Open ashi009 opened 1 year ago

ashi009 commented 1 year ago

DO NOT SUBMIT

This CL implements rules-keeper, a GitHub App, to make rule authors' workflow smoother. For now, it only collects data and presents it on the ruleset catalog.

As of this CL:

To try this out:

With GitHub App:

With Personal Access Token:

Updates #53

ashi009 commented 1 year ago

@kormide PTAL

kormide commented 1 year ago

We should make a new repository on the SIG account so we can start landing this in smaller chunks. @alexeagle Do we need a vote from the SIG members before we create one?

kormide commented 1 year ago

nit: I might name the proto folder schema since protobuf is just an implementation detail.

kormide commented 1 year ago

I'm not super familiar with protobuf. For the rules_ts data you pulled, are the csv and METADATA files a serialization of the protobuf data? I always thought that protobuf was transferred (and stored) in some kind of binary format?

kormide commented 1 year ago

Overall it's looking good so far. It might be good to start incrementally building out the interface as you build the schema to verify that we have all the information we'll need.

ashi009 commented 1 year ago

> I'm not super familiar with protobuf. For the rules_ts data you pulled, are the csv and METADATA files a serialization of the protobuf data? I always thought that protobuf was transferred (and stored) in some kind of binary format?

I figured that CSV is a compact way of storing time series: it's simple and git-friendly (packfiles). We can also plot the data with existing tools, say gnuplot, which will make our next step easier.

The METADATA file holds the metadata for the corresponding CSV file, and its schema is defined in protobuf. The sole reason for using protobuf is to make marshaling/unmarshaling easier. Protobuf's wire format is binary, but it also has a text format, which is human-readable and git-friendly. It's entirely possible to store the time series in proto as well, but we would end up with bloated text files.

That's why I ended up with a combination of the two.
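To illustrate why an append-only CSV stays small and diff-friendly, here is a minimal sketch in Python. It is not the code from this CL: the column names (`date`, `stars`, `open_issues`) and the helper names `append_samples` and `read_series` are hypothetical, made up for the example.

```python
import csv
import io

# Hypothetical column layout; the real files in this CL may differ.
COLUMNS = ["date", "stars", "open_issues"]


def append_samples(buf, samples):
    """Append one CSV row per sample, writing the header only if buf is empty."""
    writer = csv.writer(buf, lineterminator="\n")
    if buf.tell() == 0:
        writer.writerow(COLUMNS)
    for sample in samples:
        writer.writerow([sample[column] for column in COLUMNS])


def read_series(text, column):
    """Return (date, value) pairs for one column, ready to hand to a plotting tool."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["date"], int(row[column])) for row in reader]
```

Each new sample adds exactly one line to the file, so a commit that records a day's data touches a single line, and a tool like gnuplot can read the file without any conversion step.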

ashi009 commented 1 year ago

> nit: I might name the proto folder schema since protobuf is just an implementation detail.

It's something of a convention to put proto files in a proto directory. It's an implementation detail for sure, but this directory name will leak into import paths and package names, say github.com/.../rulekeeper/proto and com.bazel.contrib.rulekeeper.rulesets.proto. Having proto in the name lets developers (even Copilot) know what's inside without reading the generated code.

Once I introduce rules_go to the repo, I'll remove the generated code. That will make the proto directory even more important, as the generated source won't even exist in the tree.