cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com
Other
29.55k stars 3.7k forks source link

docs: auto-generated bnf files embed too much data #21081

Closed knz closed 6 years ago

knz commented 6 years ago

So here's a situation I'm encountering on #21063 with an external contributor: https://teamcity.cockroachdb.com/viewLog.html?buildId=458831&buildTypeId=Cockroach_UnitTests

This is really a reflection on commit 9126d2f1fa33668024fbbd071867b40b24035743.

Notice how this contributor is now getting a linter error on generated doc files even though he was careful to exclude the new statement from docs.

There are two issues here:

On the one hand, I agree that the SVG files were too brittle and too hard too re-generate exactly. However the change in commit 9126d2f1fa33668024fbbd071867b40b24035743 is causing too much data from the grammar to spill over to bnf files, in turn causing lint errors in places that really should not be causing them.

knz commented 6 years ago

The problem is worse than I thought. Just reordering the rules in the grammar file breaks things. See e.g. https://teamcity.cockroachdb.com/viewLog.html?tab=buildLog&buildTypeId=Cockroach_UnitTests_Check&buildId=459878

I think this is just not right. I worked hard to ensure sql.go wasn't checked in to the repo any more so that we don't have to deal with spurious diffs in generated files and merge conflicts. Now we have the same problem what the removal of sql.go was aiming to fix, just in a different area.

knz commented 6 years ago

Jordan just taught me what I need to do:

 make generate PKG=./docs/...

Also I'm noticing it's blazing fast. I do not see any reason not to do that only when needed - during doc generation, instead of committing the generated files.

benesch commented 6 years ago

I don't think those BNF files are ever used, actually. They exist solely to provide human-readable diffs of something approximating the railroad diagrams during review of SQL grammar PRs.

I'm with you, @knz: I hate checking in generated files. Showing diffs of generated files should be solved with better review tooling, not by checking them into source control. That said, checking them into source control solves a real problem and works today, and I don't see any easy way to teach Reviewable and GitHub to display the generated diffs. I think everyone who set this up (me, @dt, @mjibson) is very open to a better solution, though.

In the meantime we should streamline this process. At the very least, make generate should run on ./docs/... by default.

knz commented 6 years ago

How would you feel about a pre-commit hook that populates the diff into the commit message, so it shows up clearly during reviews?

benesch commented 6 years ago

I guess that would solve the conflicts-in-generated-files issue, but it would have rather adverse effects on the usefulness of git log.

knz commented 6 years ago

Perhaps we can "compress" the bnf changes to highlight the areas where the changes occur without copy-pasting the entire diff.

maddyblue commented 6 years ago

Closing because we now do this automatically during make.