trailofbits / blight

A framework for instrumenting build tools
https://pypi.org/project/blight
Apache License 2.0

Actions: write a record-style action that emits `compile_commands.json` #43447

Open woodruffw opened 1 year ago

woodruffw commented 1 year ago

The JSON Compilation Database format is about as close as there is to a standard for recording each step in a compilation.

Blight currently has Record, which generates a blight-specific JSON(L) output format. We should also support compile_commands.json, probably with a separate action.
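
For reference, here is a minimal sketch of what a compilation database entry looks like when built from Python. The field names (`directory`, `file`, `arguments`) come from the JSON Compilation Database format itself; the helper names `make_entry` and `write_database` are hypothetical and are not part of blight.

```python
import json
from pathlib import Path
from typing import Dict, List, Union

Entry = Dict[str, Union[str, List[str]]]


def make_entry(directory: str, file: str, arguments: List[str]) -> Entry:
    """Build one compilation database entry (hypothetical helper)."""
    return {
        "directory": directory,  # working directory of the compilation
        "file": file,            # main translation unit being compiled
        "arguments": arguments,  # argv of the compile command, argv[0] included
    }


def write_database(entries: List[Entry], path: str) -> None:
    """Write all entries as a single JSON array, as the format requires."""
    Path(path).write_text(json.dumps(entries, indent=2) + "\n")
```

Note that the database is one JSON array rather than a stream of independent records, so the file has to be written (or rewritten) as a whole.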

cc @kumarak and @pgoodman as potential end users.

woodruffw commented 1 year ago

The only major "gotcha" here is that blight's architecture is stateless, meaning that each action runs once per wrapped tool invocation. That's why all of the current actions emit JSONL -- it's much faster and easier to append newline-separated records than it is to re-parse and update a JSON file on each step (particularly when steps happen in parallel).

That leaves two options:

pgoodman commented 1 year ago

Do you have any atomicity guarantees with JSONL? What prevents concurrent writes from being interleaved if a particular command JSON struct is sufficiently long?

woodruffw commented 1 year ago

> What prevents concurrent writes from being interleaved if a particular command JSON struct is sufficiently long?

Right now I have a little `flock_append` decorator that uses exclusive locking to get around any underlying concurrency:

https://github.com/trailofbits/blight/blob/92b3c190e848055239cb0d6cacfd0edab1f2608e/src/blight/util.py#L201-L218
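
The general technique can be sketched as follows. This is not the code at the link above, just a minimal illustration of lock-protected appends. It assumes a Unix-like system where `fcntl.flock` is available, and the names `locked_append` and `append_record` are hypothetical.

```python
import contextlib
import fcntl
import json


@contextlib.contextmanager
def locked_append(path):
    """Open `path` for appending while holding an exclusive advisory lock."""
    with open(path, "a") as io:
        # Blocks until no other cooperating process holds the lock.
        fcntl.flock(io, fcntl.LOCK_EX)
        try:
            yield io
        finally:
            # Flush buffered data to the OS before giving up the lock,
            # so the whole record lands in the file in one piece.
            io.flush()
            fcntl.flock(io, fcntl.LOCK_UN)


def append_record(path, record):
    """Append one JSON record as a single line (JSONL)."""
    with locked_append(path) as io:
        io.write(json.dumps(record) + "\n")
```

Because `flock` locks are advisory, this only prevents interleaving between writers that all take the lock. A process that appends without locking can still corrupt a record.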