Closed kgutwin closed 3 days ago
Great feedback! I put together a small test and renamed the flag to `--ir-type`.
As a related question (towards the underlying purpose of this PR), I am wondering what you think about adding automated code generation to this repo, to keep type hints for the bindings in sync with the gradual evolution of PRQL's IRs. In my main project, we use pre-commit hooks to re-run client code generation whenever our backend's OpenAPI spec is updated. This works really well, as the hooks ensure that a commit with changes to the backend also includes the corresponding changes to the frontend and other client libraries.
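If PRQL were to adopt the same general practice, it would involve adding one or two pre-commit hooks:
1. Optionally, pre-commit could run `prqlc debug json-schema` any time there's a change to the IR definition, and the output could be stored in the repo. This would allow casual browsers of the repository to read the JSON Schema without needing to run the tool, and/or the schema could be included in the documentation/web site. But if that doesn't sound useful, this step can be bundled together with the next one.
2. Pre-commit would also then run any appropriate code generation tools (I'm currently looking at datamodel-code-generator for Python, for example) whenever the schema changes. Code generation from schema is valuable because it imposes no extra runtime overhead for bindings, and it stays in lockstep with versioning, which is important for an actively evolving project like this one.

If you think this approach would work for PRQL, I can put it together as a PR for review. If you have other opinions or thoughts, I'm happy to accommodate. Thanks!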
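As a rough illustration of the "store the schema in the repo and regenerate it on change" idea, here is a minimal Python sketch of a script a local pre-commit hook could call. It is only a sketch under assumptions: the `--ir-type pl` invocation follows the flag rename mentioned in this thread, and the function names (`generate_schema`, `sync_schema`) are hypothetical, not anything that exists in PRQL.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable

# Assumption: flag name and value follow the rename discussed in this thread.
SCHEMA_COMMAND = ["prqlc", "debug", "json-schema", "--ir-type", "pl"]


def generate_schema() -> str:
    """Run the schema command and return its stdout (needs prqlc on PATH)."""
    result = subprocess.run(
        SCHEMA_COMMAND, check=True, capture_output=True, text=True
    )
    return result.stdout


def sync_schema(path: Path, generate: Callable[[], str] = generate_schema) -> int:
    """Rewrite `path` if it differs from the freshly generated schema.

    Returns 1 when the file was created or changed, so that a pre-commit
    hook fails and the contributor stages the update; returns 0 when the
    stored file was already up to date.
    """
    fresh = generate()
    current = path.read_text() if path.exists() else None
    if current == fresh:
        return 0
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(fresh)
    print(f"updated {path}; stage it and commit again", file=sys.stderr)
    return 1
```

A hook entry point could then do something like `sys.exit(sync_schema(Path("schemas/pl.schema.json")))`, where the path is a hypothetical location for the stored schema. Passing the generator in as a callable keeps the comparison logic testable without building `prqlc`.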
(forgive the delay; just got back from vacation)
> Optionally, pre-commit could run `prqlc debug json-schema` any time there's a change to the IR definition, and the output could be stored in the repo. This would allow casual browsers of the repository to read the JSON Schema without needing to run the tool, and/or the schema could be included in the documentation/web site. But if that doesn't sound useful, this step can be bundled together with the next one.

Storing them in the repo is a great idea (one of the reasons we like snapshots a lot!). `pre-commit` is a clever way of doing that. One constraint is that `cargo run -p prqlc -- debug json-schema` won't run in `pre-commit-ci`, because that doesn't have internet access, which is required for building the crates. So we could instead:
- `.snap` file with a header (or until we find a solution to https://github.com/mitsuhiko/insta/issues/353#issuecomment-1464986801 over at insta)
- `pre-commit` in standard GHA CI

> Pre-commit would also then run any appropriate code generation tools (I'm currently looking at datamodel-code-generator for Python, for example) whenever the schema changes. Code generation from schema is valuable because it imposes no extra runtime overhead for bindings, and it stays in lockstep with versioning, which is important for an actively evolving project like this one.

This sounds interesting. I don't have much experience with these, so I don't know how well they work. IIUC most folks using bindings aren't going deep in the AST (vs. just compiling to SQL), but if this would be helpful and the tools are good (in particular, if they don't impose a cost on those just compiling to SQL), I would be very open to it.
Thanks @kgutwin !
This PR adds the command `prqlc debug json-schema --schema-type TYPE`. When run, this dumps a JSON Schema document for the provided type (currently `pl`, `rq`, and `lineage`).
Example:
The longer-term goal behind adding this as a feature is to find a way to auto-generate type hints for library integrations (Python, TypeScript, etc.). However, this may also be useful as a debugging or documentation tool.
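To make the type-hint goal a bit more concrete, here is a toy Python sketch of turning a JSON Schema object into `TypedDict` source. It is not how datamodel-code-generator or any real tool works, it handles only a tiny subset of JSON Schema, and the `EXAMPLE` schema is a made-up fragment for illustration, not actual `prqlc` output.

```python
from typing import Any

# Mapping from JSON Schema primitive types to Python annotations.
PRIMITIVES = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "null": "None",
}


def annotation(schema: dict[str, Any]) -> str:
    """Return a Python annotation for a very small subset of JSON Schema."""
    if "$ref" in schema:
        # "#/definitions/Span" -> "'Span'" (a quoted forward reference)
        return repr(schema["$ref"].rsplit("/", 1)[-1])
    kind = schema.get("type")
    if kind == "array":
        return f"list[{annotation(schema.get('items', {}))}]"
    # Anything not understood (unions, missing type, ...) falls back to Any.
    return PRIMITIVES.get(kind, "Any") if isinstance(kind, str) else "Any"


def typed_dict_source(name: str, schema: dict[str, Any]) -> str:
    """Emit TypedDict source text for an object schema's properties."""
    required = set(schema.get("required", []))
    lines = [f"class {name}(TypedDict):"]
    for prop, sub in schema.get("properties", {}).items():
        hint = annotation(sub)
        if prop not in required:
            hint = f"NotRequired[{hint}]"
        lines.append(f"    {prop}: {hint}")
    if len(lines) == 1:
        lines.append("    pass")
    return "\n".join(lines)


# Hypothetical schema fragment, invented for this example.
EXAMPLE = {
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "columns": {"type": "array", "items": {"type": "string"}},
        "source": {"$ref": "#/definitions/Span"},
    },
}

print(typed_dict_source("Node", EXAMPLE))
```

Because the output is plain source text, generated hints like these add no runtime dependency to the bindings, which is the property the discussion above cares about.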