pganalyze / pg_query.rs

Rust library to parse, deparse and normalize SQL queries using the PostgreSQL query parser
MIT License

Add recursive iter() on ParseResult and NodeRef #44

Open absporl opened 2 months ago

absporl commented 2 months ago

This allows walking over all nodes in the AST, instead of just a limited subset as in the current nodes() function. See https://github.com/pganalyze/pg_query.rs/issues/31

The implementation uses static code generation in build.rs. The protobuf definitions are parsed and a graph of all Message types is constructed. Every NodeRef type gets an unpack() function that recursively calls unpack() on all relevant fields (i.e., fields that have a Node type, or whose type eventually contains a Node-typed field).
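To illustrate the graph step described above, here is a minimal sketch of how a build script might decide which message types are "relevant", i.e. can eventually contain a Node. It is not the code from this PR: the function name `types_reaching_node` and the map-of-field-types input are assumptions made for the example, and a real build.rs would derive that map from the parsed protobuf descriptors.

```rust
use std::collections::{HashMap, HashSet};

/// Returns the names of all message types that are `Node` itself or that
/// (transitively) have a field whose type can contain a `Node`.
///
/// `fields` maps a message type name to the type names of its fields.
/// Hypothetical helper for illustration; not part of pg_query.rs.
fn types_reaching_node(fields: &HashMap<&str, Vec<&str>>) -> HashSet<String> {
    let mut reaching: HashSet<String> = HashSet::new();
    reaching.insert("Node".to_string());

    // Iterate to a fixed point: a type reaches Node as soon as
    // at least one of its field types is already known to reach Node.
    loop {
        let mut changed = false;
        for (name, field_types) in fields {
            if reaching.contains(*name) {
                continue;
            }
            if field_types.iter().any(|t| reaching.contains(*t)) {
                reaching.insert((*name).to_string());
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }
    reaching
}
```

Only fields whose type ends up in the returned set would need a recursive unpack() call in the generated code. Scalar-only messages fall outside the set and can be skipped.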

The result is guaranteed to visit all nodes. The code generation mechanism may also be useful for replacing parts of the codebase that are currently hardcoded by hand.
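As a rough picture of what the generated traversal amounts to, the sketch below hand-writes a recursive unpack() and an iter() for a toy AST. The `Node` and `Select` types here are stand-ins invented for the example and are much smaller than the crate's real generated types. In the PR the equivalent match arms would be emitted by build.rs rather than written by hand.

```rust
/// Toy stand-in for the generated node enum; not the crate's real types.
#[derive(Debug)]
enum Node {
    Select(Select),
    Column(String),
    BoolExpr(Vec<Node>),
}

/// Toy statement with two kinds of child fields.
#[derive(Debug)]
struct Select {
    targets: Vec<Node>,
    where_clause: Option<Box<Node>>,
}

impl Node {
    /// Pushes this node and, recursively, every node reachable from it.
    fn unpack<'a>(&'a self, out: &mut Vec<&'a Node>) {
        out.push(self);
        match self {
            Node::Select(s) => {
                for t in &s.targets {
                    t.unpack(out);
                }
                if let Some(w) = &s.where_clause {
                    w.unpack(out);
                }
            }
            Node::BoolExpr(args) => {
                for a in args {
                    a.unpack(out);
                }
            }
            // Leaf node: no child fields to descend into.
            Node::Column(_) => {}
        }
    }

    /// Depth-first, pre-order iterator over this node and all descendants.
    fn iter<'a>(&'a self) -> impl Iterator<Item = &'a Node> + 'a {
        let mut out = Vec::new();
        self.unpack(&mut out);
        out.into_iter()
    }
}
```

For a `Select` with two column targets and a `BoolExpr` over two more columns in its WHERE clause, `iter()` yields six nodes: the statement, its two targets, the boolean expression, and that expression's two arguments. Collecting into a Vec up front keeps the sketch short. A lazy, stack-based iterator would avoid the allocation at the cost of more code.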

This PR adds prost, prost-types and heck to the build dependencies, and updates the prost dependency version.

psteinroe commented 1 month ago

Hey @absporl, I was just about to start working on the same thing. Why did you choose this implementation?

I am using proc macros with great success (e.g. here) and wanted to upstream the same approach to pg_query. Do you see any benefits of one over the other?

My idea was to essentially generate most of the crate via proc macros, including enum variants etc.

absporl commented 1 month ago

Hi @psteinroe, I like your approach! It looks a lot nicer and cleaner than what I wrote. I'm not super familiar with Rust macros and only briefly considered them. I chose the current approach because it's close to what prost itself does, and because I wanted to make sure it would not affect compilation time for users of this crate. But I guess the latter might also be true for macros: are they only evaluated when you compile the crate itself, and not when you import a crate that uses them?

I did have an earlier attempt where I tried to use prost-reflect to get inspectable protobuf messages at runtime and walk those, but that turned out to be quite complicated, and I also wanted to avoid the runtime performance cost.