pganalyze / pg_query_go

Go library to parse and normalize SQL queries using the PostgreSQL query parser
BSD 3-Clause "New" or "Revised" License
663 stars 79 forks source link

Get normalized query args #91

Open alextanhongpin opened 1 year ago

alextanhongpin commented 1 year ago

Currently, I'm using pg_query.Normalize to get a normalized query. However, there are no functionality that allows extracting all the args. Is it possible to have that in future release?

lfittl commented 1 year ago

Hi @alextanhongpin Thanks for opening this issue!

When you say extracting all the args - what exactly are you referring to? (a specific example of how you'd like this to work would be helpful)

alextanhongpin commented 1 year ago

Here's an example query:

-- Query
SELECT
  *
FROM
  users
WHERE
  email = 'john.doe@mail.com'
  AND deleted_at IS NULL
  AND last_logged_in_at > $1
  AND created_at IN ($2,)
  AND description = e'foo bar walks in a bar, h\'a'
  AND subscription IN ('freemium', 'premium')
  AND age > 13
  AND is_active = true
  AND name LIKE ANY ('{Foo,bar,%oo%}')
  AND id != ALL (ARRAY[1, 2]);

And this is what I execute to get the normalized query:

    norm, err = pg_query.Normalize(query)
    if err != nil {
        return
    }

I will get the following output:

-- Query Normalized
SELECT
  *
FROM
  users
WHERE
  email = $3
  AND deleted_at IS NULL
  AND last_logged_in_at > $1
  AND created_at IN ($2,)
  AND description = $4
  AND subscription IN ($5, $6)
  AND age > $7
  AND is_active = $8
  AND name LIKE ANY ($9)
  AND id != ALL (ARRAY[$10, $11]);

All the hardcoded values in the SQL has been replaced with the dollar placeholder. I want to be able to extract the values as such:

-- Args
{
 "$1": "2023-06-27",
 "$10": 1,
 "$11": 2,
 "$3": "john.doe@mail.com",
 "$4": "foo bar walks in a bar, h''a",
 "$5": "freemium",
 "$6": "premium",
 "$7": 13,
 "$8": true,
 "$9": "{Foo,bar,%oo%}"
}
lfittl commented 1 year ago

Thanks for clarifying!

That is not a feature that exists today, but you could probably adjust the logic in the underlying C code to save the extracted parameters into a separate data struct. See https://github.com/pganalyze/libpg_query/blob/15-latest/src/pg_query_normalize.c#L275 for the relevant code piece.

We don't have any immediate plans to add this functionality ourselves, but if you're interested in this and are willing to dive into the C code, PRs would be welcome on libpg_query (I would make this an opt-in feature via a flag on the normalize function, since its a special use case that probably adds a slight amount of overhead for large queries).