julianhyde / quidem

Idempotent query executor
Apache License 2.0
50 stars 7 forks source link

Mask out results before comparing #79

Open julianhyde opened 1 year ago

julianhyde commented 1 year ago

It would be useful if Quidem had a way to mask out results. For example, the following test (from CALCITE-6021) calls the CURRENT_DATETIME function, which returns a value that has a particular format but is different every time:

!mask 6 "[0-9]" "x"
SELECT CURRENT_DATETIME as now;
+----------------------------+
| now                        |
+----------------------------+
| 2016-05-19T10:38:47.046465 |
+----------------------------+
(1 row)

!ok

The test would succeed because '2016-05-19T10:38:47.046465' and '2023-11-11T16:38:47.012345' would both be converted to 'xxxx-xx-xxTxx:xx:xx.xxxxxx'.

The 6 means 'the next 6 lines'. The next argument is a Java regex. The following argument is a Java replacement string. As usual in Java regex, you can include '^' and '$' in the regex if you want to match the whole string.

tanclary commented 1 year ago

This sounds like a good idea, and would make tests like the ones in CALCITE-6021 more direct. The only question I have is how we could ensure things like timezones correctly meet expectations? Granted that the current tests don't do this either..

Maybe we could do something to check whether two tests have the appropriate amount of time between them?

For instance:

SELECT CURRENT_DATETIME('UTC') and SELECT CURRENT_DATETIME('America/Los_Angeles') should always return results that are different by 7 hours? Not sure if this is helpful but maybe there is a way to mask this as well?

julianhyde commented 1 year ago

Quidem's sweet spot is always going to be things that you can check using simple comparisons (think grep, diff, sed regular expressions). If you need a Turing-complete programming language to check the assertions, Quidem will be out of its league.

You could of course do SELECT CURRENT_DATETIME('UTC') - CURRENT_DATETIME('America/Los_Angeles') and make sure that the result is 7 or 8 hours.

julianhyde commented 1 year ago

Another area we should expand in Quidem is running queries with particular session parameters, e.g. set time zone to Pacific time. But that should be a different issue.