GoogleCloudPlatform / zetasql-toolkit

The ZetaSQL Toolkit is a library that helps users use ZetaSQL Java API to perform SQL analysis for multiple query engines, including BigQuery and Cloud Spanner.
Apache License 2.0
39 stars 10 forks source link

Remove quoting check for TVFs and Procedures during analysis #79

Open ppaglilla opened 4 months ago

ppaglilla commented 4 months ago

In the past, we needed procedure and TVF references to be fully quoted in a query for catalog resolution to work. To that end, we added this check to the analyzer that would throw an exception if it found a procedure or TVF which wasn't fully quoted.

Since then, we've updated the analyzer to rewrite queries and fully quote all resources before analysis. This makes the original check redundant and it should be removed.

TunahanOcal commented 4 months ago

Hello, I have a comment for this part. Sometimes rewritten queries are quoted incorrectly. Because of this, the analyzer throws a syntax error. Part of Original Query:

  round(sam.amountt, 2) as refundCashAmount,
  round(sam.amountt, 2) as price,
  ifnull(stl.commission_rate, mc.commission_rate) as commission,

Same part of Rewritten Query.

  r`round`sam.amountt, 2) as refundCashAmount,
  r`round`sam.amountt, 2) as price,
  i`ifnull`stl.commission_rate, mc.commission_rate) as commission,

I debugged the code, but I could not understand the reason of this bug. I think it is writing the quoted string to the wrong position. But I am not sure.

Thanks in advance.

ppaglilla commented 4 months ago

Thanks for pointing this out!

Can you share a full query that has this issue? I'm having trouble reproducing it

ppaglilla commented 4 months ago

Removed the problematic checks this issue refers to. I'm keeping the issue open to see about the quoting issue mentioned.

@TunahanOcal I haven't been able to reproduce the behavior you're mentioning. It'd be awesome if you can share a whole query with the problem and I can help debug it.

Here's an example I put together using your fragment that works as expected.

-- Original
WITH sam AS (
  SELECT 1 AS amountt,
  CAST(NULL AS FLOAT64) AS commission_rate1,
  0.05 commission_rate2
)
SELECT
  round(sam.amountt, 2) as refundCashAmount,
  round(sam.amountt, 2) as price,
  ifnull(sam.commission_rate1, sam.commission_rate2) as commission
FROM sam;

-- Requoted
WITH sam AS (
  SELECT 1 AS amountt,
  CAST(NULL AS FLOAT64) AS commission_rate1,
  0.05 commission_rate2
)
SELECT
  `round`(sam.amountt, 2) as refundCashAmount,
  `round`(sam.amountt, 2) as price,
  `ifnull`(sam.commission_rate1, sam.commission_rate2) as commission
FROM `sam`;
TunahanOcal commented 3 months ago

Hello again, I found the reason. Non-ASCII characters are the cause of this bug. In my case; When a query contains string literal that includes Turkish characters, parse location ranges are affected. In UTF-8 encoding, they require multi-byte representation. I'll try to find a workaround for this.