DeepLcom / sql-mock

A Python library to test your SQL models using mocked input data
MIT License
34 stars 5 forks source link

When a table is not expressed as alias the referencing isn't correct later #43

Closed fgarzadeleon closed 9 months ago

fgarzadeleon commented 9 months ago

Using a simple query:

QUERY_SIMPLE = """
SELECT col1
FROM bee as b 
JOIN a ON a.col1 = b.col1
"""

With models:

@table_meta(table_ref="a")
class aMock(ClickHouseTableMock):
    col1 = col.String(default="1")

@table_meta(table_ref="bee")
class bMock(ClickHouseTableMock):
    col1 = col.String(default="1")

@table_meta(query=QUERY_SIMPLE)
class BugTableMock(ClickHouseTableMock):
    col1 = col.String(default="1")
and test:
    def test_working():
    input_table_mock_1 = aMock()
    input_table_mock_2 = bMock()

    res = BugTableMock.from_mocks(input_data=[
        input_table_mock_1, input_table_mock_2])

    expected = [{'col1': '1'}]
    res.assert_equal(expected)
Yields error ` Cannot get JOIN keys from JOIN ON section: 'a.col1 = col1', found keys: [Left keys: [] Right keys [] Condition columns: 'equals(a.col1, col1)', '']. (INVALID_JOIN_ON_EXPRESSION) (version 23.8.4.69 (official build))`

Internal sql_mock cte is:
    WITH sql_mock__a AS (
  SELECT
    CAST('1' AS String) AS col1
  FROM (
    SELECT
      1
  )
  WHERE
    FALSE
), sql_mock__bee AS (
  SELECT
    CAST('1' AS String) AS col1
  FROM (
    SELECT
      1
  )
  WHERE
    FALSE
), result AS (
  SELECT
    col1
  FROM sql_mock__bee AS b /* bee */
  JOIN sql_mock__a /* a */
    ON a.col1 = b.col1
)
SELECT
  CAST(col1 AS String) AS col1
FROM result

Because table a does not have a AS a this fails. Because the table has now been replaced with slq_mock__a but it is still being referenced as a in a.col1.

thecasper2 commented 9 months ago

As far as I can tell this helper function is only designed to replace the table reference and not prefixed references to columns in that table, which wouldn't be necessary when using an alias anyway.

I can't readily think of a solution to this one, ideas welcome!

fgarzadeleon commented 9 months ago

Easiest would be to add documentation to always alias your tables but a fix that considers this in the package would be best.

Somtom commented 9 months ago

Thanks for filing that bug @fgarzadeleon! @thecasper2 I can have a look the upcoming days. I think the library we use (sqlglot) should be able to help us out there.

Somtom commented 9 months ago

@thecasper2 I got some response from Tobiko Data (the maintainers of sqlglot). Seems that replace_table is not a good tool for this type of job. They mentioned that the scope module could be interesting (there is also some explanation here)

Somtom commented 9 months ago

@thecasper2 @fgarzadeleon I might have found a solution: https://github.com/DeepLcom/sql-mock/pull/44