GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0
385 stars 108 forks source link

Error: Syntax error at or near "table_name" #1165

Open GabrieleCalarota opened 1 month ago

GabrieleCalarota commented 1 month ago

Hi 👋🏻

I'm running DVT from CLI on PostreSQL (v13.10) deployed on 2 GCP instances under private IPs with SSL enabled with config file like:

result_handler:
  project_id: my_project_id
  table_id: pso_data_validator.results
  type: BigQuery
source: source_conn
target: target_conn
validations:
- aggregates:
  - field_alias: count
    source_column: null
    target_column: null
    type: count
  filter_status: null
  filters: []
  format: table
  labels: []
  random_row_batch_size: null
  schema_name: public
  table_name: my_table_41
  target_schema_name: public
  target_table_name: my_table_41
  threshold: 0.0
  type: Column
  use_random_rows: false
  ...

I've successfully runner the tool for all tables except this one that I get this stacktrace (with --verbose and -ll DEBUG)


06/10/2024 12:23:23 PM-INFO: Currently running the validation for YAML file: dvt_parallel/generated_validations_chunk_207.yaml
06/10/2024 12:23:27 PM-DEBUG: Checking None for explicit credentials as part of auth process...
06/10/2024 12:23:27 PM-DEBUG: Checking Cloud SDK credentials as part of auth process...
/Users/gabrielecalarota/.pyenv/versions/3.11.5/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py:422: SAWarning: Can't validate argument 'dialect_options'; can't locate any SQLAlchemy dialect named 'dialect'
  table = sa.Table(
/Users/gabrielecalarota/.pyenv/versions/3.11.5/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py:442: SAWarning: Omitting index key for (user_id, submitted_at, task_id), key covers omitted columns.
  self.inspector.reflect_table(table, table.columns)
/Users/gabrielecalarota/.pyenv/versions/3.11.5/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py:442: SAWarning: Can't validate argument 'dialect_options'; can't locate any SQLAlchemy dialect named 'dialect'
  self.inspector.reflect_table(table, table.columns)
06/10/2024 12:23:34 PM-ERROR: Error (psycopg2.errors.SyntaxError) syntax error at or near "tasks_20"
LINE 1: ...EMPORARY VIEW "11f7f2473f694bc8880251be1a50d21e" AS tasks_20
                                                               ^

[SQL: CREATE TEMPORARY VIEW "11f7f2473f694bc8880251be1a50d21e" AS tasks_20]
(Background on this error at: https://sqlalche.me/e/14/f405) occurred while running config file dvt_parallel/generated_validations_chunk_207.yaml. Skipping it for now.

I get the same error on the same table if I try with the non-partitioned table, what can it be?

nehanene15 commented 1 month ago

The error looks like it's trying to do a custom query validation, but your YAML config is a simple column count validation. What type of validation are you running?

It's hitting an error here in the _metadata() function which is used to get the data types from a custom query. Your error indicates that the custom query it is getting is "tasks_20" instead of a query like "select a,b,c, from ..."

Can you check what custom query you are providing as the SQL?