GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0
399 stars 114 forks source link

Column name not exported with bq-result-handler when using custom-query column #1167

Closed jrdetorre-google closed 2 months ago

jrdetorre-google commented 3 months ago

Hi We are in the middle of a TD2BQ migration and we found it's hard to check the validation results of column custom-query validations exported to BigQuery because it's not exporting, nor the source_table_name, neither the target_table_name as it does with a standard column validation.

Just an easy validation like this shows the problem:

data-validation validate custom-query column \ --source-query-file query.sql \ --target-query-file query.sql \ -sc BQ_CONN \ -tc BQ_CONN --bq-result-handler $PROJECT_ID.data_validator.results

As shown in the attached image, the column name is perfectly exported but not the table name and it makes it hard to track the validations. dvt_cq_results

Is it a bug or it has to be this way and I don't get the reason why? Thanks in advance. Regards

nehanene15 commented 3 months ago

Hi, this is not a bug - for regular validations (row/column), users provide a -tbls flag that takes in the source and target schema and table name which is then populated in the BigQuery results. For custom query validation, users can provide a complex query and we don't require the -tbls flag since the query itself references the table name.

Because we don't accept the -tbls flag, those columns aren't populated in the results. That would require the tool to parse through the custom SQL to find the table names.

The workaround here would be to add the table names as labels which would then be populated in BQ.

jrdetorre-google commented 3 months ago

Understood, thanks