GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0

Tests: Add testing cases for `find-tables` feature to guarantee matching tables list #1043

Open helensilva14 opened 9 months ago

helensilva14 commented 9 months ago

Follow-up issue of #1017 and #1034 to create regression tests for this feature: https://github.com/GoogleCloudPlatform/professional-services-data-validator#building-matched-table-lists

helensilva14 commented 9 months ago

Unlike our validation workflow, this side feature only prints its result, and it is called directly from the main method here: https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/data_validation/__main__.py#L583

I don't know yet how we can perform this type of test. I started an attempt in test_oracle.py but couldn't finish it; it can serve as a reference for an implementation in a new branch.

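import runpy
import sys
from unittest import mock

# Import paths assumed from the names used below;
# mock_get_connection_config is defined elsewhere in test_oracle.py.
from data_validation import __main__ as main
from data_validation import cli_tools


@mock.patch(
    "data_validation.state_manager.StateManager.get_connection_config",
    new=mock_get_connection_config,
)
def test_find_tables_matching_table_list_with_bigquery(capsys):
    """Oracle to BigQuery regression test to make sure the matching table list feature is working as intended"""
    cli_args = [
        "find-tables",
        "-sc=mock-conn",
        "-tc=bq-conn",
        "--allowed-schemas=pso_data_validator",
    ]
    parser = cli_tools.configure_arg_parser()
    args = parser.parse_args(cli_args)
    config_managers = main.build_config_managers_from_args(args)
    assert len(config_managers) == 1

    # how to call the main method? that's how the feature is triggered, there's no specific class
    # previously reading: https://stackoverflow.com/a/75017499
    # One option (untested sketch): run __main__.py as a script with sys.argv patched,
    # then check that something was printed (capsys is the pytest stdout-capture fixture).
    with mock.patch.object(sys, "argv", ["data_validation", *cli_args]):
        runpy.run_path("data_validation/__main__.py", run_name="__main__")
    assert capsys.readouterr().out.strip()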
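
Since the feature only prints its result, one way to test it is to patch `sys.argv`, call the entry point in-process, and assert on the captured stdout. The sketch below is self-contained and does not touch the real code base: `fake_main`, `run_and_capture`, the table name, and the JSON keys are hypothetical stand-ins chosen for illustration, and the real output format of `find-tables` may differ.

```python
import contextlib
import io
import json
import sys
from unittest import mock


def fake_main():
    """Hypothetical stand-in for an entry point that only prints its result."""
    command = sys.argv[1]
    if command == "find-tables":
        # Assumed output shape: a JSON list of matched source/target tables.
        matches = [
            {
                "schema_name": "pso_data_validator",
                "table_name": "example_table",
                "target_schema_name": "pso_data_validator",
                "target_table_name": "example_table",
            }
        ]
        print(json.dumps(matches))


def run_and_capture(entry_point, argv):
    """Call entry_point with sys.argv patched and return whatever it printed."""
    buffer = io.StringIO()
    with mock.patch.object(sys, "argv", argv), contextlib.redirect_stdout(buffer):
        entry_point()
    return buffer.getvalue()


def test_find_tables_prints_matching_tables():
    output = run_and_capture(fake_main, ["data_validation", "find-tables"])
    tables = json.loads(output)
    assert isinstance(tables, list)
    assert tables[0]["table_name"] == tables[0]["target_table_name"]
```

In a real test, `fake_main` would presumably be replaced by the project's own main function, with the connection lookup still mocked as in the snippet above.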

nj1973 commented 1 month ago

Note that I added tests for Oracle and PostgreSQL as part of https://github.com/GoogleCloudPlatform/professional-services-data-validator/issues/1194; I hadn't seen this issue. At least we now have a head start.

Also note there is an issue with SQL Server find-tables: https://github.com/GoogleCloudPlatform/professional-services-data-validator/issues/1198

helensilva14 commented 1 month ago

No problem at all. Thanks a lot for your progress on this topic, Neil!