datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0

fix(ingest/sql): disable patch checker #11910

Closed hsheth2 closed 1 day ago

hsheth2 commented 1 day ago

When the pytest module is loaded, we would run sanity checks to validate that our patcher behaved properly. This was intended for testing and debugging only.
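
A minimal sketch of this kind of gate, assuming the check was keyed on whether `pytest` appears in `sys.modules` (the description does not show the actual condition). The function names and the `PATCH_CHECKER_ENABLED` variable are hypothetical and used only for illustration; the explicit opt-in is one possible alternative, not necessarily what this PR does.

```python
import os
import sys


def pytest_is_loaded(modules=None):
    """Return True if pytest has been imported anywhere in this process."""
    # Defaults to the live module table; a dict can be passed for testing.
    modules = sys.modules if modules is None else modules
    return "pytest" in modules


def checks_explicitly_enabled(env=None):
    """Hypothetical opt-in: run checks only when a variable is set to "1"."""
    # PATCH_CHECKER_ENABLED is an assumed name, not a real DataHub setting.
    env = os.environ if env is None else env
    return env.get("PATCH_CHECKER_ENABLED") == "1"
```

The first predicate becomes true as soon as any dependency imports pytest, whether or not tests are running. The second is true only when someone sets the variable deliberately.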

However, there are environments where pytest is loaded during normal execution. Airflow and Dagster environments are particular examples, but there are likely others as well. In those environments, we'd see an error like this:

FileNotFoundError: [Errno 2] No such file or directory: 'patch'
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/dagster/_core/errors.py", line 287, in user_code_error_boundary
    yield
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/dagster/_grpc/server.py", line 255, in __init__
    loadable_targets = get_loadable_targets(
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/dagster/_grpc/utils.py", line 60, in get_loadable_targets
    else loadable_targets_from_python_package(package_name, working_directory)
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/dagster/_core/workspace/autodiscovery.py", line 44, in loadable_targets_from_python_package
    module = load_python_module(
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/dagster/_core/code_pointer.py", line 134, in load_python_module
    return importlib.import_module(module_name)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/venvs/e9921b809296/lib/python3.8/site-packages/working_directory/root/./assets_modern_data_stack/__init__.py", line 14, in <module>
    from datahub_dagster_plugin.client.dagster_generator import DatahubDagsterSourceConfig
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/datahub_dagster_plugin/client/dagster_generator.py", line 16, in <module>
    from datahub.sql_parsing.sqlglot_utils import get_query_fingerprint
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/datahub/sql_parsing/sqlglot_utils.py", line 1, in <module>
    from datahub.sql_parsing._sqlglot_patch import SQLGLOT_PATCHED
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/datahub/sql_parsing/_sqlglot_patch.py", line 210, in <module>
    _patch_deepcopy()
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/datahub/sql_parsing/_sqlglot_patch.py", line 54, in _patch_deepcopy
    patchy.patch(
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/patchy/api.py", line 38, in patch
    _do_patch(func, patch_text, forwards=True)
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/patchy/api.py", line 103, in _do_patch
    new_source = _apply_patch(source, patch_text, forwards, func.__name__)
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/datahub/sql_parsing/_sqlglot_patch.py", line 36, in _new_apply_patch
    result_subprocess = _apply_diff_subprocess(source, patch_text, forwards, name)
  File "/venvs/1407f5f85ba1/lib/python3.8/site-packages/patchy/api.py", line 141, in _apply_patch
    result = subprocess.run(command, capture_output=True, text=True)
  File "/usr/local/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.8/subprocess.py", line 1720, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
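
The traceback above ends inside `subprocess.run`, which raises `FileNotFoundError` when the executable it is asked to launch (here, `patch`) cannot be found on the `PATH`. The sketch below reproduces that behaviour in isolation and shows one way to test for the executable first with `shutil.which`. The helper names are hypothetical and are not part of DataHub or patchy.

```python
import shutil
import subprocess


def missing_executable_error(command):
    """Return the FileNotFoundError raised when command[0] is absent, else None."""
    try:
        subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError as exc:
        return exc
    return None


def run_if_available(command):
    """Run the command only if its executable is on PATH; otherwise return None."""
    # shutil.which returns None when the executable cannot be located.
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, capture_output=True, text=True)
```

For example, `run_if_available(["patch", "--version"])` would return `None` rather than raise in an image that does not ship the `patch` binary.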

Checklist