Closed sebaap closed 1 year ago
If you deployed datahub-actions using the helm chart you need to change the value of datahubDefaultCli
to 0.10.2(.1)
Yep you’ll need cli version 0.10.2.1 for the fix.
I have the similar error when I use redshift as source
Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/entrypoints.py", line 182, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
raise e
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
res = func(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
return func(ctx, *args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 187, in run
pipeline = Pipeline.create(
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 328, in create
return cls(
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 211, in __init__
with _add_init_error_context(
File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/tmp/datahub/ingest/venv-redshift-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 121, in _add_init_error_context
raise PipelineInitError(f"Failed to {step}: {e}") from e
datahub.ingestion.run.pipeline.PipelineInitError: Failed to find a registered source for type redshift: 'str' object is not callable
I use this command "datahub docker quickstart" to deploy datahub
DataHub version: v0.10.2 DataHub CLI version: 0.10.2.1 Python version: 3.8.15
@aNull404 looks like your datahubDefaultCli
is still set to 0.10.2 instead of 0.10.2.1. You can run with the newer version as per the docs here https://datahubproject.io/docs/ui-ingestion/#advanced-running-with-a-specific-cli-version
@aNull404 looks like your
datahubDefaultCli
is still set to 0.10.2 instead of 0.10.2.1. You can run with the newer version as per the docs here https://datahubproject.io/docs/ui-ingestion/#advanced-running-with-a-specific-cli-version
another error occured when I set cli to 0.10.2.1
Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/entrypoints.py", line 182, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
raise e
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
res = func(*args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
return func(ctx, *args, **kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
loop.run_until_complete(run_func_check_upgrade(pipeline))
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 158, in run_func_check_upgrade
ret = await the_one_future
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 149, in run_pipeline_async
return await loop.run_in_executor(
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
raise e
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
pipeline.run()
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 359, in run
for wu in itertools.islice(
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/utilities/source_helpers.py", line 104, in auto_stale_entity_removal
yield from stale_entity_removal_handler.gen_removed_entity_workunits()
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/source/state/stale_entity_removal_handler.py", line 267, in gen_removed_entity_workunits
last_checkpoint: Optional[Checkpoint] = self.source.get_last_checkpoint(
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/source/state/stateful_ingestion_base.py", line 320, in get_last_checkpoint
self.last_checkpoints[job_id] = self._get_last_checkpoint(
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/source/state/stateful_ingestion_base.py", line 295, in _get_last_checkpoint
self.ingestion_checkpointing_state_provider.get_latest_checkpoint(
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/source/state_provider/datahub_ingestion_checkpointing_provider.py", line 76, in get_latest_checkpoint
] = self.graph.get_latest_timeseries_value(
File "/tmp/datahub/ingest/venv-redshift-0.10.2.1/lib/python3.10/site-packages/datahub/ingestion/graph/client.py", line 299, in get_latest_timeseries_value
assert len(values) == 1
AssertionError
uncheck this option "Enable Stateful Ingestion", the ingesion can execute successful, but other unrelated databases/schemas were ingested to datahub at the same time
great, thanks everyone. Confirming that changing the default CLI value to 0.10.2.1 fixed the issue. For anyone having the same problem the value in the helm chart is global.datahub.managed_ingestion.defaultCliVersion
I'm facing this issue that was fixed last week https://github.com/datahub-project/datahub/issues/7855, but datahub-actions seems to still be using
acryl-datahub 0.10.0
.I couldn't find where this is being pinned. Happy to submit a PR if someone can point me to the right direction. Also, maybe good to relax the constrain to
~=
so patch versions get updated automatically?I'm using
acryldata/datahub-actions:v0.0.12