Open heyromnivan opened 2 months ago
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
0.13.1
It does look like an issue to me as this makes Vertica basically incompatible with any other metadata source. Even though Vertica itself doesn't allow multiple databases, it still has a database concept and external tools (dbt, BI tools) are all designed to take db name into account when constructing urns.
The only workaround I found is to write a custom source that extends VerticaSource and overrides the get_identifier method.
from datahub.ingestion.api.decorators import config_class, platform_name
from datahub.ingestion.source.sql.vertica import VerticaSource, VerticaConfig
from vertica_sqlalchemy_dialect.base import VerticaInspector

@platform_name("Vertica")
@config_class(VerticaConfig)
# copy here all the decorators from the latest version of VerticaSource
class MyVerticaSource(VerticaSource):
    def get_identifier(self, *, schema: str, entity: str, inspector: VerticaInspector, **kwargs) -> str:
        # Prefix the default schema.entity name with the database name
        db_name = self.get_db_name(inspector)
        return f'{db_name}.{schema}.{entity}'
This can only be used with CLI ingestion, which cannot be scheduled or run through the DataHub UI, so it has to be automated with some external tool.
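For anyone who wants to automate this, one option is a small Python script that an external scheduler (cron, Airflow, etc.) can call. This is only a rough sketch: the module path my_sources.vertica_with_db, the host, the database name, the GMS URL and the environment variable names are all placeholders I made up, and it assumes the custom class is importable on the machine that runs it.

```python
import os
from typing import Any, Dict


def build_recipe(source_type: str, host_port: str, database: str, gms_url: str) -> Dict[str, Any]:
    """Build an ingestion recipe as a plain dict (same shape as a YAML recipe)."""
    return {
        "source": {
            # Fully qualified class name of the custom source (placeholder path).
            "type": source_type,
            "config": {
                "host_port": host_port,
                "database": database,
                # Credentials come from the environment; variable names are made up.
                "username": os.environ.get("VERTICA_USER", ""),
                "password": os.environ.get("VERTICA_PASSWORD", ""),
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": gms_url}},
    }


def run_ingestion(recipe: Dict[str, Any]) -> None:
    """Run the recipe programmatically; needs acryl-datahub installed."""
    # Imported here so the recipe can be built and inspected without DataHub.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()


recipe = build_recipe(
    source_type="my_sources.vertica_with_db.MyVerticaSource",  # hypothetical module
    host_port="vertica.example.internal:5433",  # placeholder host
    database="analytics",  # placeholder database
    gms_url="http://localhost:8080",  # placeholder DataHub GMS endpoint
)
# run_ingestion(recipe)  # uncomment where DataHub and Vertica are reachable
```

The same dict can also be written as a YAML recipe and passed to the datahub CLI; the script form is just easier to call from a scheduler.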
Vertica URNs don't contain database name
In my case I'm trying to build joint lineage between Vertica and dbt, and the two don't connect. If I understand correctly, that is because tables described by dbt have URNs of the form urn:li:dataPlatform:vertica,dbname.schema.table, but tables ingested from Vertica have URNs of the form urn:li:dataPlatform:vertica,schema.table.

Originally posted by @heyromnivan in https://github.com/datahub-project/datahub/issues/5483#issuecomment-2079250670
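To make the mismatch concrete, here is a small standalone sketch that builds full dataset URNs by hand, without the DataHub library. The helper names and the analytics/public/orders values are made up for illustration, and the PROD environment is an assumption.

```python
from typing import Optional


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Format a DataHub dataset URN from platform, dataset name and environment."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def qualified_name(schema: str, table: str, database: Optional[str] = None) -> str:
    """Join the name parts with dots, including the database only when given."""
    parts = [database, schema, table] if database else [schema, table]
    return ".".join(parts)


# What dbt emits for a Vertica table: the database name is part of the URN.
dbt_side = dataset_urn("vertica", qualified_name("public", "orders", database="analytics"))

# What the Vertica source emits: schema.table only.
vertica_side = dataset_urn("vertica", qualified_name("public", "orders"))

print(dbt_side)
print(vertica_side)
print(dbt_side == vertica_side)  # False, so lineage edges never meet
```

The two strings refer to the same physical table but are different entities as far as DataHub is concerned, which would explain why the lineage does not connect.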