Open capoolebugchat opened 8 months ago
@capoolebugchat are you interested in picking up this issue?
@TeddyCr I would love to help.
@TeddyCr yes, sorry about the extra late reply
Thanks @capoolebugchat, I'll assign it to you then. We have some information about how to build a new connector here. Make sure to join our Slack workspace and the #contributor channel for any help.
@rogercezidio please check the other connectors here if you would like to contribute; we have many. 😊
Hi folks,
We did a simple implementation of a Dremio custom connector here: https://github.com/TIKI-Institut/openmetadata-dremio-connector. It can only scrape metadata. It has no support for query usage, profiling, etc. We didn't find a way to implement those for a custom connector. For lineage we are simply using dbt at the moment.
We would appreciate any feedback. Is it possible that this will be integrated into OpenMetadata?
Hey @wobu, we would recommend contributing the connector directly to the community. This will allow you to leverage the existing code base to implement support for usage, profiling, etc.
Here is a link with more information -> https://docs.open-metadata.org/latest/developers/contribute/developing-a-new-connector
We thought about it, and also tried it, but unfortunately setting up the OpenMetadata project on Windows wasn't easy :/ (WSL might be an option). So we decided to just start with a custom connector.
The CustomConnector is currently sufficient for us, so we won't provide a direct community integration in the near future, until our investment in OpenMetadata and Dremio increases.
I'm looking for a way for OMD to govern every tool's metadata and monitor an extra-compact data platform. This platform uses Dremio as its query execution engine, for scalability with bigger datasets and ease of use (it connects well to a lot of data sources). However, OMD has no Dremio connector for metadata extraction and monitoring.
Solution: A Dremio connector for OMD that can be easily configured through a minimal set of variables such as host:port and username:password. SSL is a nice add-on feature but is not essential for now.
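To make the requested configuration concrete, here is a rough sketch of the minimal settings such a connector might accept. Everything in it is hypothetical: `DremioConnectionConfig`, its fields, and the example host are made up for illustration and are not part of OpenMetadata or Dremio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DremioConnectionConfig:
    """Hypothetical settings object; not an OpenMetadata or Dremio API."""

    host: str
    port: int
    username: str
    password: str
    use_ssl: bool = False  # optional, as SSL is not essential for now

    @classmethod
    def from_strings(cls, host_port, credentials, use_ssl=False):
        """Build a config from 'host:port' and 'username:password' strings."""
        # Split on the last colon so the port is always the final segment.
        host, _, port = host_port.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected 'host:port', got {host_port!r}")
        # Split on the first colon so the password may itself contain colons.
        username, sep, password = credentials.partition(":")
        if not sep or not username:
            raise ValueError("expected credentials as 'username:password'")
        return cls(host, int(port), username, password, use_ssl)

    @property
    def base_url(self):
        """Base URL with a scheme that follows the SSL flag."""
        scheme = "https" if self.use_ssl else "http"
        return f"{scheme}://{self.host}:{self.port}"


# Example values are placeholders, not a real deployment.
cfg = DremioConnectionConfig.from_strings("dremio.example.internal:9047", "alice:s3cret")
print(cfg.base_url)
```

A real connector would presumably need more than this (authentication mode, which sources to include, and so on), but these two strings plus an SSL flag cover what is asked for here.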
Alternative: An external data cataloguing service like HiveDC, DynamoDB, Nessie (recommended by Dremio), etc. Both OMD and Dremio would use it as the metadata monitoring and tracking tool. However, this excludes Dremio from OMD and bloats the infra a bit (another data solution to take care of).
I'm new to this Cloud Data Engineering thing and a bit surprised at how limited Dremio is, though the engine is still quite powerful.