Closed lonely7345 closed 4 years ago
The lineage feature in the provided VM image does not work properly. What should I do?
The lineage job in the VM only scans job executions from the past 30 days. To see the lineage demo, go to Azkaban or HUE and launch some Pig, M/R, or Hive jobs; the lineage will then show up.
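We are also looking for a tool that parses data lineage graphs. Our Hive SQL is fairly complex: a single Hive file contains N SQL statements, along with various special characters, jar references, and escaping. How well is this supported, and is there a corresponding demo? We are really looking forward to this project.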
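@ranqiqiang Hello, I am also learning WhereHows at the moment. Did you solve the problem of extracting lineage data? My current WhereHows setup can extract metadata from Hive, Oracle, and els, but it cannot extract any lineage information, which is a real headache.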
@ericsun2 How can I get Oracle lineage? Does "please go to Azkaban or HUE to launch some Pig, M/R or Hive jobs" mean that WhereHows does not support Oracle lineage? I am currently using v1.0.0.
Starting 969dfb582e3c_wherehowsdocker_wherehows-mysql_1 ... error
ERROR: for 969dfb582e3c_wherehowsdocker_wherehows-mysql_1 Cannot start service wherehows-mysql: driver failed programming external connectivity on endpoint 969dfb582e3c_wherehowsdocker_wherehows-mysql_1 (d53a5b90e22403094cbf7f13a27f62c784a43f486901ea242348bf7eba6cb7ec): Error starting userland proxy: Bind for 0.0.0.0:3306 failed: port is already allocated
ERROR: for wherehows-mysql Cannot start service wherehows-mysql: driver failed programming external connectivity on endpoint 969dfb582e3c_wherehowsdocker_wherehows-mysql_1 (d53a5b90e22403094cbf7f13a27f62c784a43f486901ea242348bf7eba6cb7ec): Error starting userland proxy: Bind for 0.0.0.0:3306 failed: port is already allocated
ERROR: Encountered errors while bringing up the project.
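The "port is already allocated" message above means something on the host is already bound to port 3306, typically a local MySQL server or another container. Before retrying, one quick way to confirm this is to try binding the port yourself. The following is a minimal sketch, not part of WhereHows; the helper name `port_is_free` is hypothetical.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper for illustration only. A False result means some
    other process (or container port mapping) already holds the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Bind failed: the address is in use or not permitted.
            return False
        return True


if __name__ == "__main__":
    # 3306 is the port named in the Docker error above.
    print("3306 free:", port_is_free(3306))
```

If this reports that the port is taken, either stop whatever is holding it or change the host side of the port mapping for the wherehows-mysql service before starting the project again.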
Is anyone using the lineage that ships with Hadoop? With some modification it could be used as well.
@diaowenyang Extracting the metadata is enough. For lineage, we link tables through their jobs.
Dear issue owner,
Thanks for your interest in WhereHows. We have recently announced DataHub which is the rebranding of WhereHows. LinkedIn improved the architecture of WhereHows and rebranded WhereHows into DataHub and replaced its metadata infrastructure in this direction. DataHub is a more advanced and improved metadata management product compared to WhereHows.
Unfortunately, we have to stop supporting WhereHows to better focus on DataHub and offer more help to DataHub users. Therefore, we will drop all issues related to WhereHows and will not accept any contribution for it. Active development for DataHub has already started on the datahub branch and will continue to live there until it is finally merged to master and the project is renamed to DataHub.
Please check the datahub branch to get familiar with DataHub.
Best, DataHub team
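Apologies for writing directly in Chinese; there is a lot to say and English would be slow for me. The authors are all Chinese, though, so they should be able to read it.
I am very glad to see this project. We currently lack metadata management. There was a similar project when I was at JD.com, but since joining my new company I have been looking for an open-source project of this kind and had not found one.
One part is discovering and syncing Hive metadata to build a metadata knowledge base, including change history, field descriptions, Q&A, and global search. The other is linking jobs to tables through the data warehouse's scheduling system to build metadata lineage. This is exactly what we need.
Our biggest difficulty right now is that analysts do not know which table or field to use, and when data is modified, it is unclear which other related tables are affected.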
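We use Cloudera's full stack: CDH, Oozie, HUE, Hive, and Sqoop. We have also been researching how to build table dependencies from Oozie input paths. For a Sqoop action, we establish the mapping to the relational database, then monitor the relational database tables and raise an alert when they change. For a Hive action, we establish the relationships between data warehouse tables. For our own data-push datachange action, we establish the mapping between warehouse tables and the target system. Having these relationships clearly visible would be a great help to the whole system. Looking at WhereHows, I found that many of the ideas are the same.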
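To make the idea concrete, here is a minimal sketch of how per-action source and target pairs could be turned into a dependency graph and queried for impact analysis. It is not WhereHows code and does not parse real Oozie workflows; the `Action` record, the function names, and the table names are all hypothetical, and it assumes each action has already been reduced to a single source and a single target.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One scheduler action, reduced to a source and a target (assumed shape)."""
    kind: str    # "sqoop", "hive", or "datachange"
    source: str  # RDBMS table or warehouse table
    target: str  # warehouse table or table in the target system


def build_lineage(actions):
    """Build a directed graph mapping each source to its direct targets."""
    edges = defaultdict(set)
    for action in actions:
        edges[action.source].add(action.target)
    return edges


def downstream(edges, start):
    """Return every table reachable from `start`, i.e. what a change may affect."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


if __name__ == "__main__":
    # Made-up example data covering the three action types described above.
    actions = [
        Action("sqoop", "mysql.orders", "dw.ods_orders"),
        Action("hive", "dw.ods_orders", "dw.fact_orders"),
        Action("datachange", "dw.fact_orders", "crm.orders_export"),
    ]
    edges = build_lineage(actions)
    print(sorted(downstream(edges, "mysql.orders")))
```

With a graph like this, an alert on a changed relational table can list every warehouse table and target system that depends on it.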
Thanks again to the authors. I would very much like to take part in this project!