datahub-project / datahub

The Metadata Platform for your Data Stack
https://datahubproject.io
Apache License 2.0

How to load Hive data into WhereHows with the master branch #734

Closed diaowenyang closed 6 years ago

diaowenyang commented 7 years ago

Hello. This problem has troubled me for a long time. I originally downloaded the master branch, and the build, page display, login, and so on all worked without problems, until I got stuck configuring the scheduled jobs. Following the requirements, I configured HIVE_METADATA_ETL.job:

job.class=metadata.etl.dataset.hive.HiveMetadataEtl
job.cron.expr=0/60 ? *
job.timeout=12000
job.ref.id=65

hive.metastore.jdbc.url=jdbc:mysql://127.0.0.1:3306/metastore
hive.metastore.jdbc.driver=com.mysql.jdbc.Driver
hive.metastore.username=hive
hive.metastore.password=hive
hive.database_black_list=your_databsae_black_list
hive.database_white_list=your_database_white_list

But I do not know how the following parameter values are obtained:

hive.schema_csv_file=/var/tmp/hive_schema.csv
hive.schema_json_file=/var/tmp/hive_schema.json
hive.field_metadata=/var/tmp/hive_field_metadata.csv
hive.hdfs_map_csv_file=/var/tmp/hive_hdfs_map.csv
hive.instance_csv_file=/var/tmp/hive_instance.csv
hive.dependency_csv_file=/var/tmp/hive_dependency.csv
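A minimal sketch for sanity-checking these values, assuming (the thread does not confirm this) that the six properties are paths of intermediate files that the ETL job itself writes, so the only thing to verify up front is that their parent directories exist and are writable. The function names `parse_job_properties` and `unwritable_paths` are hypothetical helpers, not part of WhereHows.

```python
import os

# Property names copied from the job file in this post. Treating them as
# intermediate output paths is an assumption, not a documented fact.
PATH_KEYS = [
    "hive.schema_csv_file",
    "hive.schema_json_file",
    "hive.field_metadata",
    "hive.hdfs_map_csv_file",
    "hive.instance_csv_file",
    "hive.dependency_csv_file",
]


def parse_job_properties(text):
    """Parse key=value lines into a dict, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as JDBC URLs stay intact.
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def unwritable_paths(props, keys=PATH_KEYS):
    """Return (key, path) pairs that are missing or whose parent directory is not writable."""
    problems = []
    for key in keys:
        path = props.get(key)
        if path is None:
            problems.append((key, None))
            continue
        parent = os.path.dirname(path) or "."
        if not (os.path.isdir(parent) and os.access(parent, os.W_OK)):
            problems.append((key, path))
    return problems
```

Calling `unwritable_paths(parse_job_properties(open("HIVE_METADATA_ETL.job").read()))` would return an empty list when every path property is present and its directory is writable.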

These are not specified on the ETL jobs configuration page. I found these three Python scripts:

https://github.com/linkedin/WhereHows/wiki/Hive-Dataset
./wherehows-etl/build/resources/main/jython/HiveExtract.py
./wherehows-etl/build/resources/main/jython/HiveLoad.py
./wherehows-etl/build/resources/main/jython/HiveTransform.py

I installed Jython for this and ran the following command:

java -jar $JYTHON/jython.jar ./wherehows-etl/src/main/resources/jython/HiveExtract.py

The command fails with:

Traceback (most recent call last):
  File "/home/wherehows/gitClone/WhereHows/wherehows-etl/src/main/resources/jython/HiveExtract.py", line 22, in <module>
    from org.slf4j import LoggerFactory
ImportError: No module named slf4j
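The `ImportError` means Jython could not find the slf4j classes on the Java classpath. When `java` is started with `-jar`, any other classpath setting is ignored, so only the classes inside `jython.jar` are visible. A minimal sketch of building a command that uses `-cp` with Jython's main class `org.python.util.jython` instead; the helper name `jython_command` and the jar paths in the usage note are hypothetical, and the script may need more than slf4j (other WhereHows dependencies, and arguments normally supplied by the ETL launcher) before it runs standalone.

```python
import os


def jython_command(jython_jar, script, extra_jars):
    """Build a java command line that runs `script` under Jython with extra jars on the classpath."""
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    classpath = os.pathsep.join([jython_jar] + list(extra_jars))
    # -cp plus the explicit main class replaces `-jar jython.jar`,
    # which would ignore every other classpath entry.
    return ["java", "-cp", classpath, "org.python.util.jython", script]
```

For example, `jython_command("/path/to/jython.jar", "HiveExtract.py", ["/path/to/slf4j-api.jar"])` returns a list that can be passed to `subprocess.call`.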

I had no idea how to resolve this problem, so I switched to branch v0.2.1. When I build it, it fails with:

[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.codehaus.plexus/plexus-utils/3.0.17/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.codehaus.plexus/plexus/3.3.1/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.codehaus.plexus/plexus/3.3.1/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.sonatype.forge/forge-parent/10/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.sonatype.forge/forge-parent/10/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.sonatype.forge/forge-parent/10/jars/forge-parent.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/releases/org/sonatype/forge/forge-parent/10/forge-parent-10.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.sonatype.spice/spice-parent/17/jars/spice-parent.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/releases/org/sonatype/spice/spice-parent/17/spice-parent-17.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.sonatype.spice/spice-parent/17/jars/spice-parent.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/releases/org/codehaus/plexus/plexus/3.3.1/plexus-3.3.1.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.codehaus.plexus/plexus/3.3.1/jars/plexus.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.apache.maven/maven-settings/3.2.2/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.typesafe.com/typesafe/ivy-releases/org.apache.maven/maven-settings/3.2.2/jars/maven-settings.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.apache.maven/maven-settings-builder/3.2.2/ivys/ivy.xml
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.apache.maven/maven-settings-builder/3.2.2/jars/maven-settings-builder.jar
[error] Server access Error: Connection timed out (Connection timed out) url=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.codehaus.plexus/plexus-components/1.3.1/ivys/ivy.xml
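Every error in this output is a connection timeout while resolving dependencies, and all of the URLs point at `repo.scala-sbt.org` or `repo.typesafe.com`. That pattern suggests a network or proxy problem between the build machine and those repositories rather than a defect in the v0.2.1 sources, although the thread does not confirm the cause. A minimal sketch for summarizing such a log by host; the function name `hosts_in_sbt_log` is hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches the url=... token that sbt prints after each "Server access Error".
URL_RE = re.compile(r"url=(\S+)")


def hosts_in_sbt_log(log_text):
    """Count how many url= entries in the log point at each host."""
    hosts = Counter()
    for url in URL_RE.findall(log_text):
        hosts[urlparse(url).netloc] += 1
    return hosts
```

If the resulting counter lists only one or two hosts, checking whether those hosts are reachable from the build machine (directly or through a proxy) is a reasonable first step.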

Can you appreciate my despair? Please help me!

sagartoms commented 6 years ago

@diaowenyang How did you solve this?

diaowenyang commented 6 years ago

I used v1.0.0. It can load Hive or Oracle easily.

sagartoms commented 6 years ago

@diaowenyang Thanks..

diaowenyang commented 6 years ago

If you have any questions, you can contact me at diaowenyang@yahoo.com. I will help.

LeeMo2K10 commented 6 years ago

@diaowenyang You can join the QQ group: 685507868