jaykelin / clickhouse-hdfs-loader

loading hdfs data to clickhouse
MIT License
73 stars 42 forks

Error when importing data #5

Open wpp0525 opened 5 years ago

wpp0525 commented 5 years ago

Hive table definition

CREATE TABLE lmmtmp.test_user(
  user_id bigint,
  user_name varchar(60),
  created_date string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'

ClickHouse table definition

CREATE TABLE lmmtmp.test_user (
  user_id Int64,
  user_name Nullable(String),
  created_date Date
) ENGINE = MergeTree(created_date, user_id, 8192)

The command I ran:

[hadoop@crm-master1 target]$ hadoop jar clickhouse-hdfs-loader-2.0.3-jar-with-dependencies.jar com.kugou.loader.clickhouse.ClickhouseHdfsLoader \
    -Dmapreduce.job.queuename=root.default \
    -i orc \
    --connect jdbc:clickhouse://10.113.1.18:8123/lmmtmp \
    --table test_user \
    --dt 2009-09-23 \
    --export-dir /user/hive/warehouse/lmmtmp.db/test_user \
    --extract-hive-partitions true \

Error message

18/11/05 17:37:52 INFO clickhouse.ClickhouseClientHolder: Clickhouse Loader : get clickhouse client[jdbc:clickhouse://10.113.1.18:8123/lmmtmp] for user=null
18/11/05 17:37:52 INFO clickhouse.ClickHouseDriver: Driver registered
18/11/05 17:37:52 INFO clickhouse.ClickHouseDriver: Creating connection
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: parse and loading parameter:
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: CLI_P_TABLE -> test_user
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: CLI_P_DATABASE -> lmmtmp
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: CL_TARGET_TABLE_FULLNAME -> lmmtmp.test_user
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: Clickhouse Loader : load data to table[lmmtmp.test_user].
18/11/05 17:37:53 INFO clickhouse.ClickhouseHdfsLoader: Clickhouse Loader: loading data to clickhouse with direct[true]
18/11/05 17:37:57 INFO input.FileInputFormat: Total input paths to process : 1
18/11/05 17:37:58 INFO mapreduce.JobSubmitter: number of splits:1
18/11/05 17:37:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1539850326169_0784
18/11/05 17:37:59 INFO impl.YarnClientImpl: Submitted application application_1539850326169_0784
18/11/05 17:37:59 INFO mapreduce.Job: The url to track the job: http://crm-master1:8088/proxy/application_1539850326169_0784/
18/11/05 17:37:59 INFO mapreduce.Job: Running job: job_1539850326169_0784
18/11/05 17:38:03 INFO mapreduce.Job: Job job_1539850326169_0784 running in uber mode : false
18/11/05 17:38:03 INFO mapreduce.Job: map 0% reduce 0%
18/11/05 17:38:18 INFO mapreduce.Job: map 100% reduce 0%
18/11/05 17:40:10 INFO mapreduce.Job: Job job_1539850326169_0784 completed successfully
18/11/05 17:40:11 INFO mapreduce.Job: Counters: 35
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=130284
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=9201
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=489956
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=122489
        Total vcore-milliseconds taken by all map tasks=244978
        Total megabyte-milliseconds taken by all map tasks=501714944
    Map-Reduce Framework
        Map input records=1000
        Map output records=0
        Input split bytes=135
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=48
        CPU time spent (ms)=2180
        Physical memory (bytes) snapshot=345632768
        Virtual memory (bytes) snapshot=5310775296
        Total committed heap usage (bytes)=467664896
        Peak Map Physical memory (bytes)=345632768
        Peak Map Virtual memory (bytes)=5310775296
    Clickhouse Loader Counters
        Distributed local hosts=1
        Failed records=1000
        Target table columns=3
    File Input Format Counters
        Bytes Read=9066
    File Output Format Counters
        Bytes Written=0
18/11/05 17:40:11 ERROR clickhouse.ClickhouseHdfsLoader: Clickhouse Loader: ERROR! Failed records = 1000

jaykelin commented 5 years ago

There should be an error message in the Map Task logs. Also, for the ClickHouse table, you could create a Distributed table and try again.
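
One way to get at the Map Task logs is sketched here, assuming YARN log aggregation is enabled on the cluster and the `yarn` CLI is on the PATH. The application ID is derived from the job ID printed in the job output above (the two share the same timestamp and sequence number); the grep pattern is only a rough illustrative filter, not something the loader defines.

```shell
#!/bin/sh
# Job ID as printed in the job output above.
JOB_ID="job_1539850326169_0784"

# A YARN application ID uses the same numeric suffix as the MapReduce job ID.
APP_ID="application_${JOB_ID#job_}"
echo "Fetching logs for $APP_ID"

# Dump the aggregated container logs and show lines around likely errors.
# Assumption: log aggregation is enabled; otherwise use the tracking URL
# from the job output or look on the NodeManager host directly.
if command -v yarn >/dev/null 2>&1; then
    yarn logs -applicationId "$APP_ID" | grep -iE -B2 -A20 'exception|error' || true
else
    echo "yarn CLI not found on this host"
fi
```

Run it on a host with the Hadoop client installed, such as the one the loader was launched from.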