jaykelin / clickhouse-hdfs-loader

loading hdfs data to clickhouse
MIT License
73 stars 42 forks

Code error #2

Open matribots opened 6 years ago

matribots commented 6 years ago

The hashString call in the following snippet from AbstractClickhouseLoaderMapper.java fails to compile; the error says an argument is missing. How can I fix this?

    if (StringUtils.isNotBlank(clickhouseDistributedTableShardingKeyValue)) {
        code = hashFn.hashString(clickhouseDistributedTableShardingKeyValue).asInt() & Integer.MAX_VALUE;
    } else {
        code = hashFn.hashString(UUID.randomUUID().toString()).asInt() & Integer.MAX_VALUE;
    }
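For readers hitting the same error: the thread does not say which library hashFn comes from. If it is a Guava HashFunction (an assumption), a likely cause is a Guava version mismatch, because current Guava releases only offer hashString(CharSequence, Charset), and a one-argument call no longer compiles. The sketch below uses only the JDK to show what the snippet computes with the charset made explicit. CRC32 is a stand-in for hashFn, and the class name, method names, and shard count are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.zip.CRC32;

public class ShardCodeSketch {

    // Stand-in for hashFn (NOT the project's hash function): CRC32 over the
    // UTF-8 bytes of the key. The charset is named explicitly, which is the
    // kind of second argument a two-parameter hashString expects.
    public static int hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue();
    }

    // Mirrors the snippet: hash the sharding key value, or a random UUID when
    // the value is blank, then clear the sign bit so the result is never negative.
    // trim().isEmpty() only approximates StringUtils.isNotBlank.
    public static int shardCode(String shardingKeyValue) {
        boolean blank = shardingKeyValue == null || shardingKeyValue.trim().isEmpty();
        String source = blank ? UUID.randomUUID().toString() : shardingKeyValue;
        return hash(source) & Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        int shards = 4; // hypothetical shard count, for illustration only
        int code = shardCode("user-42");
        // Using the code modulo the shard count is an assumption about how
        // the mapper consumes it; the thread does not show that part.
        System.out.println("code=" + code + " shard=" + (code % shards));
    }
}
```

The mask with Integer.MAX_VALUE matters because asInt() style results can be negative, and a negative value would give a negative remainder if the code is later reduced modulo a shard count.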

jaykelin commented 6 years ago

Does --table point to a Distributed table? And what does its CREATE TABLE script look like?

matribots commented 6 years ago

I haven't used the command line yet. I ran into this error while compiling the source in IDEA; the call in the source is missing an argument.

jaykelin commented 6 years ago

Try skipping the tests.

matribots commented 6 years ago

The compile error is resolved. When packaging the executable jar, which class should I specify as the main class? (Both ClickhouseClientHolder and ClickhouseHdfsLoader have a main method.)

jaykelin commented 6 years ago

ClickhouseHdfsLoader

matribots commented 6 years ago

OK, thanks, I'll give it a try.

jaykelin commented 6 years ago

You can use this example as a reference:

hadoop jar clickhouse-hdfs-loader.jar com.kugou.loader.clickhouse.ClickhouseHdfsLoader \
-Dmapreduce.job.queuename=root.default \
-i text \
--connect jdbc:clickhouse://xxx:8123/database \
--username xx --password xx \
--table distributed_table \
--dt 2018-01-01 \
--export-dir hdfs_path  \
--extract-hive-partitions true \
--exclude-fields 0,4,5,8 \
--daily false

matribots commented 6 years ago

Got it, thanks!

matribots commented 6 years ago

I ran it on Hadoop. The Map tasks show 100% and the Reduce tasks show 0%, and the job ends with the following error: ERROR clickhouse.ClickhouseHdfsLoader: Clickhouse Loader: ERROR! Failed records = 326663. What is going on here?

jaykelin commented 6 years ago

Check the map task logs.

jaykelin commented 6 years ago

@FangLietao The args4j package is probably missing:

        <dependency>
            <groupId>args4j</groupId>
            <artifactId>args4j</artifactId>
            <version>2.33</version>
        </dependency>

jaykelin commented 6 years ago

@FangLietao Are you running it with the -with-dependencies.jar jar?