apache / linkis

Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
https://linkis.apache.org/
Apache License 2.0

[Question] Found checksum error: b[0, 0]= org.apache.hadoop.fs.ChecksumException: Checksum error #4876

Closed xujianlszj closed 1 year ago

xujianlszj commented 1 year ago

Describe your questions

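Q1. When the Linkis service starts, it reports that the initial resource upload failed. The first file uploads successfully, but the second file fails with: Found checksum error: b[0, 0]= org.apache.hadoop.fs.ChecksumException: Checksum error. Following the QA document, I have already added this configuration to Hadoop's core-site.xml: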

name: fs.hdfs.impl, value: org.apache.hadoop.hdfs.DistributedFileSystem

image

Eureka service list

image

Some logs info or attach file

linkis-ps-publicservice.log:

2023-08-28 10:22:30.028 [INFO ] [qtp851478032-26 ] o.a.l.b.s.i.ResourceServiceImpl (97) [upload] [JobId-] - hadoop uploaded a resource and resourceId is a4e4ff36-c30a-4aeb-bed2-aa515c9d018f
2023-08-28 10:22:30.033 [INFO ] [qtp851478032-26 ] o.a.l.b.s.i.TaskServiceImpl (98) [createUploadTask] [JobId-] - Upload resource successfully. Update task(上传资源成功.更新任务) taskId:1-resourceId:a4e4ff36-c30a-4aeb-bed2-aa515c9d018f status is success .
2023-08-28 10:22:30.036 [INFO ] [qtp851478032-26 ] o.a.l.b.r.BmlRestfulApi (685) [uploadResource] [JobId-] - User hadoop submitted upload resource task successfully(用户 hadoop 提交上传资源任务成功, resourceId is a4e4ff36-c30a-4aeb-bed2-aa515c9d018f)
2023-08-28 10:22:32.745 [INFO ] [qtp851478032-22 ] o.a.l.s.u.ModuleUserUtils (68) [getProxyUserEntity] [JobId-] - user hadoop proxy to null operation uploadResource
2023-08-28 10:22:32.745 [INFO ] [qtp851478032-22 ] o.a.l.b.r.BmlRestfulApi (667) [uploadResource] [JobId-] - User hadoop starts uploading resources (用户 hadoop 开始上传资源)
2023-08-28 10:22:32.749 [INFO ] [qtp851478032-22 ] o.a.l.b.s.i.TaskServiceImpl (82) [createUploadTask] [JobId-] - Upload task information was successfully saved (成功保存上传任务信息).taskId:2,resourceTask:ResourceTask{id=2, resourceId='702ed8df-0505-4c8d-bcbc-ae13166d6450', version='v000001', operation='upload', state='scheduled', submitUser='hadoop', system='dss', instance='localhost:9105', clientIp='127.0.0.1', errMsg='null', extraParams='null', startTime=Mon Aug 28 10:22:32 CST 2023, endTime=null, lastUpdateTime=Mon Aug 28 10:22:32 CST 2023}
2023-08-28 10:22:32.751 [INFO ] [qtp851478032-22 ] o.a.l.b.s.i.TaskServiceImpl (87) [createUploadTask] [JobId-] - Successful update task (成功更新任务 ) taskId:2-resourceId:702ed8df-0505-4c8d-bcbc-ae13166d6450 status is running .
2023-08-28 10:22:32.751 [INFO ] [qtp851478032-22 ] o.a.l.b.c.ResourceHelperFactory (37) [getResourceHelper] [JobId-] - will store resource in hdfs
2023-08-28 10:22:32.765 [INFO ] [qtp851478032-22 ] o.a.l.s.f.FileSystem (102) [getParentPath] [JobId-] - Get Parent Path:/tmp/linkis/hadoop/bml/20230828
2023-08-28 10:22:32.793 [INFO ] [qtp851478032-22 ] o.a.l.s.u.FileSystemUtils (94) [createNewFileWithFileSystem] [JobId-] - doesn't need to call setOwner
2023-08-28 10:22:33.564 [INFO ] [qtp851478032-22 ] o.a.h.f.FSInputChecker (309) [readChecksumChunk] [JobId-] - Found checksum error: b[0, 0]= org.apache.hadoop.fs.ChecksumException: Checksum error: /tmp/linkis/hadoop/bml/20230828/702ed8df-0505-4c8d-bcbc-ae13166d6450 at 96992256

Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /tmp/linkis/hadoop/bml/20230828/702ed8df-0505-4c8d-bcbc-ae13166d6450 at 96992256

Log file:

linkis-ps-publicservice.log
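For readers unfamiliar with this kind of error, here is a minimal sketch of how chunk-level checksum verification can pinpoint a corrupted region. It is an illustration only, not Hadoop's implementation: it uses `zlib.crc32` and a hypothetical helper `first_bad_offset`, and it assumes a chunk size of 512 bytes, which is Hadoop's default bytes-per-checksum setting. The offset in the log above, 96992256, equals 512 × 189438, which would be consistent with an error reported at a chunk boundary under that default.

```python
import zlib
from typing import List, Optional

# Assumed chunk size: 512 bytes, matching Hadoop's default bytes-per-checksum.
BYTES_PER_CHECKSUM = 512


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Compute one CRC32 per fixed-size chunk (illustrative, not Hadoop's code)."""
    return [
        zlib.crc32(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def first_bad_offset(
    data: bytes, expected: List[int], chunk_size: int = BYTES_PER_CHECKSUM
) -> Optional[int]:
    """Return the byte offset of the first chunk whose CRC32 does not match.

    Hypothetical helper: returns None when every chunk matches.
    """
    for index, start in enumerate(range(0, len(data), chunk_size)):
        actual = zlib.crc32(data[start:start + chunk_size])
        if index >= len(expected) or actual != expected[index]:
            return start
    return None


# 2048 bytes of sample data, i.e. four 512-byte chunks.
original = bytes(range(256)) * 8
stored = chunk_checksums(original)

# Flip one byte inside the third chunk (offsets 1024..1535).
corrupted = bytearray(original)
corrupted[1030] ^= 0xFF

print(first_bad_offset(original, stored))          # None
print(first_bad_offset(bytes(corrupted), stored))  # 1024
```

The point of the sketch is that a mismatch is reported at the start of the failing chunk, so the offset in an error message narrows the damage to one chunk rather than to a single byte.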

aiceflower commented 1 year ago

First check whether your HDFS service is healthy, then check whether the bml package is complete. Alternatively, delete the bml package and restart the service.
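As a rough sketch of the first two checks, the snippet below assembles a few read-only HDFS commands. Several things here are assumptions rather than anything stated in the thread: that the `hdfs` CLI is on the PATH, that the directory from the log (`/tmp/linkis/hadoop/bml/20230828`) is the one worth inspecting, and that these particular commands are what the suggestion had in mind. The helper names `hdfs_check_commands` and `run_checks` are hypothetical. It defaults to a dry run that only prints the commands, and it deliberately does not delete anything.

```python
import shlex
import subprocess
from typing import List


def hdfs_check_commands(bml_dir: str) -> List[List[str]]:
    """Build read-only HDFS checks for a directory (hypothetical helper)."""
    return [
        ["hdfs", "dfsadmin", "-report"],                 # overall cluster and DataNode status
        ["hdfs", "dfs", "-ls", bml_dir],                 # list the uploaded resources
        ["hdfs", "fsck", bml_dir, "-files", "-blocks"],  # block-level health of those files
    ]


def run_checks(bml_dir: str, dry_run: bool = True) -> List[str]:
    """Return the rendered commands; execute them only when dry_run is False."""
    rendered = []
    for cmd in hdfs_check_commands(bml_dir):
        rendered.append(shlex.join(cmd))
        if not dry_run:
            # check=False: keep going so every check gets a chance to run.
            subprocess.run(cmd, check=False)
    return rendered


# Directory taken from the log in this issue; adjust for your deployment.
for line in run_checks("/tmp/linkis/hadoop/bml/20230828"):
    print(line)
```

Calling `run_checks(path, dry_run=False)` on a node with a configured Hadoop client would actually execute the commands.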