apache / iotdb

Apache IoTDB
https://iotdb.apache.org/
Apache License 2.0
5.61k stars 1.02k forks source link

WARN o.a.i.d.w.r.TsFileRecoverPerformer:171 - Cannot deserialize TsFileResource ... construct it using TsFileSequenceReader #7951

Open veken-v opened 2 years ago

veken-v commented 2 years ago

我们的系统很经常断电(相当于直接拔掉电源)。当插上电源时会自动重启并恢复系统

我们是使用 docker 部署 , iotdb 镜像是 0.13.2-node 就部署一个节点。

在一次恢复中,发现了如下异常


2022-11-09 14:54:26,705 [pool-10-IoTDB-Recovery-Thread-Pool-1] WARN  o.a.i.d.w.r.TsFileRecoverPerformer:171 - Cannot deserialize TsFileResource /iotdb/sbin/../data/data/unsequence/root.tbt/0/0/1667973720009-30-0-0.tsfile, construct it using TsFileSequenceReader 
java.io.IOException: Intend to read 4 bytes but -1 are actually returned
    at org.apache.iotdb.tsfile.utils.ReadWriteIOUtils.readInt(ReadWriteIOUtils.java:512)
    at org.apache.iotdb.db.engine.storagegroup.timeindex.V012FileTimeIndex.deserialize(V012FileTimeIndex.java:46)
    at org.apache.iotdb.db.engine.storagegroup.timeindex.V012FileTimeIndex.deserialize(V012FileTimeIndex.java:33)
    at org.apache.iotdb.db.engine.storagegroup.TsFileResource.deserialize(TsFileResource.java:259)
    at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromFile(TsFileRecoverPerformer.java:169)
    at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recover(TsFileRecoverPerformer.java:108)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recoverTsFiles(VirtualStorageGroupProcessor.java:785)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recover(VirtualStorageGroupProcessor.java:529)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.<init>(VirtualStorageGroupProcessor.java:404)
    at org.apache.iotdb.db.engine.StorageEngine.buildNewStorageGroupProcessor(StorageEngine.java:784)
    at org.apache.iotdb.db.engine.storagegroup.virtualSg.StorageGroupManager.lambda$asyncRecover$0(StorageGroupManager.java:245)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
2022-11-09 14:54:26,707 [main] INFO  o.a.i.d.c.IoTDBThreadPoolFactory:232 - new SynchronousQueue thread pool: RPC-Client 
2022-11-09 14:54:26,708 [RPC] INFO  o.a.i.d.s.t.ThriftServiceThread:241 - The RPC ServerService service thread begin to run... 
2022-11-09 14:54:26,710 [Thread-1] ERROR o.a.i.d.c.IoTDBDefaultThreadExceptionHandler:31 - Exception in thread Thread-1-20 
org.apache.iotdb.db.exception.runtime.StorageEngineFailureException: StorageEngine failed to recover.
    at org.apache.iotdb.db.engine.StorageEngine.lambda$recover$1(StorageEngine.java:410)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Negative position
    at java.base/java.util.concurrent.FutureTask.report(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.iotdb.db.engine.StorageEngine.lambda$recover$1(StorageEngine.java:408)
    ... 1 common frames omitted
Caused by: java.lang.IllegalArgumentException: Negative position
    at java.base/sun.nio.ch.FileChannelImpl.read(Unknown Source)
    at org.apache.iotdb.tsfile.read.reader.LocalTsFileInput.read(LocalTsFileInput.java:90)
    at org.apache.iotdb.tsfile.read.TsFileSequenceReader.readTailMagic(TsFileSequenceReader.java:229)
    at org.apache.iotdb.tsfile.read.TsFileSequenceReader.loadMetadataSize(TsFileSequenceReader.java:202)
    at org.apache.iotdb.tsfile.read.TsFileSequenceReader.<init>(TsFileSequenceReader.java:139)
    at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromReader(TsFileRecoverPerformer.java:181)
    at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromFile(TsFileRecoverPerformer.java:175)
    at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recover(TsFileRecoverPerformer.java:108)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recoverTsFiles(VirtualStorageGroupProcessor.java:785)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recover(VirtualStorageGroupProcessor.java:529)
    at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.<init>(VirtualStorageGroupProcessor.java:404)
    at org.apache.iotdb.db.engine.StorageEngine.buildNewStorageGroupProcessor(StorageEngine.java:784)
    at org.apache.iotdb.db.engine.storagegroup.virtualSg.StorageGroupManager.lambda$asyncRecover$0(StorageGroupManager.java:245)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    ... 1 common frames omitted
2022-11-09 14:54:26,808 [main] INFO  o.a.i.d.s.t.ThriftService:135 - IoTDB: start RPC ServerService successfully, listening on ip 0.0.0.0 port 6667 
github-actions[bot] commented 2 years ago

Hi, this is your first issue in IoTDB project. Thanks for your report. Welcome to join the community!

HTHou commented 2 years ago

这个应该是 /data/data/unsequence/root.tbt/0/0/1667973720009-30-0-0.tsfile.resource文件在生成时系统断电造成resource文件损坏导致的,不会影响正常使用。 在重启时IoTDB会进行文件自检,发现损坏的resource文件后会从 tsfile 里恢复和修复出正常的 resource文件。这个 WARN 日志就是在说这个过程,所以不是ERROR。再次重启后应该不会再次出现这个日志。

这里其实可以不打堆栈信息,仅打印一些提示信息即可。

veken-v commented 2 years ago

此时使用客户查询数据如:

select LAST MonthCompRate from root.tbt.71a97ba0404011ecb05821ee402a74e0

就会出现异常:

Caused by: org.apache.iotdb.rpc.StatementExecutionException: 317: the sg root.tbt may not ready now, please wait and retry later at org.apache.iotdb.rpc.RpcUtils.verifySuccess(RpcUtils.java:98) at org.apache.iotdb.rpc.RpcUtils.verifySuccessWithRedirection(RpcUtils.java:119) at org.apache.iotdb.session.SessionConnection.verifySuccessWithRedirectionWrapper(SessionConnection.java:528) at org.apache.iotdb.session.SessionConnection.executeQueryStatement(SessionConnection.java:369) at org.apache.iotdb.session.Session.executeStatementMayRedirect(Session.java:653) at org.apache.iotdb.session.Session.executeQueryStatement(Session.java:625) at org.apache.iotdb.session.pool.SessionPool.executeQueryStatement(SessionPool.java:2207) at org.myapp.server.dao.timeseries.IoTDBBaseTimeSeriesLatestDao.findLatest(IoTDBBaseTimeSeriesLatestDao.java:48)

客户端版本:

org.apache.iotdb iotdb-session 0.13.1
HTHou commented 2 years ago

此时使用客户查询数据如:

select LAST MonthCompRate from root.tbt.71a97ba0404011ecb05821ee402a74e0

就会出现异常:

Caused by: org.apache.iotdb.rpc.StatementExecutionException: 317: the sg root.tbt may not ready now, please wait and retry later at org.apache.iotdb.rpc.RpcUtils.verifySuccess(RpcUtils.java:98) at org.apache.iotdb.rpc.RpcUtils.verifySuccessWithRedirection(RpcUtils.java:119) at org.apache.iotdb.session.SessionConnection.verifySuccessWithRedirectionWrapper(SessionConnection.java:528) at org.apache.iotdb.session.SessionConnection.executeQueryStatement(SessionConnection.java:369) at org.apache.iotdb.session.Session.executeStatementMayRedirect(Session.java:653) at org.apache.iotdb.session.Session.executeQueryStatement(Session.java:625) at org.apache.iotdb.session.pool.SessionPool.executeQueryStatement(SessionPool.java:2207) at org.myapp.server.dao.timeseries.IoTDBBaseTimeSeriesLatestDao.findLatest(IoTDBBaseTimeSeriesLatestDao.java:48)

客户端版本: org.apache.iotdb iotdb-session 0.13.1

may not ready now 也不是错误,这是iotdb进行重启自检,这个存储组还没有自检完成。可以等待一段时间再进行查询。

veken-v commented 1 year ago

其实 iotdb 一直没有恢复过来, 导致客户端操作一直提示 the sg root.tbwf may not ready now, please wait and retry

2022-11-11 17:52:10,215 [main] INFO o.a.i.d.c.IoTDBThreadPoolFactory:232 - new SynchronousQueue thread pool: RPC-Client 2022-11-11 17:52:10,217 [RPC] INFO o.a.i.d.s.t.ThriftServiceThread:241 - The RPC ServerService service thread begin to run... 2022-11-11 17:52:10,216 [pool-10-IoTDB-Recovery-Thread-Pool-1] WARN o.a.i.d.w.r.TsFileRecoverPerformer:171 - Cannot deserialize TsFileResource /iotdb/sbin/../data/data/unsequence/root.tbwf/0/0/1668154932126-488-0-0.tsfile, construct it using TsFileSequenceReader java.io.IOException: Intend to read 4 bytes but -1 are actually returned at org.apache.iotdb.tsfile.utils.ReadWriteIOUtils.readInt(ReadWriteIOUtils.java:512) at org.apache.iotdb.db.engine.storagegroup.timeindex.V012FileTimeIndex.deserialize(V012FileTimeIndex.java:46) at org.apache.iotdb.db.engine.storagegroup.timeindex.V012FileTimeIndex.deserialize(V012FileTimeIndex.java:33) at org.apache.iotdb.db.engine.storagegroup.TsFileResource.deserialize(TsFileResource.java:259) at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromFile(TsFileRecoverPerformer.java:169) at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recover(TsFileRecoverPerformer.java:108) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recoverTsFiles(VirtualStorageGroupProcessor.java:785) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recover(VirtualStorageGroupProcessor.java:529) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.(VirtualStorageGroupProcessor.java:404) at org.apache.iotdb.db.engine.StorageEngine.buildNewStorageGroupProcessor(StorageEngine.java:784) at org.apache.iotdb.db.engine.storagegroup.virtualSg.StorageGroupManager.lambda$asyncRecover$0(StorageGroupManager.java:245) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) 2022-11-11 17:52:10,218 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.d.e.v.SimpleFileVersionController:113 - Version file updated, previous: /iotdb/sbin/../data/system/storage_groups/root.tb/0/upgrade/Version-5400, current: /iotdb/sbin/../data/system/storage_groups/root.tb/0/upgrade/Version-5500 2022-11-11 17:52:10,223 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.d.c.IoTDBThreadPoolFactory:174 - new single scheduled thread pool: Compaction_Schedule-root.tb-0 2022-11-11 17:52:10,224 [Thread-1] ERROR o.a.i.d.c.IoTDBDefaultThreadExceptionHandler:31 - Exception in thread Thread-1-21 org.apache.iotdb.db.exception.runtime.StorageEngineFailureException: StorageEngine failed to recover. at org.apache.iotdb.db.engine.StorageEngine.lambda$recover$1(StorageEngine.java:410) at java.base/java.lang.Thread.run(Unknown Source) Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Negative position at java.base/java.util.concurrent.FutureTask.report(Unknown Source) at java.base/java.util.concurrent.FutureTask.get(Unknown Source) at org.apache.iotdb.db.engine.StorageEngine.lambda$recover$1(StorageEngine.java:408) ... 1 common frames omitted Caused by: java.lang.IllegalArgumentException: Negative position at java.base/sun.nio.ch.FileChannelImpl.read(Unknown Source) at org.apache.iotdb.tsfile.read.reader.LocalTsFileInput.read(LocalTsFileInput.java:90) at org.apache.iotdb.tsfile.read.TsFileSequenceReader.readTailMagic(TsFileSequenceReader.java:229) at org.apache.iotdb.tsfile.read.TsFileSequenceReader.loadMetadataSize(TsFileSequenceReader.java:202) at org.apache.iotdb.tsfile.read.TsFileSequenceReader.(TsFileSequenceReader.java:139) at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromReader(TsFileRecoverPerformer.java:181) at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recoverResourceFromFile(TsFileRecoverPerformer.java:175) at org.apache.iotdb.db.writelog.recover.TsFileRecoverPerformer.recover(TsFileRecoverPerformer.java:108) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recoverTsFiles(VirtualStorageGroupProcessor.java:785) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.recover(VirtualStorageGroupProcessor.java:529) at org.apache.iotdb.db.engine.storagegroup.VirtualStorageGroupProcessor.(VirtualStorageGroupProcessor.java:404) at org.apache.iotdb.db.engine.StorageEngine.buildNewStorageGroupProcessor(StorageEngine.java:784) at org.apache.iotdb.db.engine.storagegroup.virtualSg.StorageGroupManager.lambda$asyncRecover$0(StorageGroupManager.java:245) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ... 1 common frames omitted 2022-11-11 17:52:10,232 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.d.e.s.VirtualStorageGroupProcessor:568 - The virtual storage group root.tb[0] is recovered successfully 2022-11-11 17:52:10,233 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.m.m.t.MicrometerAutoGauge:40 - mem,[name, storageGroup_root.tb] 2022-11-11 17:52:10,233 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.d.c.IoTDBThreadPoolFactory:174 - new single scheduled thread pool: WAL-Trim-root.tb-0 2022-11-11 17:52:10,234 [pool-10-IoTDB-Recovery-Thread-Pool-2] INFO o.a.i.d.e.s.v.StorageGroupManager:259 - Storage Group root.tb has been recovered 1/1 2022-11-11 17:52:10,317 [main] INFO o.a.i.d.s.t.ThriftService:135 - IoTDB: start RPC ServerService successfully, listening on ip 0.0.0.0 port 6667 2022-11-11 17:52:10,318 [main] INFO o.a.i.db.service.IoTDB:167 - IoTDB is set up, now may some sgs are not ready, please wait several seconds...

这里卡了很久