I sometimes get a "Filesystem closed" error.
Environment: Presto 0.237.1 with Kerberized Hive 3.1 / HDFS 3.1.1.
Two scenarios have occurred.
1. Traceback:
com.facebook.presto.spi.PrestoException: Filesystem closed
at com.facebook.presto.hive.GenericHiveRecordCursor.advanceNextPosition(GenericHiveRecordCursor.java:227)
at com.facebook.presto.hive.HiveRecordCursor.advanceNextPosition(HiveRecordCursor.java:175)
at com.facebook.presto.$gen.CursorProcessor_20200729_120431_381.process(Unknown Source)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.processColumnSource(ScanFilterAndProjectOperator.java:248)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:240)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:382)
at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:284)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:672)
at com.facebook.presto.operator.Driver.processFor(Driver.java:277)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1077)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:545)
at com.facebook.presto.$gen.Presto_0_237_e369a5a____20200717_080617_1.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:860)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:926)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:238)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:193)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:248)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:48)
at com.facebook.presto.hive.GenericHiveRecordCursor.advanceNextPosition(GenericHiveRecordCursor.java:209)
... 15 more
Suppressed: java.io.UncheckedIOException: java.io.IOException: Filesystem closed
at com.facebook.presto.hive.GenericHiveRecordCursor.close(GenericHiveRecordCursor.java:541)
at com.facebook.presto.hive.HiveUtil.closeWithSuppression(HiveUtil.java:883)
at com.facebook.presto.hive.GenericHiveRecordCursor.advanceNextPosition(GenericHiveRecordCursor.java:223)
... 15 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:702)
at java.io.FilterInputStream.close(FilterInputStream.java:181)
at org.apache.hadoop.util.LineReader.close(LineReader.java:166)
at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:284)
at com.facebook.presto.hive.GenericHiveRecordCursor.close(GenericHiveRecordCursor.java:538)
... 17 more
2. Traceback:
com.facebook.presto.spi.PrestoException: Failed to read ORC file: hdfs://xxxx
at com.facebook.presto.hive.orc.OrcBatchPageSource.getNextPage(OrcBatchPageSource.java:166)
at com.facebook.presto.hive.HivePageSource.getNextPage(HivePageSource.java:126)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.processPageSource(ScanFilterAndProjectOperator.java:272)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:237)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:382)
at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:284)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:672)
at com.facebook.presto.operator.Driver.processFor(Driver.java:277)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1077)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:545)
at com.facebook.presto.$gen.Presto_0_237_e369a5a____20200717_080617_1.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.UncheckedIOException: java.io.IOException: Filesystem closed
at com.facebook.presto.hive.orc.OrcBatchPageSource.close(OrcBatchPageSource.java:184)
at com.facebook.presto.hive.orc.OrcBatchPageSource.getNextPage(OrcBatchPageSource.java:139)
... 14 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:702)
at java.io.FilterInputStream.close(FilterInputStream.java:181)
at com.facebook.presto.hive.orc.HdfsOrcDataSource.close(HdfsOrcDataSource.java:56)
at com.facebook.presto.orc.CachingOrcDataSource.close(CachingOrcDataSource.java:124)
at com.google.common.io.Closer.close(Closer.java:214)
at com.facebook.presto.orc.AbstractOrcRecordReader.close(AbstractOrcRecordReader.java:388)
at com.facebook.presto.orc.OrcBatchRecordReader.close(OrcBatchRecordReader.java:39)
at com.facebook.presto.hive.orc.OrcBatchPageSource.close(OrcBatchPageSource.java:181)
On retry, the same SQL runs successfully.
My questions:
Why does this happen?
How can I prevent it from happening again?
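For context on what both tracebacks have in common: in each case the IOException originates in DFSClient.checkOpen, which throws "Filesystem closed" when the underlying DFSClient has already been shut down, even though the reader in this thread never called close() itself. Hadoop's FileSystem.get() returns a cached instance shared by every caller with the same (scheme, authority, UGI) key, so a close() issued anywhere invalidates the instance for all holders. The sketch below is a hypothetical, self-contained analogue of that shared-cache behavior (no Hadoop dependency; the Client class, cache key string, and read() value are all made up for illustration), showing how one holder closing the shared instance makes another holder's read fail with the same message:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal analogue of Hadoop's FileSystem cache: FileSystem.get()
// hands out one shared instance per (scheme, authority, UGI) key, so close()
// by any holder invalidates it for every other holder. DFSClient.checkOpen()
// then throws IOException("Filesystem closed") on the next read -- the error
// seen in both stack traces above.
public class SharedClientCacheDemo {
    static class Client {
        private volatile boolean open = true;
        void close() { open = false; }
        int read() throws IOException {
            // Mirrors DFSClient.checkOpen: fail if the client was closed.
            if (!open) throw new IOException("Filesystem closed");
            return 42; // stand-in for real data
        }
    }

    private static final Map<String, Client> CACHE = new HashMap<>();

    // Analogue of FileSystem.get(uri, conf): same key -> same shared object.
    static synchronized Client get(String key) {
        return CACHE.computeIfAbsent(key, k -> new Client());
    }

    public static void main(String[] args) {
        Client readerA = get("hdfs://nn:8020|user@REALM");
        Client readerB = get("hdfs://nn:8020|user@REALM"); // same cached object

        readerB.close(); // e.g. another reader finishing with "its" filesystem

        try {
            readerA.read(); // readerA never called close(), but the shared client is gone
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

This would explain the intermittent nature of the error and why a retry succeeds: the retry obtains a fresh (re-cached) filesystem instance. Whether the close comes from another split, a Kerberos relogin path, or something else in this deployment is exactly the open question above.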