Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Block 1174405218 is expected to be 67108864 bytes, but only 0 bytes are available. #12102

Closed Kmannth closed 1 year ago

Kmannth commented 4 years ago

Alluxio Version: 2.3.0 from alluxio-2.3.0-bin.tar.gz

Describe the bug: While running a KMeans benchmark with a data set more than 2x the size of DRAM, we get an error like "java.lang.IllegalStateException: Block 1174405218 is expected to be 67108864 bytes, but only 0 bytes are available."

To Reproduce: I don't know the minimal way to reproduce this; the following is what we do:

Set up a 4-node Spark system with:

Software:
CentOS Linux release 7.7.1908 (Core)
Alluxio: 2.3.0 - open-source tar file download
Spark: spark-2.4.6-bin-hadoop2.7 - open-source tar file download
Lustre ZFS: 2.12.2

Worker HW details:
384GB RAM, 2x Xeon
10GB Ethernet, 100GB OPA (not used with Alluxio)

We use Lustre, a distributed POSIX filesystem, as the under storage; from Alluxio's perspective it works just like NFS.
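For reference, the Lustre client mount is configured as the root under storage the same way any local/POSIX path would be. A minimal sketch in alluxio-site.properties (the mount point below is a hypothetical placeholder, not our actual path):

alluxio.master.mount.table.root.ufs=/mnt/lustre/alluxio-root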

Basic settings for Alluxio are:

alluxio.worker.memory.size=20GB
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk

We have seen the error with memory.size as high as 100GB.

Also: ALLUXIO_READ_TYPE is "CACHE" and ALLUXIO_WRITE_TYPE is "CACHE_THROUGH".
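These are spark-bench environment variables; as a sketch (assuming they map onto the standard Alluxio client properties alluxio.user.file.readtype.default and alluxio.user.file.writetype.default, which is how Alluxio's Spark docs pass them), the equivalent when submitting a job directly would look roughly like:

spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dalluxio.user.file.readtype.default=CACHE -Dalluxio.user.file.writetype.default=CACHE_THROUGH" \
  --conf "spark.executor.extraJavaOptions=-Dalluxio.user.file.readtype.default=CACHE -Dalluxio.user.file.writetype.default=CACHE_THROUGH" \
  ...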

Configure the test:

Download https://github.com/CODAIT/spark-bench/tree/legacy and configure it for use with Alluxio. Set the KMeans parameters in spark-bench/KMeans/conf/env.sh:
{noformat}
NUM_OF_POINTS=1800000000
NUM_OF_CLUSTERS=8
DIMENSIONS=70
SCALING=0.6
NUM_OF_PARTITIONS=33
{noformat}
The above settings will create around a 2TB data set in 33 files. Adjust NUM_OF_PARTITIONS to change the number of files (this changes the size of each file); see the rough size estimate below.
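As a rough sanity check of those sizes (a back-of-the-envelope estimate, not measured): ~2TB / 33 partitions ≈ 62GB per file, and at the default 64MB block size (the 67108864 bytes in the error message) that is roughly 990 blocks per file, or about 33,000 blocks in total, far more than the 20GB MEM tier on each worker can hold, so blocks are continually evicted and re-read from Lustre during the run.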

Run the test:
spark-bench/KMeans/bin/gen_data.sh
spark-bench/KMeans/bin/run.sh

Wait for things to crash. (can be hours) We see errors like this (the block number can vary) ' 20/09/03 17:08:53 INFO FileInputFormat: Total input paths to process : 132 20/09/03 18:55:38 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS 20/09/03 18:55:38 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS 20/09/04 04:43:01 ERROR AsyncEventQueue: Dropping event from queue appStatus. This likely means one of the listeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler. 20/09/04 04:43:01 WARN AsyncEventQueue: Dropped 1 events from appStatus since Wed Dec 31 16:00:00 PST 1969. 20/09/05 08:02:59 WARN TaskSetManager: Lost task 25207.0 in stage 292.0 (TID 10782288, 10.54.4.73, executor 30): java.lang.IllegalStateException: Block 1174405218 is expected to be 67108864 bytes, but only 0 bytes are available. Please ensure its metadata is consistent between Alluxio and UFS. at alluxio.shaded.client.com.google.common.base.Preconditions.checkState(Preconditions.java:842) at alluxio.client.block.stream.BlockInStream.read(BlockInStream.java:267) at alluxio.client.file.AlluxioFileInStream.read(AlluxioFileInStream.java:186) at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:115) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62) at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216) at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174) at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:248) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:48) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:293) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:224) at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at org.apache.spark.storage.memory.PartiallyUnrolledIterator.hasNext(MemoryStore.scala:753) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at org.apache.spark.rdd.RDD$$anonfun$zip$1$$anonfun$apply$26$$anon$2.hasNext(RDD.scala:911) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at org.apache.spark.mllib.clustering.KMeans$$anonfun$6.apply(KMeans.scala:309) at org.apache.spark.mllib.clustering.KMeans$$anonfun$6.apply(KMeans.scala:302) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346) at org.apache.spark.rdd.RDD.iterator(RDD.scala:310) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55) at org.apache.spark.scheduler.Task.run(Task.scala:123) at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) '

Expected behavior: The test runs and finishes without error.

Urgency: We have tested Alluxio 2.2.2 and, after over 12 hours of testing, we have not seen the error. We will continue to evaluate 2.2.2, but it looks better so far. If 2.2.2 works, that is fine for what we are doing.

Additional context: This is related to getting Alluxio merged into the Magpie project. https://github.com/LLNL/magpie/pull/321 is the current development to date.

Kmannth commented 4 years ago

With 2.2.2 we started testing with 20GB of memory for Alluxio and things seemed OK. Over the weekend we moved to 100GB of memory per client and hit the issue again. On 2.2.2 it seems harder to trigger, but it is still there.

A few things to note: the KMeans settings we hit this with are:

" MAX_ITERATION=300 NUM_RUN=10 "

This means the KMeans test will read the data around 3000 times, so we could be hitting some corner case in Alluxio, the ramdisk, or Lustre (the under filesystem). When the test is configured to run just 2 times, or with less data, we do not see the error messages.

After around 5 hours of running KMeans with the 100GB memory setting for Alluxio and the stressful settings above, it died. When I restarted the KMeans benchmark, the read failed on the same block after a few minutes. It seems like once this happens it stays in this state, but I did not know what to check or attempt, so I restarted Alluxio, removed the data, and started everything again.

Also, the systems do not swap, but there is memory pressure while the system is running.

Kmannth commented 4 years ago

Here is the 2nd error backtrace: 20/09/12 10:15:35 INFO FileInputFormat: Total input paths to process : 264 20/09/12 11:06:34 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS 20/09/12 11:06:34 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS 20/09/12 18:54:04 WARN TaskSetManager: Lost task 17478.0 in stage 122.0 (TID 2531518, 10.54.4.73, executor 14): java.lang.IllegalStateException: Block 4261412871 is expected to be 67108864 bytes, but only 0 bytes are available. Please ensure its metadata is consistent between Alluxio and UFS. at alluxio.shaded.client.com.google.common.base.Preconditions.checkState(Preconditions.java:842) at alluxio.client.block.stream.BlockInStream.read(BlockInStream.java:267) at alluxio.client.file.AlluxioFileInStream.read(AlluxioFileInStream.java:186) at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:115) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62) at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216) at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174) at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:248) at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:48) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:293) at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:224) at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at org.apache.spark.storage.memory.PartiallyUnrolledIterator.hasNext(MemoryStore.scala:753) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at org.apache.spark.rdd.RDD$$anonfun$zip$1$$anonfun$apply$26$$anon$2.hasNext(RDD.scala:911) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at org.apache.spark.mllib.clustering.KMeans$$anonfun$6.apply(KMeans.scala:309) at org.apache.spark.mllib.clustering.KMeans$$anonfun$6.apply(KMeans.scala:302) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346) at org.apache.spark.rdd.RDD.iterator(RDD.scala:310) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55) at org.apache.spark.scheduler.Task.run(Task.scala:123) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at 
java.lang.Thread.run(Thread.java:748)

Kmannth commented 4 years ago

Is there a way I can check the underfilesystem when this happens?

loowman commented 3 years ago

@Kmannth I think you can check whether the file in the under filesystem has changed.
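For example (a sketch; the paths below are hypothetical placeholders for the actual data set and the Lustre mount), one way is to compare Alluxio's metadata with the file in the under storage:

alluxio fs checkConsistency /kmeans-data                      # reports files whose Alluxio metadata disagrees with the UFS
alluxio fs stat /kmeans-data/part-00000                       # length and block list as Alluxio sees them
ls -l /mnt/lustre/alluxio-root/kmeans-data/part-00000         # size of the same file on the Lustre mount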

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

jja725 commented 1 year ago

Will close it for now, feel free to reopen it and contact us if this is still valid.