In-memory analytics benchmark run question

minkyuSnow commented 1 year ago

Hello

I am running the in-memory analytics benchmark on an Arm CPU with 4GB of memory.

Running the benchmark with the 144MB data set produces the "Movies Recommended" and "Benchmark Execution Time" results.

However, some of the messages printed while the result is being computed look like errors, so I am not sure whether the benchmark ran normally.

Here are the commands I used to run the benchmark. I am running everything on a single node (Arm CPU, 4GB memory).

$ docker create --name movielens-data cloudsuite/movielens-dataset
$ docker run -dP --net host --name spark-master cloudsuite/spark:3.3.2 master
$ docker run -dP --net host --volumes-from movielens-data --name spark-worker-01 cloudsuite/spark:3.3.2 worker spark://NODE_IP:7077
$ docker run --rm --net host --volumes-from movielens-data cloudsuite/in-memory-analytics /data/ml-latest /data/myratings.csv --driver-memory 2g --executor-memory 2g --master spark://NODE_IP:7077

스크린샷 2023-07-14 14 46 35 스크린샷 2023-07-14 14 46 54 스크린샷 2023-07-14 14 47 02

xusine commented 1 year ago

Hello,

After looking at the first error log, I believe the cause is running out of memory (java.lang.OutOfMemoryError). If possible, give the node more memory. Otherwise, you may consider redistributing the memory between the driver and the executor, e.g., 1GB for the driver and 3GB for the executor. Please let me know if that helps!

minkyuSnow commented 1 year ago

Thank you for your reply.

$ docker create --name movielens-data cloudsuite/movielens-dataset
$ docker run -dP --net host --name spark-master cloudsuite/spark:3.3.2 master
$ docker run -dP --net host --volumes-from movielens-data --name spark-worker-01 cloudsuite/spark:3.3.2 worker spark://NODE_IP:7077
$ docker run --rm --net host --volumes-from movielens-data cloudsuite/in-memory-analytics /data/ml-latest /data/myratings.csv --driver-memory 1g --executor-memory 3g --master spark://NODE_IP:7077

As you suggested, I gave the driver 1GB and the executor 3GB, but unlike the run with 2GB each, this one reports that resources are insufficient. The reverse split (3GB for the driver, 1GB for the executor) does not run either.

If there is no other setting I can change, should I conclude that the physical memory is simply insufficient?

스크린샷 2023-07-14 20 32 14

xusine commented 1 year ago

Hello,

Thanks for doing the test. Indeed, this means the memory is not enough to run the workload.
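The arithmetic behind "not enough memory" can be sketched in a few lines of shell. This is an illustrative check only: `fits_budget` is a hypothetical helper, not part of CloudSuite or Spark, and the 1024 MB reserve for the OS and the Spark daemons is an assumption (chosen to mirror the Spark standalone worker's documented default of offering total memory minus 1 GiB). The `--driver-memory` and `--executor-memory` values are JVM heap sizes, so real usage is somewhat higher than the sum shown here.

```shell
# fits_budget PHYS_MB DRIVER_MB EXECUTOR_MB RESERVE_MB
# Hypothetical helper: compares the requested heaps plus an assumed
# reserve against the physical memory of the node.
fits_budget() {
  phys_mb="$1"
  total_mb=$(( $2 + $3 + $4 ))
  if [ "$total_mb" -le "$phys_mb" ]; then
    echo "fits"
  else
    echo "over by $(( total_mb - phys_mb )) MB"
  fi
}

# 4GB node, 2g driver + 2g executor, 1GB assumed reserve
fits_budget 4096 2048 2048 1024
# 4GB node, 1g driver + 3g executor, 1GB assumed reserve
fits_budget 4096 1024 3072 1024
```

Under these assumptions both splits tried in this thread exceed the 4GB budget, which is consistent with the conclusion that the node needs more physical memory rather than a different split.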

There might be a workaround: you can try restricting the number of cores allocated to the container with --cpuset-cpus. Memory consumption may drop when the worker count is smaller. The trade-off is that the run will take longer to finish.
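As a rough sketch of the --cpuset-cpus suggestion, the worker launch command from earlier in the thread could be pinned to a subset of cores. The snippet below only prints the command so it can be reviewed before running. The core range `0-1` is an arbitrary illustrative choice, `CPUSET` and `worker_cmd` are names introduced here, and NODE_IP remains a placeholder. Applying the flag to the worker container (rather than the benchmark container) is an assumption; the thread does not say which container to restrict.

```shell
# Sketch only: build and print the worker launch command with a CPU pin.
# CPUSET is a hypothetical choice (cores 0 and 1); override it via the environment.
CPUSET="${CPUSET:-0-1}"

# Same worker command as in the thread, with --cpuset-cpus added.
worker_cmd="docker run -dP --net host --cpuset-cpus $CPUSET --volumes-from movielens-data --name spark-worker-01 cloudsuite/spark:3.3.2 worker spark://NODE_IP:7077"

echo "$worker_cmd"
```

Whether this actually lowers memory use enough to avoid the out-of-memory errors on a 4GB node is not confirmed in this thread; it only reduces the number of cores the container can use.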

minkyuSnow commented 1 year ago

Thank you for your reply.

Are you saying the problem occurs because the physical memory is small relative to the amount of data?

The run with 2GB allocated to each showed errors, but it still produced a result. Can that run be considered normal?

xusine commented 1 year ago

Yes. It implies that the physical memory is not enough. You can describe it as a normal run, but not a representative one.

minkyuSnow commented 1 year ago

Thank you for your reply.

I understand a little better now. So the result came out, but you are saying it is hard to treat the run as normal because a memory error occurred?

As an example, here is a screenshot of the result. [Screenshot 2023-07-14 22 04 07]

xusine commented 1 year ago

Hello,

Yes. Even though the workload finished successfully and you got a result, my understanding is that it still cannot represent a real server: this workload is meant to run on a server with a large amount of memory, so you should not see any out-of-memory errors during the run.

However, it is OK if your target case is not a server :)

Best,

minkyuSnow commented 1 year ago

Thank you for your kind reply. You have been very helpful. Thank you.

parsa-epfl / cloudsuite

In-memory analytics benchmark run question #436