aws-samples / eks-spark-benchmark

Performance optimization for Spark running on Kubernetes
Apache License 2.0
85 stars, 28 forks

Consistent store / ListBucket operations #13

Open itayB opened 2 years ago

itayB commented 2 years ago

Under "Limitations", the linked doc says: "It is suggested to use a consistent store with staging committers together."

Can you elaborate on that? Doesn't S3 already have strong consistency?

From my AWS bill I can see that we're making a lot of ListBucket requests. I wonder if the default committer is responsible for that under the hood.

Do I have to use either EFS (for the staging committer) or DynamoDB (for the magic committer)?

itayB commented 2 years ago

I think that I found the answer here: https://github.com/apache/spark/pull/32518
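
For anyone landing here with the same question, below is a minimal sketch of how an S3A committer is usually selected through Spark configuration. The helper names `s3a_committer_conf` and `to_submit_args` are hypothetical and only for illustration. The configuration keys and class names are the ones I recall from the Spark cloud-integration and Hadoop S3A committer documentation; treat them as assumptions and check them against your Spark/Hadoop versions (the two `org.apache.spark.internal.io.cloud` classes should also require the `spark-hadoop-cloud` module on the classpath).

```python
# Hypothetical helpers (not part of Spark or this repo): build the Spark
# conf entries that select an S3A committer. Key and class names are
# assumptions taken from the Spark/Hadoop docs; verify for your versions.

VALID_COMMITTERS = ("directory", "partitioned", "magic")


def s3a_committer_conf(name="magic"):
    """Return a dict of Spark conf entries for the given S3A committer."""
    if name not in VALID_COMMITTERS:
        raise ValueError(f"unknown S3A committer: {name}")
    conf = {
        # Which S3A committer to use instead of the rename-based default.
        "spark.hadoop.fs.s3a.committer.name": name,
        # Spark-side bindings (assumed to need the spark-hadoop-cloud module).
        "spark.sql.sources.commitProtocolClass":
            "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
        "spark.sql.parquet.output.committer.class":
            "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
    }
    if name == "magic":
        # Only the magic committer needs this switch.
        conf["spark.hadoop.fs.s3a.committer.magic.enabled"] = "true"
    return conf


def to_submit_args(conf):
    """Flatten a conf dict into spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(s3a_committer_conf("magic"))))
```

The printed arguments can be pasted into a `spark-submit` command. Comparing the ListBucket count on the bill before and after switching committers is probably the simplest way to check whether the default committer was responsible.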