JahstreetOrg / spark-on-kubernetes-helm

Spark on Kubernetes infrastructure Helm charts repo
Apache License 2.0

Issues with load testing Livy #80

Open RajatSablok opened 2 years ago

RajatSablok commented 2 years ago

I am load testing (using Locust) my Livy server, which is deployed on Kubernetes pods using this Helm chart. I have tested with session recovery enabled using both the ZooKeeper and the filesystem state store. We also have basic auth enabled on the server through the NGINX ingress.

My config for session recovery looks like this:

LIVY_LIVY_SERVER_RECOVERY_MODE: {value: "recovery"}
LIVY_LIVY_SERVER_RECOVERY_STATE0STORE: {value: "filesystem"}
LIVY_LIVY_SERVER_RECOVERY_STATE0STORE_URL: {value: "file:///tmp/livy/store/state"}
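
Read as Livy settings, these variables appear to correspond to the `livy.server.recovery.*` keys. The sketch below makes that mapping explicit. The conversion rule (drop the leading `LIVY_`, lowercase, `_` becomes `.`, `0` becomes `-`) is an assumption inferred from the variable names, and `env_to_livy_key` is a hypothetical helper, not part of the chart.

```python
def env_to_livy_key(name: str) -> str:
    """Hypothetical helper: translate a chart env var name to a Livy conf key."""
    prefix = "LIVY_"
    if not name.startswith(prefix):
        raise ValueError(f"expected a {prefix}-prefixed name, got {name!r}")
    body = name[len(prefix):].lower()
    # Assumed convention: "_" separates key segments, "0" stands for "-".
    return body.replace("_", ".").replace("0", "-")


# The three variables from the config above.
settings = {
    "LIVY_LIVY_SERVER_RECOVERY_MODE": "recovery",
    "LIVY_LIVY_SERVER_RECOVERY_STATE0STORE": "filesystem",
    "LIVY_LIVY_SERVER_RECOVERY_STATE0STORE_URL": "file:///tmp/livy/store/state",
}

for name, value in settings.items():
    print(f"{env_to_livy_key(name)} = {value}")
```

Under that assumption, the state store URL points at a local directory inside the Livy pod, which matches the `/tmp/livy/store/state/...` paths in the errors.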

These are some of the errors I get when testing with session recovery enabled on the filesystem state store:

2022-05-02 14:12:45,487 : livy_test : CRITICAL : ERROR IN SUBMIT BATCHES: 500 "java.io.FileNotFoundException: File /tmp/livy/store/state/v1/batch/state.tmp does not exist"

2022-05-02 14:12:44,467 : livy_test : CRITICAL : ERROR IN SUBMIT BATCHES: 500 "org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /tmp/livy/store/state/v1/batch/state already exists."

2022-05-02 14:12:46,950 : livy_test : CRITICAL : ERROR IN SUBMIT BATCHES: 500 "ExitCodeException exitCode=1: chmod: cannot access '/tmp/livy/store/state/v1/batch/.state.tmp.crc': No such file or directory\n"
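
All three errors involve the same temporary file, `state.tmp`, and its rename to `state`. One pattern that would produce failures of this kind is two requests writing the same temporary path and then each renaming it. The sketch below replays that interleaving deterministically in a local directory. That Livy's filesystem state store follows this write-then-rename pattern is an assumption, not something confirmed here, and the function name and file contents are hypothetical.

```python
import os
import tempfile


def replay_shared_tmp_race(directory: str) -> str:
    """Replay two writers sharing one temp file; return the second writer's outcome."""
    tmp_path = os.path.join(directory, "state.tmp")
    final_path = os.path.join(directory, "state")

    # Writer A and writer B both write the shared temp file before either renames.
    # The contents are placeholders, not Livy's real state format.
    with open(tmp_path, "w") as f:
        f.write("written by A")
    with open(tmp_path, "w") as f:
        f.write("written by B")

    # Writer A renames first, which removes state.tmp.
    os.replace(tmp_path, final_path)

    # Writer B now tries to rename a file that no longer exists.
    try:
        os.replace(tmp_path, final_path)
    except FileNotFoundError as exc:
        return type(exc).__name__
    return "no error"


with tempfile.TemporaryDirectory() as d:
    print(replay_shared_tmp_race(d))
```

If this is what happens on the server, the failures would depend on request timing rather than on the configuration values themselves.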

Load testing configuration (locust conf):

users = 100
spawn-rate = 100
run-time = 1m
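
With these settings Locust starts all 100 users within about the first second, so the server receives many batch submissions almost at once. The actual Locust file is not shown here. As a rough standard-library sketch of the request each simulated user sends, assuming Livy's `POST /batches` endpoint behind basic auth, with the URL, credentials, and jar path as placeholders:

```python
import base64
import json
import urllib.request


def build_batch_request(base_url: str, user: str, password: str, app_file: str) -> urllib.request.Request:
    """Build (but do not send) a basic-auth POST to Livy's /batches endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"file": app_file}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/batches",
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_batch_request(
    "https://livy.example.com",
    "user",
    "secret",
    "local:///path/to/app.jar",
)
print(request.get_method(), request.full_url)
```

Sending it would be `urllib.request.urlopen(request)`; that call is left out so the sketch runs without a server.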

Can someone help me understand why I am getting the above errors?