exacaster / lighter

REST API for Apache Spark on K8S or YARN
MIT License

Sizing cpu/mem versus number of sessions #611

Closed julienlau closed 2 weeks ago

julienlau commented 1 year ago

Hi,

The default settings allow a maximum of 5 batches running plus 5 starting, but there is no setting for the maximum number of sessions. Do you have an idea of the maximum number of sessions Lighter can handle? Is it OK to have 100 sessions? How would you size the Lighter pod (CPU, memory) in that case, in a configuration where the DB is stored on a dedicated Postgres instance?

During my tests, I stuck to 2-3 sessions at most with H2:file, and I observed that the Lighter pod runs fine with mem=1GB and cpu=1.

Regards, JL

Minutis commented 1 year ago

We have not had a use case that needed 100 running sessions, so it's hard to say what would happen. If I had to guess, I would say that any issues would probably come not from the sessions but from the internals of Lighter itself. I am not sure how long it would take to check that many sessions for status updates, and this might cause side issues such as those mentioned in https://github.com/exacaster/lighter/issues/617.

Regarding pod sizing, again, I do not have any benchmarks available. From my experience, though, Lighter is not a memory-heavy application, and running it for long periods does not noticeably increase its memory footprint. Regarding CPU, maybe @pdambrauskas could comment on parallelism when doing scheduled tasks? Lighter itself is lightweight and transparent enough not to require a lot of resources.

pdambrauskas commented 1 year ago

The process responsible for launching newly scheduled sessions starts at most 10 sessions per minute. From what I saw, CPU and memory usage increases significantly only when a significantly larger number of applications is launched in parallel. This value is configurable for batch jobs.
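
For a rough sense of what this launch rate means for 100 sessions, here is a minimal back-of-the-envelope sketch. It assumes the cap is applied as a fixed number of launches per one-minute scheduler cycle and ignores the time the cluster itself needs to start each Spark application. The function name `minutes_to_launch` is hypothetical and not part of Lighter.

```python
import math


def minutes_to_launch(pending_sessions: int, launches_per_minute: int = 10) -> int:
    """Lower bound on the scheduler minutes needed to start all pending sessions.

    Assumption: at most `launches_per_minute` sessions are started per
    one-minute cycle; cluster-side startup time is not modelled.
    """
    if pending_sessions < 0:
        raise ValueError("pending_sessions must be non-negative")
    if launches_per_minute <= 0:
        raise ValueError("launches_per_minute must be positive")
    return math.ceil(pending_sessions / launches_per_minute)


if __name__ == "__main__":
    for n in (3, 25, 100):
        print(f"{n} sessions -> at least {minutes_to_launch(n)} minute(s)")
```

Under these assumptions, 100 sessions submitted at once would take at least 10 minutes just to be launched, before any status-polling overhead is considered.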

julienlau commented 1 year ago

Ok thanks.

If scaling the number of sessions may be an issue, then it is confusing to have parameters for the max number of batches but not for the max number of sessions.

Unfortunately, I do not have a large k8s cluster to load test Lighter with a lot of sessions.

Minutis commented 2 weeks ago

Closing the issue as stale for now.