materialsproject / fireworks

The Fireworks Workflow Management Repo.
https://materialsproject.github.io/fireworks

Too many webgui workers #467

Closed janosh closed 2 years ago

janosh commented 2 years ago

When launching the web GUI, fireworks instantiates what seems to me an unnecessarily large number of workers based on the available CPU thread count.

$ lpad webgui -s
[2021-11-19 12:27:30 +0000] [9192] [INFO] Starting gunicorn 20.1.0
[2021-11-19 12:27:30 +0000] [9192] [INFO] Listening at: http://127.0.0.1:5000 (9192)
[2021-11-19 12:27:30 +0000] [9192] [INFO] Using worker: sync
[2021-11-19 12:27:30 +0000] [9193] [INFO] Booting worker with pid: 9193
[2021-11-19 12:27:30 +0000] [9194] [INFO] Booting worker with pid: 9194
[2021-11-19 12:27:30 +0000] [9195] [INFO] Booting worker with pid: 9195
[2021-11-19 12:27:30 +0000] [9196] [INFO] Booting worker with pid: 9196
[2021-11-19 12:27:30 +0000] [9197] [INFO] Booting worker with pid: 9197
[2021-11-19 12:27:30 +0000] [9198] [INFO] Booting worker with pid: 9198
[2021-11-19 12:27:30 +0000] [9199] [INFO] Booting worker with pid: 9199
[2021-11-19 12:27:30 +0000] [9200] [INFO] Booting worker with pid: 9200
[2021-11-19 12:27:30 +0000] [9201] [INFO] Booting worker with pid: 9201
[2021-11-19 12:27:30 +0000] [9202] [INFO] Booting worker with pid: 9202
[2021-11-19 12:27:30 +0000] [9203] [INFO] Booting worker with pid: 9203
[2021-11-19 12:27:31 +0000] [9204] [INFO] Booting worker with pid: 9204
[2021-11-19 12:27:31 +0000] [9205] [INFO] Booting worker with pid: 9205
[2021-11-19 12:27:31 +0000] [9206] [INFO] Booting worker with pid: 9206
[2021-11-19 12:27:31 +0000] [9207] [INFO] Booting worker with pid: 9207
[2021-11-19 12:27:31 +0000] [9208] [INFO] Booting worker with pid: 9208
[2021-11-19 12:27:31 +0000] [9209] [INFO] Booting worker with pid: 9209
[2021-11-19 12:27:31 +0000] [9210] [INFO] Booting worker with pid: 9210
[2021-11-19 12:27:31 +0000] [9211] [INFO] Booting worker with pid: 9211
[2021-11-19 12:27:31 +0000] [9212] [INFO] Booting worker with pid: 9212
[2021-11-19 12:27:31 +0000] [9213] [INFO] Booting worker with pid: 9213
[2021-11-19 12:27:31 +0000] [9214] [INFO] Booting worker with pid: 9214
[2021-11-19 12:27:31 +0000] [9215] [INFO] Booting worker with pid: 9215
[2021-11-19 12:27:31 +0000] [9216] [INFO] Booting worker with pid: 9216
[2021-11-19 12:27:31 +0000] [9217] [INFO] Booting worker with pid: 9217
[2021-11-19 12:27:31 +0000] [9218] [INFO] Booting worker with pid: 9218
[2021-11-19 12:27:32 +0000] [9219] [INFO] Booting worker with pid: 9219
[2021-11-19 12:27:32 +0000] [9220] [INFO] Booting worker with pid: 9220
[2021-11-19 12:27:32 +0000] [9221] [INFO] Booting worker with pid: 9221
[2021-11-19 12:27:32 +0000] [9222] [INFO] Booting worker with pid: 9222
[2021-11-19 12:27:32 +0000] [9223] [INFO] Booting worker with pid: 9223
[2021-11-19 12:27:32 +0000] [9224] [INFO] Booting worker with pid: 9224
[2021-11-19 12:27:32 +0000] [9225] [INFO] Booting worker with pid: 9225

My guess is that one worker would be enough, rather than the 33 created based on my multiprocessing.cpu_count() being 16. The large number of workers seems to come from this copy-pasted gunicorn example:

https://github.com/materialsproject/fireworks/blob/8ff239a8d5dd67c8c5ccf78368aca8a9714165ae/fireworks/flask_site/gunicorn.py#L1-L20

Copy-paste source from wayback machine:

https://web.archive.org/web/20190831205610/http://docs.gunicorn.org:80/en/19.6.0/custom.html

mkhorton commented 2 years ago

(2 x $num_cores) + 1 is actually the recommendation in the gunicorn docs under the logic "one worker will be reading or writing from the socket while the other worker is processing a request."
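To make the arithmetic concrete, here is a minimal sketch of that formula. The function name `recommended_workers` is made up for illustration and is not part of FireWorks or gunicorn; only `multiprocessing.cpu_count()` is a real standard-library call.

```python
import multiprocessing


def recommended_workers(num_cores=None):
    """Return (2 x num_cores) + 1, the worker count discussed above.

    `num_cores` defaults to the CPU count reported by the machine.
    This helper is hypothetical and only illustrates the formula.
    """
    if num_cores is None:
        num_cores = multiprocessing.cpu_count()
    return 2 * num_cores + 1


# With the 16 logical CPUs reported in the issue, this gives 33 workers,
# which matches the 33 "Booting worker" lines in the log.
print(recommended_workers(16))
```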

I can't speak for this code specifically, but in my experience num workers > num CPUs can be helpful if your workers are IO-bound (e.g., see gevent workers for this too).

For FireWorks specifically, since there is typically only going to be a single person accessing the UI, I agree this is likely more workers than necessary, but people have had multi-user deployments before.

janosh commented 2 years ago

> For FireWorks specifically, since there is typically only going to be a single person accessing the UI, I agree this is likely more workers than necessary, but people have had multi-user deployments before.

Seems like in that case the ideal setup would be 1-2 workers by default, and people with a multi-user setup can pass a flag to increase the number of workers.
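A rough sketch of what that could look like, assuming an argparse-based entry point. The `--workers` flag, the `webgui-sketch` program name, and the `gunicorn_options` helper are all hypothetical and are not existing `lpad` options; `bind` and `workers` are real gunicorn setting names, and the address is the one from the log above.

```python
import argparse


def build_parser():
    # Hypothetical parser: "--workers" is an illustrative flag name,
    # not an option that the real lpad CLI is known to provide.
    parser = argparse.ArgumentParser(prog="webgui-sketch")
    parser.add_argument(
        "--workers",
        type=int,
        default=1,
        help="number of gunicorn worker processes (default: 1)",
    )
    return parser


def gunicorn_options(argv):
    """Translate command-line arguments into a gunicorn options dict."""
    args = build_parser().parse_args(argv)
    if args.workers < 1:
        raise ValueError("--workers must be at least 1")
    # "bind" and "workers" are gunicorn settings; the address is the
    # one shown in the startup log from the issue.
    return {"bind": "127.0.0.1:5000", "workers": args.workers}


# Single-user default versus an explicit multi-user override.
print(gunicorn_options([]))
print(gunicorn_options(["--workers", "4"]))
```

The resulting dict could then be handed to whatever gunicorn application wrapper the web GUI uses, so a single-user launch stays light while multi-user deployments opt in to more workers.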

mkhorton commented 2 years ago

Yeah, the main problem with excess workers is typically the memory requirements, and I'm not sure off-hand if that is onerous with the FireWorks GUI or not. If it's not onerous, having the gunicorn-recommended num workers is probably fine.

mkhorton commented 2 years ago

I should clarify that I don't maintain the FireWorks package, I'm just loitering here :)

janosh commented 2 years ago

I only have 16 GB of memory on my machine so I always appreciate when apps use resources judiciously. It's also too verbose atm having the terminal fill up with process IDs every time I launch the GUI. Usability would also benefit IMO.