Shared-Reality-Lab / IMAGE-server

IMAGE project server components

Create various memory/GPU/CPU docker configs for different server scenarios #333

Open jeffbl opened 2 years ago

jeffbl commented 2 years ago

Normally, for production on pegasus, we should assume we have at least 18GB of RAM and most of the CPU cores. However, I'm proposing we create three different profiles we can switch between (e.g., we might run LOWGPU on unicorn, but HIGHQUALITY on pegasus):

HIGHQUALITY: Assuming we have all of a high-end machine, focused on highest quality results for a small number of queries at a time. This is what we should run for CSUN, for example, on the assumption that we may get considerable use, but probably not a huge number of simultaneous queries within any 5s window. Target max time for a single query: 5s

MULTIUSER: Assuming we are getting multiple queries simultaneously. Reduce result quality as necessary to support n simultaneous queries without an OOM condition. n=5 is probably a good place to start (with a 5s target per query, that averages 1 query/s). We may have to switch to this during CSUN if usage is higher than expected. Target max time for a single query: 5s

LOWGPU: Minimize or completely eliminate GPU use. Can be used for debugging on a local machine, or on a low-end server. Query times may get long, and quality may be significantly reduced. Target max time for a single query: 10s
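
To make the three profiles concrete, here is a rough Python sketch of how they could be written down as data and selected at startup. It is only an illustration, not how IMAGE-server is actually configured: the `Profile` and `GpuUse` types, the `IMAGE_PROFILE` environment variable, and the HIGHQUALITY default are all hypothetical. The simultaneous-query counts for HIGHQUALITY and LOWGPU are also assumptions, since only MULTIUSER states n=5 above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class GpuUse(Enum):
    """Hypothetical GPU usage levels; not taken from the IMAGE codebase."""
    FULL = "full"        # assume the whole GPU is available
    SHARED = "shared"    # GPU shared across simultaneous queries
    MINIMAL = "minimal"  # little or no GPU use


@dataclass(frozen=True)
class Profile:
    name: str
    gpu: GpuUse
    simultaneous_queries: int  # n the profile is sized for
    target_max_seconds: float  # target max time for a single query

    @property
    def sustained_queries_per_second(self) -> float:
        # n queries in flight, each finishing within the target time
        return self.simultaneous_queries / self.target_max_seconds


PROFILES = {
    # n=1 for HIGHQUALITY and LOWGPU is an assumption; the issue only
    # gives n=5 for MULTIUSER.
    "HIGHQUALITY": Profile("HIGHQUALITY", GpuUse.FULL, 1, 5.0),
    "MULTIUSER": Profile("MULTIUSER", GpuUse.SHARED, 5, 5.0),
    "LOWGPU": Profile("LOWGPU", GpuUse.MINIMAL, 1, 10.0),
}


def select_profile(env: Mapping[str, str]) -> Profile:
    """Pick a profile from a (hypothetical) IMAGE_PROFILE variable."""
    name = env.get("IMAGE_PROFILE", "HIGHQUALITY").upper()
    if name not in PROFILES:
        raise ValueError(f"unknown profile: {name}")
    return PROFILES[name]
```

On a real host this might be called as `select_profile(os.environ)`, so that unicorn and pegasus could each set their own value. The `sustained_queries_per_second` property just restates the arithmetic above: 5 simultaneous queries with a 5s target give 1 query/s.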

Levers we can adjust:

Future extensions:

jeffbl commented 2 years ago

Assigning to @rianadutta, since this is the focus of her SURE internship.