Open achimnol opened 1 year ago
The current implementation takes either the user-specified shmem value or the minimum shmem required by the image, adds it to the minimum mem required by the image, and compares the sum against the user-specified mem value. In other words, it compares ($S$ or `ai.backend.resource.min.shmem`) + `ai.backend.resource.min.mem` to $M$. Are we going to change this behavior to check `ai.backend.resource.min.mem` < $M + S$? And how should we handle the case of the max resource slot? Should it be `ai.backend.resource.max.mem` > $M + S$?
This is a late follow-up to lablup/backend.ai-webui#314.
Currently, the image label `ai.backend.resource.min.mem` is interpreted as the main memory size, excluding the shared memory size. However, the web UI's resource configuration automatically sets the shared memory size ($S$) to 64 MiB when the main memory size ($M$) is less than 4 GiB, while our scheduler allocates the sum of the two ($M+S$).

This causes confusion when allocating the smallest amount of memory, e.g. $M+S = 256\ \mathrm{MiB}$: the request fails because the web UI sends $M = 192\ \mathrm{MiB}$ and $S = 64\ \mathrm{MiB}$, while the manager's enqueue-session API handler compares the image label `ai.backend.resource.min.mem` against $M$ only and requires $M \ge 256\ \mathrm{MiB}$.

We are going to update the web UI to hide the detailed shared memory configuration for most use cases, and the memory resource slider will expose $M+S$, with $S$ auto-configured depending on the value of $M+S$.
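To make the mismatch concrete, here is the arithmetic from the example above, comparing the current handler's check ($M$ only) with what the scheduler actually allocates ($M+S$):

```python
MiB = 2 ** 20
min_mem_label = 256 * MiB   # ai.backend.resource.min.mem
M = 192 * MiB               # main memory size sent by the web UI
S = 64 * MiB                # shared memory size auto-configured by the web UI

# Current handler behavior: the label is compared against M only.
print(M >= min_mem_label)        # False -> the request is rejected

# What the scheduler actually allocates is M + S.
print(M + S >= min_mem_label)    # True -> the total satisfies the label
```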
To better support the above web UI update, let's change the enqueue-session API handler to compare the image label `ai.backend.resource.min.mem` with $M+S$ instead of $M$.

The Client SDK and CLI should still expose the raw configurations as options. So, let's: