This issue extends and replaces lablup/backend.ai-agent#82.
What per-image metadata includes:
feature sets, service ports, etc.
supported accelerators
minimum required resource limits: read from image labels
maximum allowed resource limits: per-scaling group configuration + per-user resource policy
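A minimal sketch of how the minimum and maximum limits above might be combined when validating a request. The label keys, the `ResourceSlots` type, and the rule that the stricter of the scaling-group and user ceilings wins are all assumptions made for illustration; this issue does not define them.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ResourceSlots:
    cpu: float    # number of CPU cores
    mem_mib: int  # memory in MiB


def min_limits_from_labels(labels: Mapping[str, str]) -> ResourceSlots:
    # Hypothetical label keys; the real label schema is not defined here.
    return ResourceSlots(
        cpu=float(labels["example.resource.min.cpu"]),
        mem_mib=int(labels["example.resource.min.mem_mib"]),
    )


def max_limits(group: ResourceSlots, user: ResourceSlots) -> ResourceSlots:
    # Assumption: for each slot, the stricter (smaller) ceiling applies.
    return ResourceSlots(
        cpu=min(group.cpu, user.cpu),
        mem_mib=min(group.mem_mib, user.mem_mib),
    )


def is_allowed(requested: ResourceSlots,
               minimum: ResourceSlots,
               maximum: ResourceSlots) -> bool:
    # A request is valid only if every slot lies within [minimum, maximum].
    return (
        minimum.cpu <= requested.cpu <= maximum.cpu
        and minimum.mem_mib <= requested.mem_mib <= maximum.mem_mib
    )
```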
When a kernel creation request names an image that has no registered resource configuration, the manager (and the client) should explicitly show the user a guide message about registering an appropriate resource configuration.
Alternatively, it should fall back to minimal resource limits with a warning message.
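The two options for unregistered images could look roughly like the sketch below. `resolve_limits`, `UnregisteredImageError`, and the fallback values are hypothetical names and numbers chosen for illustration, not existing manager code.

```python
import warnings
from typing import Dict, Mapping

# Hypothetical fallback values; the issue does not specify actual numbers.
FALLBACK_LIMITS = {"cpu": 1, "mem_mib": 512}


class UnregisteredImageError(Exception):
    """Raised when an image has no registered resource configuration."""


def resolve_limits(image: str,
                   registered: Mapping[str, Mapping[str, int]],
                   allow_fallback: bool) -> Dict[str, int]:
    limits = registered.get(image)
    if limits is not None:
        return dict(limits)
    guide = (
        f"Image {image!r} has no registered resource configuration; "
        "register its resource limits before creating kernels from it."
    )
    if not allow_fallback:
        # Option 1: refuse the request and show the guide message.
        raise UnregisteredImageError(guide)
    # Option 2: continue with minimal limits and warn the user.
    warnings.warn(guide + " Falling back to minimal resource limits.")
    return dict(FALLBACK_LIMITS)
```

Whether the fallback is enabled would presumably be a deployment-level setting, since silently running with minimal limits may surprise users of heavy images.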
We need to update all kernels. (lablup/backend.ai-kernels#98)
We need to extend the images API. (#119, #94)
Behavior of per-image metadata
The registry is the only authoritative source of per-image metadata.
If the hash digests of the pulled image and the registry image are different, agents should always pull the registry image. (related: #126)
When we provide image dumps for off-line setups, they also need to be pushed to a private registry to become available.
The manager may cache the image metadata by periodically scanning the registry.
Scanning may be triggered by the user manually, via a configuration API. (related: #118)
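One way to picture the cache is sketched below, assuming a hypothetical `scan_registry` callable that returns a mapping from image reference to metadata. For brevity it refreshes lazily when the cached data is older than the interval instead of running a background timer; `rescan()` stands in for the manual trigger.

```python
import time
from typing import Callable, Dict, Optional


class ImageMetadataCache:
    def __init__(self,
                 scan_registry: Callable[[], Dict[str, dict]],
                 interval_sec: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._scan = scan_registry      # hypothetical registry scanner
        self._interval = interval_sec   # maximum age of cached metadata
        self._clock = clock
        self._data: Dict[str, dict] = {}
        self._last_scan: Optional[float] = None

    def rescan(self) -> None:
        """Manual trigger: rescan the registry right now."""
        self._data = dict(self._scan())
        self._last_scan = self._clock()

    def get(self, image: str) -> Optional[dict]:
        # Refresh when nothing is cached yet or the cache is stale.
        stale = (
            self._last_scan is None
            or self._clock() - self._last_scan >= self._interval
        )
        if stale:
            self.rescan()
        return self._data.get(image)
```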
Agents report their image availability via heartbeats, including hash digests.
The manager determines whether an agent needs to pull the image or not by comparing this information with its image metadata.
When the manager commands an agent to pull a new image, it should pass the authentication tokens to the agent as well.
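The heartbeat comparison and pull command described above might be sketched as follows. The heartbeat shape (agent ID mapped to image reference mapped to digest), the `PullCommand` type, and the function name are assumptions for illustration only.

```python
from typing import List, Mapping, NamedTuple


class PullCommand(NamedTuple):
    agent_id: str
    image: str
    digest: str      # digest the agent should end up with
    auth_token: str  # registry credential passed along with the command


def plan_pulls(image: str,
               registry_digest: str,
               heartbeats: Mapping[str, Mapping[str, str]],
               auth_token: str) -> List[PullCommand]:
    """Return pull commands for agents whose copy is missing or outdated."""
    commands = []
    for agent_id in sorted(heartbeats):
        local_digest = heartbeats[agent_id].get(image)
        # A missing image and a mismatched digest are treated the same way:
        # the registry copy is authoritative, so the agent must pull it.
        if local_digest != registry_digest:
            commands.append(
                PullCommand(agent_id, image, registry_digest, auth_token)
            )
    return commands
```

For example, with one up-to-date agent, one stale agent, and one agent that lacks the image, only the latter two receive a command.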
References