Open cdoern opened 1 month ago
As noted in the issue description, I would like to work on some version of this, and I already have some code locally that I might turn into a draft PR. If possible, please add me as the assignee!
In addition to the get, list, register and unregister services, how about adding a (device) discovery service to guess the best configuration?
🚀 Describe the new functionality needed
Configuration API
adding providers outside of the current scope will likely necessitate the following:
I imagine this functionality working similarly to Models or Inspect, where these are a high level API. Additionally, these objects should be applicable for other providers to "register" one of them.
Configurations, similarly to Models, should operate as an "overarching" API that one can use to register, list, get, and unregister a configuration.
Usage pattern:
llama stack build && llama stack run (administrator starts stack)
a user could run:
llama-stack-client configurations inspect
llama-stack-client configurations register --config <file_path>
or using the SDK:
the configuration API would look something like:
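As a minimal sketch of the register, list, get, and unregister operations described above, assuming a simple in-memory store: the names Configuration and ConfigurationStore and their fields are hypothetical, chosen for illustration, and are not part of llama-stack.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Configuration:
    """Hypothetical configuration record; the field names are assumptions."""
    identifier: str
    provider_id: str
    config: Dict[str, Any] = field(default_factory=dict)


class ConfigurationStore:
    """In-memory stand-in for a server-side Configurations API."""

    def __init__(self) -> None:
        self._items: Dict[str, Configuration] = {}

    def register(self, configuration: Configuration) -> Configuration:
        # Registering an existing identifier overwrites the previous entry.
        self._items[configuration.identifier] = configuration
        return configuration

    def get(self, identifier: str) -> Configuration:
        # Raises KeyError if the identifier was never registered.
        return self._items[identifier]

    def list(self) -> List[Configuration]:
        # Returned in registration order (dicts preserve insertion order).
        return [c for c in self._items.values()]

    def unregister(self, identifier: str) -> None:
        # Removing an unknown identifier is treated as a no-op.
        self._items.pop(identifier, None)
```

A real implementation would presumably persist registrations and validate them against the provider's config class; this sketch only shows the shape of the four operations.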
With the inspect API expanded to have a /configurations endpoint:
UserConfig vs StackRunConfig
A key part of this API is the set of fields exposed in both inspection and registration. A Configuration object contains a StackRunConfig within it. However, the data within this config is a UserConfig. A UserConfig is a StackRunConfig, but with only specific fields displayed to the user. Since each provider has its own config class that feeds into the StackRunConfig, the following can be used to label certain fields as "User Configurable":
url: str = Field(DEFAULT_OLLAMA_URL, json_schema_extra={"user_field": True})
The pydantic json_schema_extra field can then be used when creating a Configuration object to create an intermediary UserConfig. The UserConfig will only have fields labeled as user_field, meaning that if a user tries to register a configuration with non-user fields specified, they will be dropped, and an inspected configuration will likewise contain only user fields for viewing. In the above example, url is the only field given the user_field schema, which is why it is one of the few things showing up.
Server Side Device Discovery for Initial Configuration
Before a user can inspect or register a config of their own, it would make sense to allow providers to utilize a centralized hardware discovery service built into llama-stack. Providers could then act on this information inside of their configuration initialization methods to apply certain defaults depending on the hardware discovered as opposed to a blanket set of defaults.
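One way this could look, as a sketch only: a small discovery helper reports what it can see, and a provider derives its defaults from that report. DeviceInfo, discover_devices, and default_provider_settings are hypothetical names, the chosen defaults are arbitrary, and checking for nvidia-smi on the PATH is a crude stand-in for real accelerator detection.

```python
import os
import platform
import shutil
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical summary of the hardware the stack is running on."""
    cpu_count: int
    system: str
    has_nvidia_tooling: bool


def discover_devices() -> DeviceInfo:
    """Gather a coarse hardware description using only the standard library."""
    return DeviceInfo(
        cpu_count=os.cpu_count() or 1,
        system=platform.system(),
        # Assumption: nvidia-smi on the PATH hints at an NVIDIA GPU being present.
        has_nvidia_tooling=shutil.which("nvidia-smi") is not None,
    )


def default_provider_settings(info: DeviceInfo) -> Dict[str, Any]:
    """Pick provider defaults from discovered hardware instead of one blanket set."""
    if info.has_nvidia_tooling:
        return {"device": "cuda", "num_workers": min(info.cpu_count, 8)}
    return {"device": "cpu", "num_workers": min(info.cpu_count, 2)}
```

A provider's configuration initialization could call default_provider_settings(discover_devices()) and let registered user fields override the result. Distinguishing specific accelerators would need a richer probe than this sketch performs.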
💡 Why is this needed? What if we don't build it?
Without a system like the above, it will be difficult to orchestrate a sequence of providers intended to "work together", or even to make a single complex provider easily accessible to users. Additionally, the more complex the APIs and providers that are introduced, the greater the odds that runtime manipulation of key configuration fields will be necessary.
For example, say someone provides a data generation, training, and evaluation methodology as separate providers. Each of these depends on specific hardware requirements, hyperparameters, etc. to interact with the others, and these parameters change per hardware (H100 vs A100 vs L40).
Exposing the current provider configuration to a user will help them understand what they will be running for various providers as functionality gets more complex (SDG, Evals, Training, etc). Additionally, allowing a user to apply parts of a config on top of a running stack as opposed to taking the stack down and having the admin apply a full run config again seems like a more sustainable workflow.
Other thoughts
I would like to work on this in collaboration with anyone if possible!