The idea is to leverage AI services to provide a declarative interface like REST Client.
You would describe the interaction point with the LLM using an interface and annotations.
For example:
```java
@RegisterAiService(
    name = ...,            // Optional - overrides the config key; can also be named `configKey`
    chatModel = ...,       // String - the chat model (or streaming chat model) identified by name; if not set, use the default one (validated and set at build time)
    tools = ...,           // List<String> - the list of bean identifiers providing tools (validated at build time); if not set, all tools are available
    chatMemory = ...,      // String - the bean identifier for the chat memory; if not set, use the default one
    moderationModel = ..., // String - the bean identifier for the moderation model; if not set, use the default one (no moderation)
    retriever = ...        // String - the bean identifier of the RAG retriever
)
public interface MyAiService {
    // ...
}
```
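For illustration, the methods of such an interface could carry prompt annotations, in the style of LangChain4j's AI Services. A minimal sketch (the `PoemService` name, the `writePoem` method, and its prompt text are hypothetical; `@SystemMessage`/`@UserMessage` are the LangChain4j annotations):

```java
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

@RegisterAiService
public interface PoemService {

    // The system message sets the model's role; the user message is the
    // prompt template, with {topic} bound to the method parameter.
    @SystemMessage("You are a professional poet.")
    @UserMessage("Write a short poem about {topic}.")
    String writePoem(String topic);
}
```

At runtime, calling `writePoem("Quarkus")` would send the rendered prompt to the configured chat model and return its answer.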
All the attributes are optional, meaning that the following snippet would use sensible defaults:
```java
@RegisterAiService
public interface MyAiService {
    // ...
}
```
All the configuration can be set in `application.properties` using the `quarkus.aiservices.$name.attr=value` syntax (the prefix is not decided yet).
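Assuming the proposed prefix is kept, the configuration could look like the following sketch (the service name, attribute names, and values are all illustrative, not finalized):

```properties
# Illustrative only - the prefix and attribute names are not finalized
quarkus.aiservices.my-service.chat-model=default
quarkus.aiservices.my-service.tools=calculator,weather
quarkus.aiservices.my-service.chat-memory=my-memory-bean
```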
While the skeleton can be generated at build time, it will not be possible to initialize everything at build time, as the RAG retriever may connect to the store (preloading an in-memory store at build time might be interesting, but in the general case it would not work).
It should be possible to use fault-tolerance annotations (timeout, retry, or even circuit breaker...) on the AiService methods.
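With the MicroProfile Fault Tolerance annotations, this could look like the following sketch (the interface name, method, and threshold values are illustrative):

```java
import java.time.temporal.ChronoUnit;

import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;

@RegisterAiService
public interface ResilientAiService {

    // Abort if the model does not answer within 10 seconds,
    // retry up to 2 times, and open the circuit after repeated failures.
    @Timeout(value = 10, unit = ChronoUnit.SECONDS)
    @Retry(maxRetries = 2)
    @CircuitBreaker(requestVolumeThreshold = 4)
    String chat(String message);
}
```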
If OTel is available, each method would be timed and counted automatically. The outcome would also be monitored.
If an audit service is available (See #12), each method invocation will be audited.
Other extensions:
Tools category - in a bean providing tools, we may need security features (authentication) or a way to identify the tool category. Specific processing (such as authentication) can be applied before calling the tool. That also means we need tool interceptors; we can reuse CDI interceptors. However, we may need access to the context (both the conversational context and the duplicated context).
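A sketch of what a tool bean with a CDI interceptor binding could look like (the `@Authenticated` binding, the `CustomerTools` bean, and its method are all hypothetical; `@Tool` is the LangChain4j annotation):

```java
import dev.langchain4j.agent.tool.Tool;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class CustomerTools {

    // A hypothetical CDI interceptor binding that would enforce
    // authentication before the tool is invoked:
    @Authenticated
    @Tool("Fetches the customer record for the given id")
    public String getCustomer(String id) {
        // look up and return the customer record
        return "customer-" + id;
    }
}
```

The interceptor bound to `@Authenticated` would run before the tool method, which is where access to the conversational and duplicated contexts becomes necessary.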