Is your feature request related to a problem? Please describe.
As an operations person, I want to be sure that the use of LLM calls to external APIs does not become cost prohibitive.
Describe the solution you'd like
The guidance engine uses an external service provider for LLM inference, for which costs are charged to Alkemio. Whilst individual calls are relatively cheap, the costs could still become prohibitive if usage patterns differ from what is expected, e.g. due to a bug or some type of misuse by a user. The majority of controls should be managed through the server, but it is probably prudent to build in some simple reporting/rate limiting on the guidance engine side, along the lines of the sketch below.
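A minimal sketch of what such a guard could look like, assuming a Python-based engine; the names here (`UsageLimiter`, `max_calls`, `window_seconds`) are hypothetical and not part of the existing codebase:

```python
import time
import logging
from collections import deque

logger = logging.getLogger(__name__)


class UsageLimiter:
    """Sliding-window rate limiter with simple usage reporting.

    Illustrative only; the real integration point in the guidance
    engine may differ.
    """

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.calls: deque[float] = deque()  # timestamps of recent calls
        self.total_calls = 0

    def allow(self) -> bool:
        """Return True if another LLM call is permitted right now."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] > self.window_seconds:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            logger.warning(
                "LLM rate limit hit: %d calls in the last %.0fs",
                len(self.calls), self.window_seconds,
            )
            return False
        self.calls.append(now)
        self.total_calls += 1
        return True

    def report(self) -> dict:
        """Basic counters for periodic cost reporting."""
        return {
            "total_calls": self.total_calls,
            "calls_in_window": len(self.calls),
        }


# Example usage: allow at most 60 external LLM calls per hour.
limiter = UsageLimiter(max_calls=60, window_seconds=3600)
if limiter.allow():
    pass  # invoke the external LLM API
else:
    pass  # return a "please try again later" response instead
```

A sliding window keeps the check cheap and in-memory, which seems proportionate here given that the server is expected to carry the main controls.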
Describe alternatives you've considered
Full control of the API calls to the engine on the server side.
Additional context
Over time, more granular controls may shift to the server side. Also, the use of LLMs is expected to grow, which will create additional/different requirements for cost management.