Ticket: Provide Interface to Request and Release LLMs for Prompting
User Story
As a user, I want an interface (e.g., an HTTP API) to request a specific LLM and send prompts to it.
Description
The LLM Registry must provide an interface that allows the Prompting Service to:
Request an LLM:
The Registry assigns an LLM Wrapper hosting the requested LLM and reserves it for the Prompting Service.
If no Wrapper currently hosts the requested LLM, the Registry deploys it on an available Wrapper; if no Wrapper is available, the request waits until resources free up.
The interface returns a URL where the Prompting Service can retrieve details about the LLM or check its deployment status.
Monitor Deployment Status:
The URL provided by the Registry allows the Prompting Service to query the LLM status (e.g., "Deploying LLM" or "Waiting for available machine") until the LLM is ready.
Release the LLM:
The Prompting Service explicitly releases the LLM when it is no longer needed, making it available for other users.
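The three operations above can be sketched as a small in-memory model. This is only an illustration under assumptions, not the Registry's actual design: the class names `Wrapper` and `Registry`, the method names, the `/requests/<id>` path, and the "Ready" status string are all hypothetical (the ticket names only "Deploying LLM" and "Waiting for available machine").

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Wrapper:
    name: str
    llm: Optional[str] = None          # model currently hosted, if any
    assigned_to: Optional[str] = None  # id of the request holding this Wrapper


@dataclass
class Registry:
    wrappers: List[Wrapper]
    entries: Dict[str, dict] = field(default_factory=dict)

    def request_llm(self, llm: str) -> str:
        """Assign or deploy the LLM and return a (hypothetical) status URL path."""
        request_id = uuid.uuid4().hex
        free = [w for w in self.wrappers if w.assigned_to is None]
        # Prefer a free Wrapper that already hosts the requested model.
        hosting = next((w for w in free if w.llm == llm), None)
        if hosting is not None:
            wrapper, status = hosting, "Ready"  # "Ready" is an assumed name
        elif free:
            wrapper, status = free[0], "Deploying LLM"
            wrapper.llm = llm
        else:
            wrapper, status = None, "Waiting for available machine"
        if wrapper is not None:
            wrapper.assigned_to = request_id
        self.entries[request_id] = {"llm": llm, "status": status, "wrapper": wrapper}
        return f"/requests/{request_id}"

    def status(self, url: str) -> str:
        """Return the current status string for a status URL."""
        return self.entries[url.rsplit("/", 1)[-1]]["status"]

    def release(self, url: str) -> None:
        """Free the Wrapper so other users can request it."""
        entry = self.entries.pop(url.rsplit("/", 1)[-1])
        if entry["wrapper"] is not None:
            entry["wrapper"].assigned_to = None
```

The sketch deliberately leaves out promoting waiting requests when a Wrapper is released and the transition from "Deploying LLM" to ready; both would need a background process in a real Registry.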
Workflow
The Prompting Service sends a request to the Registry for a specific LLM.
The Registry identifies a Wrapper that already hosts the requested LLM or, if none is available, deploys the LLM on an available Wrapper.
The Registry provides a temporary URL to the Prompting Service:
This URL gives information about the LLM or its deployment status.
The URL remains valid until the Prompting Service retrieves the LLM details; status queries made while the deployment is still in progress do not invalidate it.
Once the Prompting Service retrieves the details, the LLM is marked as reserved.
The Prompting Service uses the LLM and releases it via the Registry once done.
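On the Prompting Service side, the waiting step of this workflow amounts to a polling loop. A minimal sketch follows, assuming the status is fetched through a caller-supplied function (for example a thin wrapper around an HTTP GET on the status URL) and that the final status is the string "Ready"; the function name, its parameters, and that status string are hypothetical.

```python
import time
from typing import Callable, List


def wait_until_ready(
    fetch_status: Callable[[], str],
    ready_status: str = "Ready",   # assumed name of the final status
    interval: float = 1.0,         # seconds between polls
    timeout: float = 300.0,        # give up after this many seconds
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the status equals ready_status; return the distinct statuses seen in order."""
    seen: List[str] = []
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # Record a status only when it changes, to keep the history short.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == ready_status:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"LLM not ready after {timeout} s (last status: {status})")
        sleep(interval)
```

Passing `fetch_status` and `sleep` as parameters keeps the loop independent of any HTTP library and makes it easy to test with canned status sequences.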
Acceptance Criteria
[ ] The interface allows the Prompting Service to request an LLM and receive a URL with LLM details or deployment status.
[ ] The interface ensures the LLM is marked as reserved once the Prompting Service retrieves the LLM details from the URL.
[ ] The interface allows the Prompting Service to release the reserved LLM, making it available for other users.
[ ] The Registry deploys an LLM if the requested LLM is unavailable.
[ ] The interface provides clear and accurate status updates during the deployment process (e.g., "Deploying LLM," "Waiting for available machine").
[ ] The URL provided to the Prompting Service is valid only until the Prompting Service retrieves the LLM details.
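One possible reading of the reservation and URL-validity criteria is sketched below: status queries are repeatable while the LLM is not ready, and the first retrieval after it becomes ready hands out the details, marks the LLM as reserved, and expires the URL. Everything here is an assumption made for illustration: the `StatusUrls` class, the `UrlExpired` exception, the token format, the "Ready" status, and the `endpoint` field.

```python
import secrets
from typing import Dict


class UrlExpired(Exception):
    """Raised for unknown or already consumed status URLs."""


class StatusUrls:
    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def issue(self, llm: str, status: str) -> str:
        """Create a temporary URL token for a new LLM request."""
        token = secrets.token_urlsafe(16)
        self._entries[token] = {
            "llm": llm, "status": status, "endpoint": None, "reserved": False,
        }
        return token

    def mark_ready(self, token: str, endpoint: str) -> None:
        """Called by the Registry once deployment has finished."""
        entry = self._entries[token]
        entry["status"], entry["endpoint"] = "Ready", endpoint

    def is_reserved(self, token: str) -> bool:
        return self._entries[token]["reserved"]

    def get(self, token: str) -> dict:
        """Return status or details; the first read of ready details expires the URL."""
        entry = self._entries.get(token)
        if entry is None or entry["reserved"]:
            raise UrlExpired(token)
        if entry["status"] == "Ready":
            entry["reserved"] = True  # details handed out: reserve and expire
        return {k: entry[k] for k in ("llm", "status", "endpoint")}
```

In an HTTP API, `UrlExpired` would typically map to a 404 or 410 response; which one to use is a design decision this ticket leaves open.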
Test Criteria
[ ] Functional Tests:
[ ] Test LLM request functionality and verify the correct URL is returned.
[ ] Simulate scenarios where the requested LLM is not available and ensure the Registry initiates deployment.
[ ] Test status updates during deployment and verify they are accessible via the provided URL.
[ ] Verify the LLM is marked as reserved once details are retrieved.
[ ] Test that the LLM can be released and becomes available for other users.
[ ] Error Handling:
[ ] Test scenarios where no Wrappers are available and ensure appropriate error messages or status updates are returned.
[ ] Verify that expired or invalid URLs are handled gracefully.
[ ] Performance Tests:
[ ] Measure the time taken to process LLM requests under normal and high-load conditions.
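For the performance tests, a small harness along the following lines could collect request latencies at a chosen concurrency level. It is a sketch under assumptions: `send_request` stands for whatever call issues one LLM request to the Registry, and the function name and the reported metrics are hypothetical choices, not requirements from this ticket.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def measure_latencies(send_request: Callable[[], object], total: int, concurrency: int) -> dict:
    """Issue `total` requests (total >= 1) with `concurrency` workers and summarize latency."""

    def timed(_: int) -> float:
        start = time.perf_counter()
        send_request()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        samples = sorted(pool.map(timed, range(total)))

    # Nearest-rank 95th percentile on the sorted samples.
    p95_index = math.ceil(0.95 * len(samples)) - 1
    return {
        "count": len(samples),
        "mean_s": statistics.mean(samples),
        "p95_s": samples[p95_index],
        "max_s": samples[-1],
    }
```

Running it once with a low `concurrency` value and once with a high one gives comparable figures for the normal and high-load conditions named above.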