mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[BUG] MLFlow Deployments Server for LLMs Not Supporting Chat for Bedrock #12615

Open awolf0 opened 1 month ago

awolf0 commented 1 month ago

Issues Policy acknowledgement

Where did you encounter this bug?

Other

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

System information

Describe the problem

I have set up two endpoints for Anthropic Claude 3 Sonnet on Bedrock: one for completions and one for chat. Sending a request to the completions endpoint returns the error '"claude-3-sonnet-20240229" is not supported on this API. Please use the Messages API instead.' Sending a request to the chat endpoint returns 'The chat route is not implemented for Amazon Bedrock models.'

Tracking information


Code to reproduce issue

configuration:

endpoints:
  - name: eu-central-1.anthropic.claude-3-sonnet-completions
    endpoint_type: llm/v1/completions
    model:
      provider: bedrock
      name: anthropic.claude-3-sonnet-20240229-v1:0
      config:
        aws_config:
          aws_region: eu-central-1
          aws_role_arn: arn:aws:iam::XXXX:role/BedrockAccessRole

  - name: eu-central-1.anthropic.claude-3-sonnet
    endpoint_type: llm/v1/chat
    model:
      provider: bedrock
      name: anthropic.claude-3-sonnet-20240229-v1:0
      config:
        aws_config:
          aws_region: eu-central-1
          aws_role_arn: arn:aws:iam::XXXX:role/BedrockAccessRole
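
For completeness, here is a minimal request sketch that reproduces both errors, assuming the deployments server has been started from the configuration above (for example with mlflow deployments start-server --config-path config.yaml --port 7000); the server URL, prompt, and token limit are placeholders:

from mlflow.deployments import get_deploy_client

# Assumed server address; adjust to wherever the deployments server is running.
client = get_deploy_client("http://localhost:7000")

# Completions endpoint -> Bedrock rejects the request with the ValidationException below.
client.predict(
    endpoint="eu-central-1.anthropic.claude-3-sonnet-completions",
    inputs={"prompt": "Hello", "max_tokens": 100},
)

# Chat endpoint -> the server itself answers with
# {"detail": "The chat route is not implemented for Amazon Bedrock models."}
client.predict(
    endpoint="eu-central-1.anthropic.claude-3-sonnet",
    inputs={"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100},
)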

Stack trace

Traceback (most recent call last):
  File "/usr/src/app/venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/instana/instrumentation/asgi.py", line 106, in __call__
    raise exc
  File "/usr/src/app/venv/lib/python3.11/site-packages/instana/instrumentation/asgi.py", line 103, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/src/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/mlflow/deployments/server/app.py", line 104, in _completions
    return await prov.completions(payload)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/mlflow/gateway/providers/bedrock.py", line 286, in completions
    response = self._request(payload)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/mlflow/gateway/providers/bedrock.py", line 265, in _request
    response = self.get_bedrock_client().invoke_model(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/instana/instrumentation/boto3_inst.py", line 71, in make_api_call_with_instana
    result = wrapped(*arg_list, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/venv/lib/python3.11/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: "claude-3-sonnet-20240229" is not supported on this API. Please use the Messages API instead.

The response for the request to the chat endpoint:

{
  "detail": "The chat route is not implemented for Amazon Bedrock models."
}
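
Some additional context: the ValidationException in the stack trace above appears to be the error Bedrock returns when a Claude 3 model is invoked through the legacy Anthropic Text Completions API, which seems to be the request body the gateway's Bedrock provider currently builds. Claude 3 models on Bedrock only accept the Anthropic Messages API format, so a chat implementation would presumably need to send a body along these lines (a boto3 sketch for illustration only; the region and model ID come from the configuration above, while the prompt and token limit are placeholders):

import json
import boto3

# Illustration: the Anthropic Messages API body that Claude 3 models expect on Bedrock,
# as opposed to the legacy Text Completions body ("prompt": "\n\nHuman: ...").
client = boto3.client("bedrock-runtime", region_name="eu-central-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}],
}

response = client.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps(body),
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read()))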

Other info / logs


What component(s) does this bug affect?

What interface(s) does this bug affect?

What language(s) does this bug affect?

What integration(s) does this bug affect?

WeichenXu123 commented 1 month ago

Thanks! Could you file a PR and attach the testing results? I don't have a Bedrock account token to test it.

WeichenXu123 commented 1 month ago

CC @andrew-christianson Would you help check this? Thanks!

Isydmr commented 1 month ago

@andrew-christianson, I am encountering the same issue.

Completions endpoints are being deprecated (see Anthropic, OpenAI).

Could you please add support for the chat endpoint for Bedrock?

github-actions[bot] commented 1 month ago

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

awolf0 commented 1 month ago

Hi guys, any updates on this issue or comments on the PR? This is my first contribution to mlflow, so please treat it as such, and I would appreciate any guidance :-)

elisevansbbfc commented 4 weeks ago

I am getting the same sort of issue when using claude-3-5-sonnet-20240620. When I try to create a new run with this endpoint, I get:

MLflow deployment returned the following error: "Deployments proxy request failed with error code 400. Error message: {"detail":"\"claude-3-5-sonnet-20240620\" is not supported on this API. Please use the Messages API instead."}"

Is there any workaround in the meantime so that I can still use Claude 3.5 Sonnet with MLflow?