dlqqq opened this issue 1 week ago
After some discussion with @ellisonbg, it seems to make more sense to always default to using CRI in the "us" region area when it is available. This removes the need for users to specify the region area and avoids handling the edge cases of a model supporting CRI in some region areas but not others.
This change will allow models available through CRI to be used from any region. I'll update #1113 accordingly.
We received some valuable feedback from other stakeholders and concluded that we can't default to the "us" region area, as doing so may violate data residency requirements such as those imposed by the GDPR in the EU. Furthermore, a single global dropdown for the region area is a poor user experience: not all models support CRI, and models that do support CRI are not necessarily available in all CRI region areas.
Given that this effort will take longer than we originally estimated, and that v3 development shouldn't be delayed any longer, I will move this issue to the v3 milestone for future work.
As a short-term fix, we will recommend that users select the "Bedrock (custom/provisioned)" provider and enter the inference profile ID manually to use CRI. I will open a new issue for this.
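Roughly speaking, entering a CRI inference profile ID as the model ID is equivalent to the following `langchain_aws` call. This is only a sketch: the specific profile ID and region are assumptions for illustration, and the exact IDs available to an account should be taken from the Bedrock console.

```python
# Sketch of the short-term workaround: pass a CRI inference profile ID
# ("<region-area>.<model-id>") directly as the model ID.
# The ID and region below are assumptions for illustration only.
from langchain_aws import ChatBedrock

llm = ChatBedrock(
    model_id="us.meta.llama3-2-11b-instruct-v1:0",  # hypothetical CRI profile ID
    region_name="us-east-1",
)
print(llm.invoke("Hello!").content)
```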
Description
Cross-region inference (CRI) allows requests to be automatically routed across a predefined set of regions, which mitigates restrictions imposed by service quotas or peak usage times.
CRI is also required to use some models on Amazon Bedrock, notably Llama 3.2. A previous attempt at implementing Llama 3.2 support in Amazon Bedrock was stalled due to lack of existing support for CRI: #1014
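For concreteness, the only difference on the API side is the model ID: a CRI request passes an inference profile ID (the model ID prefixed with a region area) instead of a plain model ID. Below is a minimal sketch using the boto3 Converse API, where the specific profile ID and region are assumptions for illustration.

```python
# Minimal sketch: invoking a Bedrock model through a CRI inference profile ID.
# The profile ID and region below are assumptions for illustration only.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="us.meta.llama3-2-11b-instruct-v1:0",  # "<region-area>.<model-id>"
    messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```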
Proposed solution
Jupyter AI needs to provide some user interface for supporting CRI. Tentatively, our proposal is to allow users to specify a region area, one of `us`, `us-gov`, `eu`, or `apac`, and, when one is set, prepend it to the model ID to produce an inference profile ID of the form `<region-area>.<model-id>`. When passed to Bedrock APIs, this allows for CRI and allows for usage of Llama 3.2 models on Amazon Bedrock.
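As a rough sketch of the ID construction described above (the function and constant names here are hypothetical, not part of the current Jupyter AI codebase):

```python
# Hypothetical helper; names are illustrative only.
VALID_REGION_AREAS = {"us", "us-gov", "eu", "apac"}

def build_model_id(model_id: str, region_area: str | None = None) -> str:
    """Return the ID to pass to Bedrock: the plain model ID, or a CRI
    inference profile ID of the form "<region-area>.<model-id>"."""
    if region_area is None:
        return model_id
    if region_area not in VALID_REGION_AREAS:
        raise ValueError(f"Unknown region area: {region_area!r}")
    return f"{region_area}.{model_id}"

# Example: build_model_id("meta.llama3-2-11b-instruct-v1:0", "us")
# returns "us.meta.llama3-2-11b-instruct-v1:0".
```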