run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: Inconsistent Behavior with Anthropic Model Names #7361

Closed. noble-varghese closed this issue 1 year ago.

noble-varghese commented 1 year ago

Bug Description

I'm encountering an inconsistency in the Anthropic integration in llama_index. Specifically, passing the model name claude-instant-1.2 raises an error.

Code used:

from llama_index import ServiceContext
from llama_index.llms import Anthropic

# Versioned model name that the integration rejects
llm = Anthropic(model="claude-instant-1.2")

service_context = ServiceContext.from_defaults(
    llm=llm
)

Error

Unknown model: claude-instant-1.2. Please provide a valid Anthropic model name. Known models are: claude-instant-1, claude-2.

However, according to the documentation, both claude-instant-1 and claude-instant-1.2 should be valid model names. This inconsistency is causing confusion and making it challenging to work with the Anthropic LLM integration.

Furthermore, I noticed that responses from the Anthropic API report the model under its full versioned name (for example, claude-2.0), as described in the documentation, yet the error message lists only the short names.

Link to documentation: https://docs.anthropic.com/claude/reference/selecting-a-model

Request to Anthropic:

curl --location 'https://api.anthropic.com/v1/complete' \
--header 'accept: application/json' \
--header 'anthropic-version: 2023-06-01' \
--header 'content-type: application/json' \
--header 'x-api-key: <_ANTHROPIC_API_KEY_>' \
--data '
{
  "model": "claude-2",
  "prompt": "\n\nHuman: Hello, world!\n\nAssistant:Why is the sky blue?",
  "max_tokens_to_sample": 256
}
'

Example response from Anthropic:

{
    "completion": " Well, it has to do with how sunlight interacts with molecules in Earth's atmosphere. Here's a more detailed explanation:\n\nThe sky appears blue because of a phenomenon called Rayleigh scattering. As sunlight enters the atmosphere, it interacts with gas molecules like nitrogen and oxygen. These molecules scatter the shorter wavelengths of sunlight (violet and blue light) more than the longer wavelengths (like red light). \n\nThe shorter blue and violet wavelengths get scattered in all directions, making the sky appear blue from the ground no matter which direction you look. Meanwhile, the remaining longer wavelengths like red and orange pass through the atmosphere more directly, which is why sunsets appear more reddish.\n\nThere are a few other factors that impact the exact shade of blue:\n\n- The size of the molecules - smaller molecules scatter blue light more efficiently than larger ones. Nitrogen and oxygen molecules are just the right size to preferentially scatter blue.\n\n- The density of the atmosphere - more dense atmospheres with more gas molecules lead to more scattering and a deeper blue color. This is why the sky fades to darker blue higher up in the sky.\n\n- Dust and pollution - additional particles in the air from dust, smoke, or pollution can add extra scattering, making",
    "stop_reason": "max_tokens",
    "model": "claude-2.0",
    "stop": null,
    "log_id": "b6e7d19d49a1433cac79c8512ad6818c8cd9a94f28930aa7a613fd46cb"
}

In fact, the Anthropic API accepts both forms of the model name.

Version

0.8.4

Steps to Reproduce

Use the above specified code

Relevant Logs/Tracebacks

None

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for bringing this to our attention. The issue you're experiencing appears to occur because the model name "claude-instant-1.2" is not included in the CLAUDE_MODELS dictionary that the anthropic_modelname_to_contextsize function consults. This function checks whether the provided modelname is a key in CLAUDE_MODELS and, if it is not, raises a ValueError stating that the model is unknown.

Here is the relevant code:

CLAUDE_MODELS = {
    "claude-instant-1": 100000,
    "claude-2": 100000,
}

def anthropic_modelname_to_contextsize(modelname: str) -> int:
    if modelname not in CLAUDE_MODELS:
        raise ValueError(
            f"Unknown model: {modelname}. Please provide a valid Anthropic model name. "
            "Known models are: " + ", ".join(CLAUDE_MODELS.keys())
        )

    return CLAUDE_MODELS[modelname]

To resolve this issue, "claude-instant-1.2" needs to be added to the CLAUDE_MODELS dictionary. However, as a user, you won't be able to make this change. We will need to update the LlamaIndex codebase to include this model name.
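The change described above could look roughly like the following. This is a minimal sketch rather than the actual LlamaIndex patch: it reuses the dictionary and function quoted above, and the added entries, including their 100000-token context sizes and the "claude-2.0" key, are assumptions based on the model names mentioned in this thread.

```python
# Hypothetical sketch of the fix; not the actual LlamaIndex patch.
CLAUDE_MODELS = {
    "claude-instant-1": 100000,
    "claude-instant-1.2": 100000,  # assumed to share the 100k-token window
    "claude-2": 100000,
    "claude-2.0": 100000,  # assumed; the API response above reports this name
}


def anthropic_modelname_to_contextsize(modelname: str) -> int:
    """Return the context window size for a known Anthropic model name."""
    if modelname not in CLAUDE_MODELS:
        raise ValueError(
            f"Unknown model: {modelname}. Please provide a valid Anthropic model name. "
            "Known models are: " + ", ".join(CLAUDE_MODELS.keys())
        )

    return CLAUDE_MODELS[modelname]
```

With these entries in place, a lookup for "claude-instant-1.2" returns a context size instead of raising, while names that are still missing keep producing the same ValueError.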

I will raise this issue with the development team and we will work on a fix. Once the fix is implemented, it will be included in the next release of LlamaIndex. I will update this thread once the fix is available.

In the meantime, you can use the model names "claude-instant-1" and "claude-2", as these are currently recognized by LlamaIndex's Anthropic integration.

Thank you for your patience and understanding.

Best, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

logan-markewich commented 1 year ago

Easy fix, thanks!