run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: ValueError: The `model_name` argument must be provided #15353

Open · youngjae2424 opened this issue 1 month ago

youngjae2424 commented 1 month ago

Question

I can't use Hugging Face because of a proxy issue, so I tried to set up local embeddings as follows:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    model_path="models/llama-3-Korean-Bllossom-8B.Q2_K.gguf",
)

documents = SimpleDirectoryReader("./data/").load_data()

service_context = ServiceContext.from_defaults(llm=llm, embed_model='local')
index = VectorStoreIndex.from_documents(documents, service_context=service_context)  

#Query your data
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

But I got the following error. How can it be resolved?

ValueError                                Traceback (most recent call last)
Cell In[15], line 10
      4 llm = LlamaCPP(
      5     model_path="models/llama-3-Korean-Bllossom-8B.Q2_K.gguf",
      6 )
      8 documents = SimpleDirectoryReader("./data/").load_data()
---> 10 service_context = ServiceContext.from_defaults(llm=llm, embed_model='local')
     11 index = VectorStoreIndex.from_documents(documents, service_context=service_context)  
     13 #Query your data

File /opt/conda/lib/python3.10/site-packages/deprecated/classic.py:285, in deprecated.<locals>.wrapper_function(wrapped_, instance_, args_, kwargs_)
    283 else:
    284     warnings.warn(msg, category=category, stacklevel=_routine_stacklevel)
--> 285 return wrapped_(*args_, **kwargs_)

File /opt/conda/lib/python3.10/site-packages/llama_index/core/service_context.py:200, in ServiceContext.from_defaults(cls, llm_predictor, llm, prompt_helper, embed_model, node_parser, text_splitter, transformations, llama_logger, callback_manager, system_prompt, query_wrapper_prompt, pydantic_program_mode, chunk_size, chunk_overlap, context_window, num_output, chunk_size_limit)
    195         llm_predictor.query_wrapper_prompt = query_wrapper_prompt
    197 # NOTE: the embed_model isn't used in all indices
    198 # NOTE: embed model should be a transformation, but the way the service
    199 # context works, we can't put in there yet.
--> 200 embed_model = resolve_embed_model(embed_model)
    201 embed_model.callback_manager = callback_manager
    203 prompt_helper = prompt_helper or _get_default_prompt_helper(
    204     llm_metadata=llm_predictor.metadata,
    205     context_window=context_window,
    206     num_output=num_output,
    207 )

File /opt/conda/lib/python3.10/site-packages/llama_index/core/embeddings/utils.py:110, in resolve_embed_model(embed_model, callback_manager)
    107     cache_folder = os.path.join(get_cache_dir(), "models")
    108     os.makedirs(cache_folder, exist_ok=True)
--> 110     embed_model = HuggingFaceEmbedding(
    111         model_name=model_name, cache_folder=cache_folder
    112     )
    113 except ImportError:
    114     raise ImportError(
    115         "`llama-index-embeddings-huggingface` package not found, "
    116         "please run `pip install llama-index-embeddings-huggingface`"
    117     )

File /opt/conda/lib/python3.10/site-packages/llama_index/embeddings/huggingface/base.py:151, in HuggingFaceEmbedding.__init__(self, model_name, tokenizer_name, pooling, max_length, query_instruction, text_instruction, normalize, model, tokenizer, embed_batch_size, cache_folder, trust_remote_code, device, callback_manager, parallel_process, target_devices, **model_kwargs)
    147         raise ValueError(
    148             f"{variable} is deprecated. Please remove it from the arguments."
    149         )
    150 if model_name is None:
--> 151     raise ValueError("The `model_name` argument must be provided.")
    153 self._model = SentenceTransformer(
    154     model_name,
    155     device=self._device,
   (...)
    164     **model_kwargs,
    165 )
    166 if max_length:

ValueError: The `model_name` argument must be provided.
dosubot[bot] commented 1 month ago

Hey there, @youngjae2424! 👋 I'm here to help you out while you wait for a human maintainer. I'm a bot that can assist with solving bugs, answering questions, and even becoming a contributor.

To resolve the error "ValueError: The `model_name` argument must be provided" and successfully create local embeddings without using Hugging Face, you can use the `resolve_embed_model` function from the LlamaIndex library. This function allows you to specify different embedding models, including local ones. Here is an example of how to do this:

from llama_index.core.embeddings.utils import resolve_embed_model

# Set up the embedding model, naming the local model to use with the llama_index core modules.
embed_model = resolve_embed_model(embed_model="local:BAAI/bge-small-en")

# Or if you want to embed some text separately
embeddings = embed_model.get_text_embedding("I want to embed this text!")
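
For context, the traceback above shows what `resolve_embed_model` does with a `local:` string: the part after the colon is taken as the model name and passed to `HuggingFaceEmbedding`, so the bare string `'local'` leaves `model_name` as `None` and triggers the `ValueError`. A rough sketch of the equivalent explicit construction (the cache path below is only illustrative):

import os

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Roughly what resolve_embed_model("local:BAAI/bge-small-en") does internally,
# per the traceback above; spelling it out makes the model-name requirement obvious.
cache_folder = os.path.expanduser("~/llama_index_models")  # illustrative location
os.makedirs(cache_folder, exist_ok=True)

embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en",
    cache_folder=cache_folder,
)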

In your case, you need to modify the `ServiceContext` creation to use a specific local embedding model. Here is how you can adjust your code:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.core.embeddings.utils import resolve_embed_model

llm = LlamaCPP(
    model_path="models/llama-3-Korean-Bllossom-8B.Q2_K.gguf",
)

documents = SimpleDirectoryReader("./data/").load_data()

# Set up the embedding model
embed_model = resolve_embed_model(embed_model="local:BAAI/bge-small-en")

# Create the ServiceContext with the custom embedding model
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

index = VectorStoreIndex.from_documents(documents, service_context=service_context)  

# Query your data
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

This should resolve the error and allow you to use local embeddings without relying on Hugging Face [1].
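
Note that the `local:` prefix still resolves to `HuggingFaceEmbedding`, which downloads the named model from the Hugging Face Hub the first time it runs. If that download is blocked by the proxy, one possible workaround is to copy an already-downloaded sentence-transformers model onto the machine by other means and point `model_name` at that directory; a rough sketch (the path is a placeholder):

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# model_name also accepts a filesystem path to a sentence-transformers model
# that is already on disk, so no Hub access should be needed at runtime.
# "/models/bge-small-en" is a placeholder for wherever the files were copied.
embed_model = HuggingFaceEmbedding(model_name="/models/bge-small-en")

That `embed_model` object can then be passed to `ServiceContext.from_defaults(...)` exactly as in the snippet above.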

To continue talking to Dosu, mention @dosu.

youngjae2424 commented 1 month ago

I executed your code, and it seems to be trying to connect to the following link: https://huggingface.co/. Is there any way to do this without using Hugging Face embeddings?

OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like BAAI/bge-small-en is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
dosubot[bot] commented 1 month ago

Yes, you can create local embeddings without using Hugging Face embeddings and without requiring internet access. LlamaIndex provides an easy-to-extend base class for implementing your own embeddings. Here is an example of how you can create a custom embedding model:

from typing import Any, List
from llama_index.core.base.embeddings.base import BaseEmbedding

class CustomLocalEmbedding(BaseEmbedding):
    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # Initialize your local embedding model here

    def _get_query_embedding(self, query: str) -> List[float]:
        # Implement your logic to generate embeddings for the query
        return [0.0] * 768  # Example: return a dummy embedding

    def _get_text_embedding(self, text: str) -> List[float]:
        # Implement your logic to generate embeddings for the text
        return [0.0] * 768  # Example: return a dummy embedding

    def _get_text_embeddings(self, texts: List[str]) -> List[List[float]]:
        # Implement your logic to generate embeddings for a list of texts
        return [[0.0] * 768 for _ in texts]  # Example: return dummy embeddings

    async def _aget_query_embedding(self, query: str) -> List[float]:
        # The async counterparts must use the _aget_* names; reusing the sync
        # names would shadow the methods defined above. Delegate to the sync versions.
        return self._get_query_embedding(query)

    async def _aget_text_embedding(self, text: str) -> List[float]:
        return self._get_text_embedding(text)

You can then set this custom embedding model in the Settings object:

from llama_index.core import Settings

Settings.embed_model = CustomLocalEmbedding()
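
With the embedding model registered on `Settings` (the replacement for the deprecated `ServiceContext` in recent versions), the original script no longer needs a service context at all. A minimal sketch, reusing the custom class above and the original model path:

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.llama_cpp import LlamaCPP

# Register the local LLM and the custom embedding model globally.
Settings.llm = LlamaCPP(model_path="models/llama-3-Korean-Bllossom-8B.Q2_K.gguf")
Settings.embed_model = CustomLocalEmbedding()

# The index picks up Settings.embed_model automatically; no ServiceContext needed.
documents = SimpleDirectoryReader("./data/").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What did the author do growing up?"))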

Additionally, you can use pre-trained local models by resolving them with the `resolve_embed_model` function. Here is an example of how to create local embeddings using pre-trained models:

from llama_index.core.base.embeddings.base import BaseEmbedding
from llama_index.core.embeddings.utils import resolve_embed_model

def create_local_embedding(model_name: str) -> BaseEmbedding:
    model = resolve_embed_model(f"local:{model_name}")
    return model

# Example usage
local_embedding_model = create_local_embedding("sentence-transformers/all-MiniLM-L6-v2")

For more detailed guidance, you can refer to the Custom Embeddings guide provided by LlamaIndex. This way, you can use your local embedding model without requiring internet access or connecting to Hugging Face [1][2][3].

To continue talking to Dosu, mention @dosu.