Closed: antoineross closed this issue 8 months ago
Hello @antoineross! I'm Dosu, a bot here to assist you with any bugs, questions, or guidance on contributing while you're waiting for a human maintainer to join in. I'm eager to help out and ensure your experience with LlamaIndex is smooth. Let me take a moment to review your feature request about enhancing the IngestionPipeline with a custom API_Base and model compatibility. I'll get back to you with a detailed response shortly!
Just to give more information for reproducing, this is how you can test it. Currently it breaks because the model name is mapped to a context size, which I believe could instead be set through a higher-level API parameter (e.g. on OpenAI):
```python
import asyncio
import json
import multiprocessing
import os

from pinecone import Pinecone

from llama_index.core import SimpleDirectoryReader
from llama_index.core.extractors import (
    KeywordExtractor,
    QuestionsAnsweredExtractor,
    SummaryExtractor,
    TitleExtractor,
)
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import TokenTextSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

# Requires: pip install llama-index-extractors-entity
from llama_index.extractors.entity import EntityExtractor
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.pinecone import PineconeVectorStore


async def main():
    OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]
    llm_model = "mistralai/mixtral-8x7b-instruct:nitro"
    llm = OpenAI(
        temperature=0.1,
        model=llm_model,
        api_base="https://openrouter.ai/api/v1",
        api_key=OPENROUTER_API_KEY,
        max_tokens=512,
    )

    text_splitter = TokenTextSplitter(
        separator=" ", chunk_size=512, chunk_overlap=128
    )

    PINECONE_API_KEY = os.environ["SECRET_PINECONE_API_KEY"]
    PINECONE_INDEX_NAME = os.environ["SECRET_PINECONE_INDEX"]

    # Create a Pinecone client
    pc = Pinecone(api_key=PINECONE_API_KEY)
    pinecone_index = pc.Index(PINECONE_INDEX_NAME)
    print(f"Connected to Pinecone index: {PINECONE_INDEX_NAME}")

    # ---------------- Extractor Logic ---------------- #
    print("Generating extractors, using LLM model: ", llm.model)
    extractors = [
        TitleExtractor(nodes=5, llm=llm),
        QuestionsAnsweredExtractor(questions=3, llm=llm),
        # Extracts lists of entities (persons, locations, etc.).
        # Uses BERT (free, but runs on your device's compute).
        EntityExtractor(prediction_threshold=0.5),
        SummaryExtractor(summaries=["prev", "self"], llm=llm),
        KeywordExtractor(keywords=10, llm=llm),
        # CustomExtractor()
    ]
    transformations = [text_splitter] + extractors

    # ---------------- Directory Reading -> Processing Logic ---------------- #
    uber_docs = SimpleDirectoryReader(
        # input_files=["documents/Ethical_Hacking_RHartleyTeachingStudents.pdf"]
        input_dir="documents"
    ).load_data()
    pipeline = IngestionPipeline(transformations=transformations)
    workers = multiprocessing.cpu_count()
    uber_nodes = await pipeline.arun(
        documents=uber_docs,
        num_workers=workers,
        show_progress=True,
    )

    # ---------------- Generating Embeddings ---------------- #
    # First save the metadata of the nodes in a JSON file.
    all_metadata = [node.metadata for node in uber_nodes]
    with open("llamaindex-output/10k-vFinalv1.json", "w") as f:
        json.dump(all_metadata, f)

    embed_model = OpenAIEmbedding(model="text-embedding-3-large")
    for node in uber_nodes:
        # aget_text_embedding (not the batch variant) takes a single string
        # and returns a single embedding vector.
        node_embedding = await embed_model.aget_text_embedding(
            node.get_content(metadata_mode="all")
        )
        node.embedding = node_embedding

    vector_store = PineconeVectorStore(
        pinecone_index=pinecone_index, namespace="llama-index"
    )
    vector_store.add(uber_nodes)


if __name__ == "__main__":
    asyncio.run(main())
```
Error:

```
Traceback (most recent call last):
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/asyncio/tasks.py", line 277, in __step
result = coro.send(None)
^^^^^^^^^^^^^^^
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/tqdm/asyncio.py", line 76, in wrap_awaitable
return i, await f
^^^^^^^
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/llama_index/core/async_utils.py", line 106, in worker
return await job
^^^^^^^^^
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 292, in apredict
if self.metadata.is_chat_model:
^^^^^^^^^^^^^
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 231, in metadata
context_window=openai_modelname_to_contextsize(self._get_model_name()),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/llama_index/llms/openai/utils.py", line 198, in openai_modelname_to_contextsize
raise ValueError(
ValueError: Unknown model 'mistralai/mixtral-8x7b-instruct:nitro'. Please provide a valid OpenAI model name in: gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613
Task exception was never retrieved
future: <Task finished name='Task-4' coro=<tqdm_asyncio.gather.<locals>.wrap_awaitable() done, defined at /Users/antoineross/miniconda3/envs/llamaindex/lib/python3.11/site-packages/tqdm/asyncio.py:75> exception=ValueError("Unknown model 'mistralai/mixtral-8x7b-instruct:nitro'. Please provide a valid OpenAI model name in: gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613")>
```
The error you're encountering occurs because LlamaIndex does not currently recognize the `mistralai/mixtral-8x7b-instruct:nitro` model: the `OpenAI` LLM class maps model names to context sizes using a predefined list of OpenAI models, and this model is not in that list.
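You can see the failure in isolation with the utility named in the traceback (a minimal sketch; it raises the same `ValueError` as above):

```python
from llama_index.llms.openai.utils import openai_modelname_to_contextsize

# The OpenAI LLM's metadata property calls this lookup; any model name
# outside the hard-coded OpenAI table raises ValueError.
openai_modelname_to_contextsize("mistralai/mixtral-8x7b-instruct:nitro")
```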
However, you can modify LlamaIndex to recognize and use this model by creating a custom transformation. Here's a step-by-step guide on how to do this:

1. Define a custom transformation class that interacts with the API for the `mistralai/mixtral-8x7b-instruct:nitro` model. This involves making HTTP requests to the API endpoint and handling the response:

```python
import requests

from llama_index.core.schema import TransformComponent


class MistralAIEmbedding(TransformComponent):
    api_base: str
    api_key: str

    def __call__(self, nodes, **kwargs):
        # Send each node's text to the custom endpoint and store the result.
        for node in nodes:
            response = requests.post(
                f"{self.api_base}/transform",
                json={
                    "model": "mistralai/mixtral-8x7b-instruct:nitro",
                    "text": node.get_content(),
                },
                headers={"Authorization": f"Bearer {self.api_key}"},
            )
            response.raise_for_status()
            node.set_content(response.json()["transformed_text"])
        return nodes
```
2. Register the custom transformation, either by modifying the `default_transformations` function or by adding the custom transformation directly when constructing the `PipelineCreate` object:

```python
from typing import List

from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import TransformComponent


def default_transformations(api_base: str, api_key: str) -> List[TransformComponent]:
    """Default transformations with custom MistralAI embedding."""
    from llama_index.embeddings.openai import OpenAIEmbedding  # pants: no-infer-dep

    return [
        SentenceSplitter(),
        OpenAIEmbedding(),
        # Add the custom transformation
        MistralAIEmbedding(api_base=api_base, api_key=api_key),
    ]
```
3. When creating the pipeline, ensure you pass the appropriate `api_base` and `api_key` for the custom API:

```python
api_base = "https://custom-api.example.com"
api_key = "your_api_key_here"

pipeline_create = get_pipeline_create(
    pipeline_name="YourPipelineName",
    client=get_client(),
    pipeline_type=PipelineType.YOUR_CHOICE,
    transformations=default_transformations(api_base, api_key),
    # other parameters as needed
)
```
This example demonstrates how to extend the ingestion pipeline to use a custom model by defining a custom transformation and integrating it into the pipeline creation process. Adjust the implementation details to match the actual API specification and how the `mistralai/mixtral-8x7b-instruct:nitro` model is exposed through the custom API.
For more information, you can refer to how LlamaIndex currently handles custom models: by defining a `CustomLLM` class that extends the `LLM` base class. This class serves as an abstract base class for custom language models and requires subclasses to implement specific methods. You can find more details in the llama-index-core/llama_index/core/llms/custom.py file.
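For illustration, a minimal subclass might look like this (a sketch based on the documented `CustomLLM` interface; the canned response text is a placeholder, and the `context_window` value is an assumption):

```python
from typing import Any

from llama_index.core.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class MyCustomLLM(CustomLLM):
    context_window: int = 32768  # assumption: match the model's real context size
    num_output: int = 512
    model_name: str = "mistralai/mixtral-8x7b-instruct:nitro"

    @property
    def metadata(self) -> LLMMetadata:
        # Declaring the context window here avoids the
        # openai_modelname_to_contextsize lookup entirely.
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Placeholder: a real implementation would call the remote API here.
        return CompletionResponse(text="...")

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Placeholder single-chunk stream.
        yield CompletionResponse(text="...", delta="...")
```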
I hope this helps! If you have any other questions or run into any issues, feel free to ask.
You can use the `OpenAILike` llm class if your API is compatible with the openai client:

```
pip install llama-index-llms-openai-like
```

```python
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="my model", api_key="fake", api_base="....")
```
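Applied to the OpenRouter setup from the repro script, that would look something like the sketch below. `OpenAILike` lets you declare the context window and chat-model flag yourself, which sidesteps the `openai_modelname_to_contextsize` lookup in the traceback; the `context_window` value here is an assumption and should match the model card:

```python
import os

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="mistralai/mixtral-8x7b-instruct:nitro",
    api_base="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
    context_window=32768,  # assumption: set to the model's real context size
    is_chat_model=True,    # use the chat completions endpoint
    temperature=0.1,
    max_tokens=512,
)
# This llm can then be passed to the extractors in the IngestionPipeline as before.
```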
Feature Description
There are wrapper APIs that allow us to use the OpenAI client with a custom api_base, see: https://openrouter.ai/models/mistralai/mistral-7b-instruct:nitro?tab=api
What I would like is for LlamaIndex's IngestionPipeline to support this as well, as sketched below.
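For context, such a wrapper looks like this with the plain OpenAI client (a sketch based on OpenRouter's documented usage; the key and model name are placeholders):

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at OpenRouter via base_url.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
completion = client.chat.completions.create(
    model="mistralai/mistral-7b-instruct:nitro",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```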
Reason
The reason is simply so that the IngestionPipeline supports more models.
Value of Feature
I believe this is in line with LlamaIndex's roadmap of including as many models as possible in its roster.