Open torvalds-dev opened 12 months ago
torvalds.dev is analyzing the ticket
The issue is a feature request to integrate Nvidia Triton TensorRT LLMs (large language models) into LlamaIndex. LlamaIndex already supports several other LLM endpoints, and the user believes that Nvidia's Triton LLM endpoints would be a valuable addition.
The motivation is that several implementations of this integration already circulate online, but they require end users to piece the solution together manually and build it from source. Built-in support would remove that effort and make LlamaIndex easier to use with Triton.
The value of the feature lies in the new LLM endpoints it would provide. These would give users more options and control over how they use LlamaIndex, potentially allowing them to keep their data under their own control.
To address this issue, the relevant files to start from are:
llama_index/llms/llama_cpp.py: This file contains the implementation of the LlamaCPP class, which is a custom LLM. It seems to be the main file where the integration of Nvidia Triton TensorRT LLMs would take place.
llama_index/langchain_helpers/agents/__init__.py: This file contains the initialization of the llama integration with Langchain agents. It might be relevant if the integration of Nvidia Triton TensorRT LLMs involves changes to the way llama interacts with Langchain agents.
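As a rough illustration of what such an endpoint wrapper involves, here is a minimal sketch. It assumes a Triton server that exposes the HTTP generate extension and serves a TensorRT-LLM model whose request and response fields are named text_input, max_tokens, and text_output. The class name TritonTextCompleter, the base URL, and the model name "ensemble" are hypothetical placeholders. The class deliberately does not subclass any llama_index type; a real integration would adapt it to the library's custom LLM interface, as LlamaCPP does.

```python
import json
import urllib.request
from typing import Callable, Dict, Tuple

# A transport takes (url, payload) and returns the decoded JSON response.
Transport = Callable[[str, Dict], Dict]


def http_post_json(url: str, payload: Dict) -> Dict:
    """Default transport: POST a JSON body using only the standard library."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


class TritonTextCompleter:
    """Hypothetical wrapper; not part of llama_index or any Nvidia package."""

    def __init__(
        self,
        base_url: str = "http://localhost:8000",  # assumed Triton HTTP address
        model_name: str = "ensemble",             # assumed deployed model name
        max_tokens: int = 64,
        transport: Transport = http_post_json,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.model_name = model_name
        self.max_tokens = max_tokens
        self.transport = transport

    def build_request(self, prompt: str) -> Tuple[str, Dict]:
        """Return the URL and JSON payload for one generate call."""
        url = f"{self.base_url}/v2/models/{self.model_name}/generate"
        # Field names are assumptions about the deployed model's inputs.
        payload = {"text_input": prompt, "max_tokens": self.max_tokens}
        return url, payload

    def complete(self, prompt: str) -> str:
        """Send the prompt and return the generated text."""
        url, payload = self.build_request(prompt)
        response = self.transport(url, payload)
        # Output field name is likewise an assumption about the model.
        return response["text_output"]
```

Making the transport injectable keeps the sketch testable without a running server: passing a stub function in place of http_post_json lets the request shape be checked offline.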
Feature Description
I would like to add support for Nvidia Triton TensorRT LLMs in llama index. There is currently support for several other LLM endpoints and Nvidia has several interesting offerings with their Triton LLM endpoints that I think others would find useful in llama_index.
Reason
There are several implementations of this floating around the internet already. However, end users must "hack" together the solution and build from source. This feature would spare users that effort.
Value of Feature
New LLM endpoints that give users more options and control over how they can use llama index and potentially keep their data under their own control.