abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Allow python packages to contribute to LlamaChatCompletionHandlerRegistry #1715

Open axel7083 opened 3 months ago

axel7083 commented 3 months ago

Is your feature request related to a problem? Please describe.

Today, [llama-cpp/llama_chat_format.py] contains 25 chat formats and 4 chat completion handlers. This forces every contributor to add their code to a single, ever-growing file.

This is the case for the functionary models, whose maintainers have to keep updating the handlers to support their newer models.

This process can be slower than their pace of release, since every change has to be approved in this repository. The amazing people behind the functionary models already have a repository with the code needed to transform the generated content into a proper CreateChatCompletionStreamResponse, and it would make sense for this to be their responsibility.

Describe the solution you'd like

Python (>3.3) offers several ways to load code from other packages, or to let packages contribute to a main package. This would have many advantages: model providers could maintain their own packages and rely on their own testing and versioning.
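
To make the idea concrete, here is a minimal sketch of one such mechanism, packaging entry points read through `importlib.metadata`. It is only an illustration under assumptions: the group name `llama_cpp.chat_handlers` is hypothetical and not defined by llama-cpp-python today, and `HandlerRegistry` is a simplified stand-in for `LlamaChatCompletionHandlerRegistry`, not its real interface.

```python
from importlib.metadata import EntryPoint, entry_points
from typing import Callable, Dict, Iterable, List

# Hypothetical entry-point group name; llama-cpp-python does not define it today.
CHAT_HANDLER_GROUP = "llama_cpp.chat_handlers"


class HandlerRegistry:
    """Simplified stand-in for LlamaChatCompletionHandlerRegistry (assumption)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable] = {}

    def register(self, name: str, handler: Callable, overwrite: bool = False) -> None:
        # Refuse silent replacement so two plugins cannot shadow each other.
        if name in self._handlers and not overwrite:
            raise ValueError(f"chat handler {name!r} is already registered")
        self._handlers[name] = handler

    def get(self, name: str) -> Callable:
        return self._handlers[name]


def load_plugins(registry: HandlerRegistry, eps: Iterable[EntryPoint]) -> List[str]:
    """Load each entry point and register the object it resolves to under its name."""
    loaded = []
    for ep in eps:
        registry.register(ep.name, ep.load())
        loaded.append(ep.name)
    return loaded


def discover_plugins(registry: HandlerRegistry) -> List[str]:
    """Register handlers advertised by installed packages.

    Uses the entry_points(group=...) selection API, which needs Python 3.10+.
    """
    return load_plugins(registry, entry_points(group=CHAT_HANDLER_GROUP))
```

With something like this in place, a model provider's package would declare an entry point in the hypothetical `llama_cpp.chat_handlers` group in its own packaging metadata, and the main package would call `discover_plugins` once at import time. The provider could then ship new handlers on its own release schedule.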


axel7083 commented 2 months ago

Hey @abetlen! Do you have any opinion on this matter?