Closed davidschlangen closed 5 months ago
Since the resulting backend would end up with a lot of model specifics, moving those into an external file (JSON/YAML) might keep the code from becoming convoluted. This adds some complexity, since the specifics file needs a content scheme and loading code, but it could be useful for more specific future uses of transformers (quantization, LoRA, etc.).
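To make the external-file idea concrete, here is a minimal sketch of what loading model specifics from JSON could look like. Everything in it is an assumption for illustration: the field names (`model_name`, `hf_id`, `context_size`, `eos_to_cull`), the `ModelSpec` class and the example model entries are hypothetical and not taken from the clembench code. The JSON is inlined as a string so the sketch is self-contained; in practice it would be read from a file.

```python
import json
from dataclasses import dataclass, field

# Hypothetical registry content; a real setup would read this from an external file.
REGISTRY_JSON = """
[
  {"model_name": "example-llama-chat", "hf_id": "example-org/example-llama-chat",
   "context_size": 4096, "eos_to_cull": "</s>"},
  {"model_name": "example-other", "hf_id": "example-org/example-other",
   "context_size": 2048}
]
"""


@dataclass
class ModelSpec:
    """Per-model settings the backend needs (illustrative field set)."""
    model_name: str
    hf_id: str
    context_size: int
    # Any further model-specific keys (e.g. quantization or LoRA options) end up here.
    extras: dict = field(default_factory=dict)


def load_registry(text: str) -> dict:
    """Parse the JSON text and return a dict mapping model_name to ModelSpec."""
    specs = {}
    for raw_entry in json.loads(text):
        entry = dict(raw_entry)  # copy so the pops below do not mutate the parsed data
        spec = ModelSpec(
            model_name=entry.pop("model_name"),
            hf_id=entry.pop("hf_id"),
            context_size=entry.pop("context_size"),
            extras=entry,  # whatever keys remain are kept as optional extras
        )
        specs[spec.model_name] = spec
    return specs


registry = load_registry(REGISTRY_JSON)
```

With something like this, the backend code would only look up `registry[model_name]` instead of branching on model names, and new optional keys could be added to the file without touching the loader.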
This is for the to-be-merged HF backend, and not as general as suggested in https://github.com/clp-research/clembench/issues/26 .
Don't worry about this, I have a plan for the external model registry. (Which would need to be handled centrally.) But of course that solution would need to cover all backends -- providing to each backend the information that it needs.
What do these offer over our own implementation? For example, model-dependent context size handling that clembench games can easily interact with is easy to add to what we have, but might take a lot more work to integrate into one of these frameworks. The performance gains some of these frameworks offer can also be added to our HF backend, with the same trade-off in model support: the more performance is optimized, the fewer models can make use of it. The gains of these frameworks come from tailoring the inference to each of their supported models. Relying on one of these frameworks also means being dependent on that framework and its updates. We can't just add models; we'll have to wait for the framework to integrate them, and they might simply not implement what we want to test.
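As a rough illustration of the context size handling mentioned above, a check like the following could run before generation. This is only a sketch under assumptions: the function and exception names are hypothetical, and the prompt token count is passed in as a plain integer (in a real backend it would come from the model's tokenizer).

```python
class ContextExceededError(Exception):
    """Raised when prompt plus requested generation would not fit the context."""

    def __init__(self, tokens_used: int, context_size: int):
        super().__init__(
            f"context exceeded: {tokens_used} tokens needed, {context_size} available"
        )
        self.tokens_used = tokens_used
        self.context_size = context_size
        self.tokens_left = context_size - tokens_used  # negative when exceeded


def check_context_limit(prompt_token_count: int, max_new_tokens: int, context_size: int):
    """Return (fits, tokens_used, tokens_left) for a planned generation call."""
    tokens_used = prompt_token_count + max_new_tokens
    tokens_left = context_size - tokens_used
    return tokens_left >= 0, tokens_used, tokens_left


def ensure_context_fits(prompt_token_count: int, max_new_tokens: int, context_size: int) -> int:
    """Raise ContextExceededError if the call would not fit; else return tokens left."""
    fits, tokens_used, tokens_left = check_context_limit(
        prompt_token_count, max_new_tokens, context_size
    )
    if not fits:
        raise ContextExceededError(tokens_used, context_size)
    return tokens_left
```

A game could catch the exception and decide for itself how to react (abort the episode, shorten the history, and so on), which is the kind of interaction that might be harder to get from an external framework.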
I think this (@sherzod-hakimov 's comment) ended up in the wrong issue? The question of whether to integrate local API providers, and which ones, is orthogonal to the present issue, which is about code consolidation. The issue stands as is.
Now done as part of the major refactor/enhancement of the huggingface backend: https://github.com/Gnurro/clembench/commit/64dad73a4ce650f5f7beee4bfcdf2569d9cfc197
For historical reasons, there are separate backends for llama-based models (via huggingface transformers) and other huggingface models. These should be consolidated into a single backend.