clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

[backends] Consolidate huggingface backends #28

Closed davidschlangen closed 5 months ago

davidschlangen commented 7 months ago

For historical reasons, there are separate backends for llama-based models (via huggingface transformers) and other huggingface models. These should be consolidated into a single backend.

Gnurro commented 7 months ago

Since the resulting backend would end up with a lot of model specifics, moving those into an external file (JSON/YAML) might keep the code from becoming convoluted. This adds some complexity of its own (a content scheme for the specifics file, plus the code to load it), but could be useful for more specific future uses of transformers (quantization, LoRA, etc.). This suggestion only concerns the to-be-merged HF backend, and is not as general as what is suggested in https://github.com/clp-research/clembench/issues/26 .
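
To illustrate the idea of an external specifics file, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names (`model_name`, `hf_id`, `context_size`, `load_in_8bit`), the example model entries, and the `load_model_specs` helper are hypothetical and are not the scheme that clembench uses.

```python
import json
from dataclasses import dataclass
from typing import Dict

# Hypothetical file content, inlined here so the sketch is self-contained.
# In practice this would live in a separate JSON file next to the backend.
MODEL_SPECIFICS_JSON = """
[
  {"model_name": "example-llama-chat",
   "hf_id": "example-org/example-llama-chat",
   "context_size": 4096,
   "load_in_8bit": false},
  {"model_name": "example-other-chat",
   "hf_id": "example-org/example-other-chat",
   "context_size": 2048,
   "load_in_8bit": true}
]
"""


@dataclass(frozen=True)
class ModelSpec:
    """Per-model settings a single consolidated backend could read."""
    model_name: str
    hf_id: str
    context_size: int
    load_in_8bit: bool = False


def load_model_specs(raw_json: str) -> Dict[str, ModelSpec]:
    """Parse the specifics file content into a lookup keyed by model name."""
    entries = json.loads(raw_json)
    return {entry["model_name"]: ModelSpec(**entry) for entry in entries}


SPECS = load_model_specs(MODEL_SPECIFICS_JSON)
```

With a lookup like this, the backend code would branch on fields such as `SPECS[name].context_size` instead of on hard-coded model names, which is where the reduction in code complexity would come from.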

davidschlangen commented 7 months ago

Don't worry about this, I have a plan for the external model registry. (Which would need to be handled centrally.) But of course that solution would need to cover all backends -- providing to each backend the information that it needs.

sherzod-hakimov commented 6 months ago

possible candidates:

Gnurro commented 6 months ago

What do these offer over our own implementation? For example, model-dependent context size handling that clembench games can easily interact with is easy to add to what we have, but might take a lot more work to integrate into one of these frameworks. The performance gains some of these frameworks offer can also be added to our HF backend, with the same model support, since the more performance is optimized, the fewer models can make use of it. The gains of these frameworks come from tailoring the inference to each of their supported models. Relying on one of these frameworks also means being dependent on that framework and its updates. We can't just add models: we'll have to wait for the framework to integrate them, and they might simply not implement what we want to test.
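
As a rough illustration of what model-dependent context size handling could look like, here is a minimal Python sketch. It is not the clembench implementation: `ContextExceededError`, `check_context_limit`, and the whitespace-based token count are hypothetical stand-ins, and a real backend would count tokens with the model's own tokenizer.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ContextExceededError(Exception):
    """Raised when the prompt leaves too little room for generation."""

    def __init__(self, tokens_used: int, tokens_left: int, context_size: int):
        super().__init__(
            f"prompt uses {tokens_used} of {context_size} tokens, "
            f"only {tokens_left} left for generation"
        )
        self.tokens_used = tokens_used
        self.tokens_left = tokens_left
        self.context_size = context_size


def whitespace_token_count(text: str) -> int:
    """Stand-in token counter (assumption): splits on whitespace only."""
    return len(text.split())


def check_context_limit(
    messages: List[Message],
    context_size: int,
    max_new_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> int:
    """Return the tokens left after the prompt, or raise if generation won't fit."""
    prompt_tokens = sum(count_tokens(m["content"]) for m in messages)
    tokens_left = context_size - prompt_tokens
    if tokens_left < max_new_tokens:
        raise ContextExceededError(prompt_tokens, tokens_left, context_size)
    return tokens_left
```

A game could catch the exception and decide for itself whether to abort the episode or shorten the history, which is the kind of interaction that is simple to provide in an in-house backend.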

davidschlangen commented 6 months ago

I think this (@sherzod-hakimov's comment) ended up in the wrong issue? The question of whether to integrate local API providers, and which ones, is orthogonal to the present issue, which is about code consolidation. The issue stands as is.

Gnurro commented 5 months ago

Now done as part of the major refactor/enhancement of the huggingface backend: https://github.com/Gnurro/clembench/commit/64dad73a4ce650f5f7beee4bfcdf2569d9cfc197

Gnurro commented 5 months ago

Solved with https://github.com/clp-research/clembench/pull/37