Summary
This PR integrates the Mixtral-8x22B model into the codebase. Specifically, the following changes have been made:
Added the Mixtral model in mixtral.py to support the 8x22B architecture.
Introduced a new engine for the Mixtral model in mixtral_engine.py.
Updated the __init__.py files in the models and engines directories to register the new Mixtral model (see the sketch after this list).
Modified generation_config.yaml to include parameters for the Mixtral model's generation tasks.
Updated finetuning_config.yaml to configure Mixtral model-specific parameters for finetuning.
Updated documentation, including README.md and supported_models.md, to reflect the addition of the Mixtral-8x22B model under the identifier key "mixtral".
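For context, the new classes would plausibly follow the CausalModel/CausalEngine pattern that xTuring's other models use. The sketch below is an assumption about the shape of mixtral.py and mixtral_engine.py, not the committed code; the class names, the Hugging Face checkpoint id, and the registry calls are inferred from the existing models in the repo:

```python
# Sketch of xturing/engines/mixtral_engine.py (assumed, following existing engines).
from xturing.engines.causal import CausalEngine
from xturing.models.causal import CausalModel


class MixtralEngine(CausalEngine):
    config_name: str = "mixtral_engine"

    def __init__(self, weights_path=None):
        # Checkpoint id assumed; the PR does not state which weights are used.
        super().__init__("mistralai/Mixtral-8x22B-v0.1", weights_path=weights_path)


# Sketch of xturing/models/mixtral.py (assumed).
class Mixtral(CausalModel):
    config_name: str = "mixtral"

    def __init__(self, weights_path=None):
        # Wire the model to its engine by the engine's config_name.
        super().__init__(MixtralEngine.config_name, weights_path)


# Registration in the __init__.py files would then look roughly like:
#   BaseModel.add_to_registry(Mixtral.config_name, Mixtral)
#   BaseEngine.add_to_registry(MixtralEngine.config_name, MixtralEngine)
```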
Checklist
[x] Tested the integration of the Mixtral model with both generation and finetuning tasks.
[x] Updated documentation files to reflect the changes.
Additional Information
A new example script, mixtral.py, has been added to the examples directory. It demonstrates how to use the Mixtral model with xTuring and includes instructions for testing the model.
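As a rough illustration, usage through xTuring's standard BaseModel interface would look something like the sketch below; the dataset path and prompt are placeholders, and the committed example script may differ:

```python
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

# Load the model via the identifier key registered by this PR.
model = BaseModel.create("mixtral")

# Generation: pass a list of prompts, get back the generated completions.
output = model.generate(texts=["What is a mixture-of-experts model?"])
print(output)

# Finetuning: any instruction dataset in xTuring's expected format works;
# "./alpaca_data" is a placeholder path.
dataset = InstructionDataset("./alpaca_data")
model.finetune(dataset=dataset)
```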
These changes enable both generation and finetuning with the Mixtral-8x22B model in the xTuring project, with the model's parameters incorporated into the respective configuration files.
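For reference, the new configuration entries would take roughly this shape, following the per-model layout of the existing YAML files; all values here are illustrative, not the ones committed in this PR:

```yaml
# generation_config.yaml (illustrative values)
mixtral:
  max_new_tokens: 256
  do_sample: false

# finetuning_config.yaml (illustrative values)
mixtral:
  learning_rate: 5e-5
  weight_decay: 0.01
  num_train_epochs: 3
  batch_size: 1
```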