
Distily

Distily is a language model distillation toolkit and library. In one command, distill an existing LLM into a smaller or different architecture.

Install

pip install -U "git+https://github.com/lapp0/distily.git#egg=distily[full]"
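
To confirm the install, importing the package should succeed (a quick sanity check; the project itself does not document a verification step):

python3 -c "import distily"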

Features

Distily allows you to distill a model into a student with fewer layers, a different shape or architecture, or quantization-aware (bitnet) weights.

Usage

Minimal Example: distily_gpt2

Command to create a distilled gpt2 with only 6 layers:

python3 -m distily.run \
    --teacher_model_name_or_path gpt2 \
    --output_dir distily_gpt2 \
    --hub_model_id "distily/distily_gpt2" \
    --push_to_hub True \
    --student_model_config '{"n_layers": 6}' \
    --student_model_as_bitnet True

The resulting distily_gpt2 model has (TODO: explain metrics).
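
Once training completes and the model is pushed, the student should be loadable with the standard transformers API. A minimal sketch, assuming the push to distily/distily_gpt2 succeeded and the saved student loads as an ordinary causal LM (the bitnet variant may need extra handling):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo name taken from --hub_model_id above
tokenizer = AutoTokenizer.from_pretrained("distily/distily_gpt2")
model = AutoModelForCausalLM.from_pretrained("distily/distily_gpt2")

# Quick generation smoke test
inputs = tokenizer("Distillation is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))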

For more examples, review the Examples documentation.

Note on Hub Credentials

To push to the Hub, you must first save a Hugging Face token with write access:

HF_WRITE=<your hub token> python3 -c "import os; from huggingface_hub.hf_api import HfFolder; HfFolder.save_token(os.environ['HF_WRITE'])"
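
Alternatively, the interactive login shipped with huggingface_hub accomplishes the same thing (assuming its CLI is available in your environment):

huggingface-cli login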

Further Reading

TODO: commit the linked docs once complete

- Using Distily
- Available Models
- Contributing

Roadmap

- Improved performance / sampling efficiency
- Distill to a different model shape / size
- Distill to a different architecture
- Additional techniques