# Multi-Domain Expert Learning
## Environment Setup

To set up the development environment, run `make setup_dev`. This will set up the pre-commit hooks.
## Creating an Expert Dataset

First, make sure you followed the Environment Setup guidelines.

To create an expert dataset using the Pile data, follow these steps:

1. Download the first Pile data shard:

   ```shell
   ./scripts/get_pile_shard1_data.sh
   ```

2. Set `SUBSET_NAME` in `scripts/create_domain_pile_mix.sh` and run the script. This should be set to a valid value of the Pile's variable `pile_set_name`. A list of valid values can be found below.

3. Upload the resulting dataset to the Hugging Face Hub:

   ```shell
   export HF_ACCESS_TOKEN={YOUR HUGGINGFACE TOKEN}
   scripts/upload_to_hf.sh
   ```
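As a rough illustration of the filtering step above (not the actual contents of `scripts/create_domain_pile_mix.sh`): Pile records are JSON lines whose metadata carries `pile_set_name`, and a domain subset keeps only the records matching `SUBSET_NAME`. The record layout below is an assumption for illustration.

```python
import json

def filter_pile_records(lines, subset_name):
    """Yield the text of Pile records whose pile_set_name matches subset_name."""
    for line in lines:
        record = json.loads(line)
        if record["meta"]["pile_set_name"] == subset_name:
            yield record["text"]

# Two hypothetical Pile records, one per domain.
sample = [
    '{"text": "a paper", "meta": {"pile_set_name": "ArXiv"}}',
    '{"text": "a repo", "meta": {"pile_set_name": "Github"}}',
]
print(list(filter_pile_records(sample, "ArXiv")))  # ['a paper']
```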
## Training an Expert

Set `DATASET` in `src/mdel/train.sh` to match a valid dataset name on the MDEL Hugging Face organization, then run:

```shell
export HUGGING_FACE_HUB_TOKEN=[FILL ME]
export WANDB_API_KEY=[FILL ME]
./train.sh &
```
## Merging Experts

```shell
export HUGGING_FACE_HUB_TOKEN=[FILL ME]
python src/mdel/merge_experts.py \
    --hf-repo your_hf_username/desired_name_of_merged_model \
    -e mdel/expert_1 \
    -e mdel/expert_2 \
    -e mdel/expert_n
```
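The internals of `merge_experts.py` are not reproduced here; as a rough, hypothetical sketch, merging experts that share a base model can be pictured as averaging their parameters layer by layer (plain Python lists stand in for weight tensors; the real script may merge only selected layers or weight experts differently):

```python
def merge_experts(experts):
    """Average the parameters of several expert 'state dicts' entry-wise."""
    merged = {}
    for name in experts[0]:
        # Zip the corresponding weight vectors and take their element-wise mean.
        merged[name] = [
            sum(vals) / len(experts)
            for vals in zip(*(expert[name] for expert in experts))
        ]
    return merged

expert_1 = {"layer.9.weight": [1.0, 2.0]}
expert_2 = {"layer.9.weight": [3.0, 4.0]}
print(merge_experts([expert_1, expert_2]))  # {'layer.9.weight': [2.0, 3.0]}
```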
## Calculating Perplexity

```shell
export HUGGING_FACE_HUB_TOKEN=[FILL ME]
python3 src/mdel/calculate_perplexity.py \
    --model Multi-Domain-Expert-Layers/expert-arxiv \
    --dataset Multi-Domain-Expert-Layers/arxiv \
    --split validation_domain
```
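For reference, perplexity is the exponential of the average per-token negative log-likelihood under the model. A minimal sketch of that formula (the actual `calculate_perplexity.py` runs the model over the dataset to obtain the per-token losses; the values below are made up):

```python
import math

def perplexity(token_nlls):
    """Perplexity from a list of per-token negative log-likelihoods (nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token losses; mean NLL is 1.0, so perplexity is e.
print(perplexity([1.2, 0.8, 1.0]))  # ≈ 2.718
```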
## Citation

Gao, L., Biderman, S., Black, S., Golding, L., Hoppe, T., Foster, C., ... & Leahy, C. (2020). The Pile: An 800GB dataset of diverse text for language modeling. arXiv preprint arXiv:2101.00027.