CERC-AAI / multimodal

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Apache License 2.0

Add conversion script to HF #57

Open JACKHAHA363 opened 1 year ago

JACKHAHA363 commented 1 year ago

Requirement

Use the transformers fork from https://github.com/JACKHAHA363/transformers/tree/robin_v4.32.1. It is based on transformers version 4.32.1 and adds a Robin model class.
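Since the fork tracks upstream 4.32.1, a quick sanity check of the installed package version can catch mismatches early. This is a hypothetical helper, not part of the repo; it only reads package metadata via the standard library:

```python
from importlib import metadata

# Check that the installed transformers package matches the expected
# base version (4.32.1, per the fork's branch name). Hypothetical
# helper for illustration only.
def check_transformers_version(expected="4.32.1"):
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers not installed"
    return "ok" if installed.startswith(expected) else f"unexpected: {installed}"

print(check_transformers_version())
```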

Usage

```shell
python tools/convert_module_to_hf.py \
    --input_dir checkpoints/robin_ckpt \
    --config_file checkpoints/robin_ckpt/configs/big_run_grid_8e-5_2208.yml \
    checkpoints/robin_ckpt/configs/magma_pythia_410M.yml \
    --output_dir checkpoints/robin_hf
```
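When several config files are passed, GPT-NeoX-style tooling typically merges them key by key, with later files overriding earlier ones. A minimal sketch of that merge semantics (the function name and the example keys are illustrative, not taken from the repo):

```python
# Sketch of multi-config merging: later configs override earlier
# ones key by key, so a model config can refine a run config.
def merge_configs(*configs):
    merged = {}
    for cfg in configs:
        merged.update(cfg)  # later dicts win on key collisions
    return merged

run_cfg = {"lr": 8e-5, "train_iters": 2208}          # e.g. big_run_grid_8e-5_2208.yml
model_cfg = {"hidden_size": 1024, "num_layers": 24}  # e.g. magma_pythia_410M.yml
print(merge_configs(run_cfg, model_cfg))
```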

Test

Run `python test_hf_robin.py` for a sample inference. Note that `--config_file` accepts a list of config file names, as in the usage above.
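The "list of config file names" behavior can be expressed with `argparse`'s `nargs="+"`, which collects one or more values into a list. This is an illustrative sketch of how such a flag is typically declared, not the repo's actual parser:

```python
import argparse

# --config_file with nargs="+" collects one or more paths into a list,
# matching the two-file invocation shown in the Usage section.
parser = argparse.ArgumentParser()
parser.add_argument("--config_file", nargs="+", required=True)

# Simulate the command line from the Usage example (paths shortened):
args = parser.parse_args(
    ["--config_file", "big_run_grid_8e-5_2208.yml", "magma_pythia_410M.yml"]
)
print(args.config_file)  # a Python list of both file names
```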