CERC-AAI / multimodal

An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Apache License 2.0
8 stars, 2 forks

Pythia checkpoint loading #3

Open kshitijkg opened 1 year ago

kshitijkg commented 1 year ago

https://github.com/floatingsnake/gpt-neox/issues/4