CERC-AAI / multimodal

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Apache License 2.0

Fixed AnnealingLR Class and Cosine Decay Schedule (#1008) #42

Status: Closed (closed by kshitijkg 1 year ago)

kshitijkg commented 1 year ago