CERC-AAI / multimodal

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Apache License 2.0

LORA V0 #32

Closed by kshitijkg 8 months ago

kshitijkg commented 1 year ago

Implement and test LoRA (low-rank adaptation).

Deliverable: loss curves with and without LoRA for the 410M Pythia model.
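For reference, a minimal, framework-free sketch of the LoRA idea: the pretrained weight `W` stays frozen and a trainable low-rank update `(alpha / r) * B @ A` is added to its output. This is not the repository's implementation; the class name `LoRALinear`, the helper `matvec`, and the `rank`, `alpha`, and `seed` parameters are assumptions made for illustration. `B` starts at zero, so the adapted layer initially reproduces the base layer exactly, which is a useful sanity check before comparing loss curves.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical illustration: y = W x + (alpha / r) * B (A x)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weight, shape (d_out, d_in)
        self.scale = alpha / rank
        # A: small random init, shape (rank, d_in)
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B: zero init, shape (d_out, rank), so the update starts at zero
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_trainable(self):
        """Count only the adapter parameters: r * (d_in + d_out)."""
        rank = len(self.A)
        return rank * (len(self.A[0]) + len(self.B))


weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
layer = LoRALinear(weight, rank=1)
print(layer.forward([1.0, 1.0, 1.0]))  # matches the frozen layer at init
print(layer.num_trainable())
```

With `rank=1` on this 2x3 weight, the adapter trains 5 parameters instead of 6; the savings grow quickly for real layer sizes, where `r * (d_in + d_out)` is much smaller than `d_in * d_out`.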