jina-ai / jerboa

LLM finetuning
Apache License 2.0

feat: 4bit implementation for training #25

Closed sebastian-weisshaar closed 1 year ago

sebastian-weisshaar commented 1 year ago

Implements the 4-bit quantization aspect of the QLoRA paper for training.
Updates the Poetry dependencies.
DOES NOT IMPLEMENT 4-bit for generation.
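For readers unfamiliar with the setup, here is a minimal sketch of how 4-bit loading for training is commonly wired up with `transformers`, `peft`, and `bitsandbytes`. It is not necessarily how this PR does it: the function name `load_4bit_model_for_training`, the NF4 and double-quantization settings, and the LoRA hyperparameters are all assumptions chosen for illustration.

```python
from typing import List, Optional


def load_4bit_model_for_training(
    model_name: str, target_modules: Optional[List[str]] = None
):
    """Load a causal LM with 4-bit weights and attach LoRA adapters.

    Hypothetical helper, not taken from this PR. Running it needs torch,
    transformers, peft, bitsandbytes, and a CUDA GPU.
    """
    # Heavy imports stay inside the function so that defining it
    # does not require the GPU stack to be installed.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit quantization settings (assumed values in the style of QLoRA).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Prepare the quantized base model for training.
    model = prepare_model_for_kbit_training(model)

    # Only the small LoRA adapter weights are trained (assumed hyperparameters).
    # With target_modules=None, peft falls back to its per-architecture defaults.
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
        target_modules=target_modules,
    )
    return get_peft_model(model, lora_config)
```

The base weights stay frozen in 4-bit and only the adapters receive gradients. Generation with 4-bit weights is a separate code path and is not covered here.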

sebastian-weisshaar commented 1 year ago
[Image: Screenshot 2023-05-31 at 11 55 55]

Difference in GPU memory usage between 4-bit and 8-bit. Note that the 4-bit implementation also finishes quicker than the 8-bit one.
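As a rough sanity check on the memory difference, the weight storage alone can be estimated from the parameter count. This is a back-of-the-envelope sketch: the 7-billion parameter count is a hypothetical example, and the estimate ignores activations, LoRA adapters, optimizer state, and quantization constants, so measured GPU usage will differ.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to store the model weights, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Hypothetical parameter count, for illustration only.
num_params = 7_000_000_000

mem_8bit = weight_memory_gib(num_params, 8)
mem_4bit = weight_memory_gib(num_params, 4)

print(f"8-bit weights: {mem_8bit:.2f} GiB")
print(f"4-bit weights: {mem_4bit:.2f} GiB")
```

Halving the bits per weight halves the storage for the quantized weights, but the total GPU footprint shrinks by less than half because the other contributions are unchanged.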

4bit WandB: https://wandb.ai/jina-ai/jerboa/runs/92zjw2hv?workspace=user-jinaai
8bit WandB: https://wandb.ai/jina-ai/jerboa/runs/0dp6yn49?workspace=user-jinaai