- Revisiting BFloat16 Training (2021, SambaNova)
- Doubling Neural Network Finetuning Efficiency with 16-bit Precision Techniques
- Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments
- Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training (https://arxiv.org/abs/1909.06842)
Llama 2, 3 runs on wandb: https://wandb.ai/debiasing/llama_recipes?nw=nwuserbig_whale
Precision settings for the three runs:

- mixed: `use_fp16 = False`, `mixed_precision = True`
- fp16: `use_fp16 = True`, `mixed_precision = False`
- fp32: `use_fp16 = False`, `mixed_precision = False`
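For context, a minimal PyTorch sketch of how I understand these three settings map onto an autocast/GradScaler training step. This is only an illustration under that assumption, not the llama-recipes implementation; `training_step` and its arguments are hypothetical names.

```python
import torch

def training_step(model, batch, optimizer, mode="mixed"):
    """One optimizer step under one of the three precision settings above.

    mode="mixed": autocast to bfloat16 (no loss scaling needed)
    mode="fp16":  autocast to float16 with GradScaler loss scaling
    mode="fp32":  autocast disabled, plain full-precision step
    """
    scaler = torch.cuda.amp.GradScaler(enabled=(mode == "fp16"))
    autocast_dtype = torch.bfloat16 if mode == "mixed" else torch.float16

    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=autocast_dtype,
                        enabled=(mode != "fp32")):
        loss = model(**batch).loss

    scaler.scale(loss).backward()  # scale() is a no-op when the scaler is disabled
    scaler.step(optimizer)         # unscales grads (fp16 case), then optimizer.step()
    scaler.update()
    return loss.detach()
```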
Reference code