-
To enable efficient training on GPUs and scale our repository to models with millions to billions of parameters, which is essential for working with large vision-language models, we must implement optimization …
-
### Question
Is there a plan to consider compiling Bootstrap into a native image? A native binary starts faster and uses fewer resources when scaling out to handle bursts of traffic.
1. Many of the frameworks it depends on do not support native compilation.
2. ShenYu itself has features, such as uploading and compiling jars, that a native image cannot support.
-
### Describe the feature
I want to continue pre-training Llama 2 70B using my own data, which is about 1B tokens. I have read [Fine-tuning Llama 2 70B using PyTorch FSDP](https://huggingface.co/bl…
-
## Current Implementation
Our `ResolveOperation` class uses a Union-Find (aka Disjoint Set Union) algorithm for grouping similar items efficiently. Here's how it works:
We've got two main data str…
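The description breaks off here, but for reference, a minimal Union-Find of the usual shape, two parallel arrays (parent pointers and per-root ranks) plus `find`/`union` operations, could look like the sketch below. The class and method names are illustrative and not taken from the actual `ResolveOperation` implementation.

```python
class UnionFind:
    """Minimal Disjoint Set Union sketch with path halving and union by rank.
    Illustrative only; not the actual ResolveOperation implementation."""

    def __init__(self, n: int):
        # The two core structures: parent pointers and per-root rank bounds.
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        # Path halving: point every other visited node at its grandparent,
        # flattening the tree as a side effect of lookups.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> bool:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same group
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


# Hypothetical usage: link similar items, then query their groups.
uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
assert uf.find(0) == uf.find(1)
assert uf.find(0) != uf.find(3)
```

With both path halving and union by rank, each operation runs in near-constant amortized time, which is why the structure suits grouping large numbers of similar items.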
-
### Description & Motivation
The MeZO paper proposes a memory-efficient zeroth-order optimizer (MeZO), adapting the classical zeroth-order SGD method to operate in place, thereby fine-tuning language models (L…
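The excerpt cuts off, but the core trick MeZO describes, regenerating the random perturbation from a saved RNG seed so the gradient estimate and update run in place without storing a second copy of the parameters, can be sketched roughly as follows. This follows the paper's published algorithm at a high level; `mezo_step`, `loss_fn`, and the hyperparameter values are illustrative assumptions, not MeZO's actual API.

```python
import torch

def mezo_step(model, loss_fn, batch, eps=1e-3, lr=1e-6):
    """One in-place zeroth-order SGD step in the style of MeZO (sketch).
    loss_fn(model, batch) is assumed to return a scalar loss tensor."""
    # Save a seed so the same perturbation z can be regenerated on demand
    # instead of being stored alongside the parameters.
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    def perturb(scale):
        # Regenerate the same z from the seed; apply theta += scale * eps * z.
        gen = torch.Generator(device="cpu").manual_seed(seed)
        for p in model.parameters():
            z = torch.randn(p.shape, generator=gen).to(p.device)
            p.data.add_(scale * eps * z)

    with torch.no_grad():
        perturb(+1)                        # theta + eps*z
        loss_plus = loss_fn(model, batch)
        perturb(-2)                        # theta - eps*z
        loss_minus = loss_fn(model, batch)
        perturb(+1)                        # restore original theta

        # Projected gradient estimate, then the update theta -= lr * g * z,
        # regenerating z once more from the same seed.
        g = (loss_plus - loss_minus) / (2 * eps)
        gen = torch.Generator(device="cpu").manual_seed(seed)
        for p in model.parameters():
            z = torch.randn(p.shape, generator=gen).to(p.device)
            p.data.add_(-lr * g * z)
    return loss_plus
```

Because only two forward passes and no backward pass are needed, peak memory stays close to inference-time usage, which is the property that makes the method attractive for large models.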
-
### 🚀 The feature, motivation and pitch
Fuyou Training Framework Integration for PyTorch
Description:
Integrate the Fuyou training framework into PyTorch to enable efficient fine-tuning of larg…
-
### Question
I finally managed to fine-tune LLaVA on a custom dataset (LLaVA-1.5-7b on Google Colab using a single A100 GPU)
The output I got was mostly an `adapter_model.safetensors` file (610 MB) -- …
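The question is truncated here, but a common follow-up with an adapter-only checkpoint like `adapter_model.safetensors` is merging it back into the base weights. Below is a rough sketch using the `peft` library's `PeftModel.from_pretrained` and `merge_and_unload`; the model ID and paths are placeholders, and LLaVA checkpoints trained with the original LLaVA repo may need that repo's own loading utilities instead.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder identifiers -- substitute your actual base model and adapter dir.
base = AutoModelForCausalLM.from_pretrained("base-model-id")

# Wraps the base model and loads adapter_model.safetensors from the directory.
model = PeftModel.from_pretrained(base, "path/to/adapter_dir")

# Fold the LoRA deltas into the base weights and drop the adapter wrapper,
# yielding a plain model that can be saved and loaded without peft.
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```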
-
[PARAMETERS.txt](https://github.com/user-attachments/files/16852584/PARAMETERS.txt)
-
> Today we’re releasing the next step: QDoRA. This is just as memory efficient and scalable as FSDP/QLoRA, and critically is also as accurate for continued pre-training as full weight training. We thi…
-