-
Is there a plan to support PEFT methods such as LoRA training in maxtext, to enable fine-tuning / continued pretraining of larger models, so that bigger models like LLaMA-3-70B can be trained even with small…
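For reference, a minimal sketch of what LoRA-style PEFT fine-tuning looks like with the Hugging Face PEFT library in PyTorch is shown below; this is not maxtext's API, and the checkpoint name, rank, and target modules are illustrative assumptions.

```python
# Illustrative only: LoRA fine-tuning with Hugging Face PEFT (not maxtext's API).
# The checkpoint name, rank, and target modules are assumptions for the sketch.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B",          # assumed checkpoint; any causal LM works
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                    # low-rank dimension
    lora_alpha=32,                           # scaling factor
    target_modules=["q_proj", "v_proj"],     # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()           # only the LoRA adapters are trainable
```

Because only the small adapter matrices receive gradients, optimizer state and gradient memory shrink dramatically, which is what makes 70B-scale fine-tuning feasible on modest hardware.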
-
### Problem Description
On the Llama3 70B proxy model, training stalls and the GPU core dumps. The core dumps are 41 GB per GPU, so I am unable to send them. It is probably easier for you to reproduce this er…
-
How can I use the multi-card distributed training code?
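The question does not name the framework; assuming PyTorch, a minimal sketch of multi-GPU training with DistributedDataParallel, launched via `torchrun`, might look like this (the model and data are placeholders):

```python
# Minimal multi-GPU training sketch with PyTorch DDP (assumed framework).
# Launch with: torchrun --nproc_per_node=4 train_ddp.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")            # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    model = torch.nn.Linear(128, 10).to(device)        # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):                            # placeholder training loop
        x = torch.randn(32, 128, device=device)
        y = torch.randint(0, 10, (32,), device=device)
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()                                 # gradients are all-reduced across GPUs
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```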
-
## Proposal Summary
Add a progress bar to the downloading of artifacts/models/large files from all remote sources (S3/Azure buckets/GCP buckets/network drives), and consider doing the same for uploading (…
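As a rough illustration of the proposal (not the project's actual transfer code), a streamed HTTP download can report progress with `tqdm`; the URL handling and chunk size below are placeholder choices:

```python
# Generic illustration of a download progress bar (not the project's actual code).
import requests
from tqdm import tqdm

def download_with_progress(url: str, dest_path: str, chunk_size: int = 1 << 20) -> None:
    """Stream `url` to `dest_path`, showing bytes downloaded vs. total size."""
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        total = int(resp.headers.get("content-length", 0))  # 0 if the server omits it
        with open(dest_path, "wb") as fh, tqdm(
            total=total or None, unit="B", unit_scale=True, desc=dest_path
        ) as bar:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
                bar.update(len(chunk))
```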
-
Collecting links to some issues on foundation models, pre-trained models, and related interface discussions.
### Specific foundation models and marketplaces
* Hugging Face pretrained models http…
-
### Describe the bug
The SDXL model was fine-tuned using the rsLoRA method, and training proceeded normally.
After training, the LoRA model was saved and then reloaded …
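The report is truncated, but for reference, saving and reloading a rank-stabilized LoRA (rsLoRA) adapter with the PEFT library generally follows the pattern below; this is a generic PEFT sketch rather than the exact SDXL/diffusers training script, and the toy module and paths are placeholders.

```python
# Generic sketch of saving and reloading an rsLoRA adapter with PEFT.
# Not the exact SDXL/diffusers workflow; the toy module and paths are placeholders.
import torch
from peft import LoraConfig, PeftModel, get_peft_model

class Toy(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(64, 64)

    def forward(self, x):
        return self.proj(x)

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["proj"],
    use_rslora=True,        # rsLoRA scales updates by alpha / sqrt(r) instead of alpha / r
)

model = get_peft_model(Toy(), config)
# ... training loop would go here ...
model.save_pretrained("rslora-adapter")                        # writes adapter weights + config

reloaded = PeftModel.from_pretrained(Toy(), "rslora-adapter")  # base model + saved adapter
```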
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Feature Description
The work I propose involves:
Implementing key optimizers such as Stochastic Gradient De…
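As a sketch of what the first item could look like, a minimal from-scratch SGD update with optional momentum is shown below; the parameter/gradient representation is an assumption, since the issue does not specify the library's internals.

```python
# Minimal from-scratch SGD with optional momentum (illustrative; the actual
# parameter/gradient representation of the target library is not specified).
import numpy as np

class SGD:
    def __init__(self, params, lr=0.01, momentum=0.0):
        self.params = params          # list of numpy arrays, updated in place
        self.lr = lr
        self.momentum = momentum
        self.velocity = [np.zeros_like(p) for p in params]

    def step(self, grads):
        """Apply one update: v <- momentum * v - lr * g; p <- p + v."""
        for p, g, v in zip(self.params, grads, self.velocity):
            v *= self.momentum
            v -= self.lr * g
            p += v
```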
-
To train an mm_grounding_dino, we need to load two pre-trained models: BERT and Swin.
To fine-tune an mm_grounding_dino using my dataset, I need to load a pre-trained MM_Grounding_DINO and the con…
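For illustration only, an MMDetection-style fine-tuning config typically pulls in pretrained weights along these lines; the exact field names for MM Grounding DINO should be checked against the configs shipped with the repo, and all paths here are placeholders.

```python
# Rough MMDetection-style config sketch (field names and paths are assumptions;
# check the MM Grounding DINO configs shipped with the repo for the exact keys).
_base_ = ['path/to/mm_grounding_dino_base_config.py']   # placeholder base config

model = dict(
    backbone=dict(
        # Swin backbone initialized from a pretrained checkpoint (placeholder path).
        init_cfg=dict(type='Pretrained', checkpoint='path/to/swin_pretrained.pth'),
    ),
    language_model=dict(
        # BERT text encoder, typically resolved by name from Hugging Face or a local dir.
        name='bert-base-uncased',
    ),
)

# Start fine-tuning from a full pretrained MM Grounding DINO checkpoint (placeholder path).
load_from = 'path/to/mm_grounding_dino_pretrained.pth'
```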
-
Will the code for fine-tuning the models be released?
Thank you for your excellent work.