-
Hi, I would like to pre-train the model myself to gain a better understanding of machine learning models.
Specifically, **could you provide the code that was used to pre-train the v2 500m multi-sp…
-
Hello, I am very curious about how long the pretraining will take. I ran the finetuning on two 4090s for 10w (100k) epochs, which took almost three days. What type of GPU do you use?
The 10w epoch finetu…
-
Hi, thank you for the amazing work! I'm interested in reproducing the pretraining process of SmolMLv1 (135M) on the SmolLM-Corpus. However, I noticed that the repository currently only includes fine-t…
-
Hi,
Thanks for the great work. It seems like the pretraining takes a long time. I would like to run the pretraining, but I cannot submit a job that runs for that long. I was wondering if it is possible…
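One common way around cluster job time limits (a generic sketch, not this repository's mechanism — the file name and step counts below are hypothetical) is to checkpoint progress periodically and resume on the next submission:

```python
import json
import os

STATE_FILE = "train_state.json"  # hypothetical progress file

def run_chunk(max_steps_per_job=1000, total_steps=5000):
    """Run at most max_steps_per_job training steps, resuming from the
    last saved step so pretraining can span many short job submissions."""
    step = 0
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            step = json.load(f)["step"]
    end = min(step + max_steps_per_job, total_steps)
    for step in range(step, end):
        pass  # one optimizer step (forward/backward/update) would go here
    # Persist progress so the next submission picks up where this one stopped.
    with open(STATE_FILE, "w") as f:
        json.dump({"step": end}, f)
    return end
```

In a real training loop you would save model and optimizer state alongside the step counter (e.g. with `torch.save`), but the resume pattern is the same.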
-
Hi!
I was trying to finetune the model on my dataset but I couldn't understand how I should structure my dataset.
I've performed all tasks mentioned in [data preparation](https://github.com/mbz…
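For reference, many finetuning pipelines expect a JSONL file with one example per line; the field names below ("prompt", "response") are hypothetical and vary by repository, so check the linked data-preparation docs for the exact schema:

```python
import json

# Hypothetical instruction-tuning examples; the required fields depend
# on the repository's data-preparation format.
examples = [
    {"prompt": "Translate to French: Hello", "response": "Bonjour"},
    {"prompt": "2 + 2 =", "response": "4"},
]

# Write one JSON object per line (the JSONL convention).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Reading it back, line by line:
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
```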
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
Hi, it's a wonderful repository. I have a doubt; I'm new to this. How did you pretrain the llama2 base model? Malayalam is not in the base model's training data, right? It's only trained on English to…
-
Hi,
Is it possible to also upload the training scripts and resulting network weights for the multimodal configuration? (Training on both Optical and Radar data with RandomSensorDrop)
-
Is it possible to share pretraining code?
-
I have been trying to get this repo working for several months, but my loss keeps exploding between 30k and 100k iterations.
I have tried many things:
Turning flash attention off (based on this i…
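Besides toggling flash attention, a common first remedy for exploding loss is global gradient-norm clipping. A minimal pure-Python sketch of the idea (real training code would use the framework's built-in clipping, e.g. `torch.nn.utils.clip_grad_norm_`):

```python
import math

def clip_grad_norm(grads, max_norm=1.0):
    """Global-norm clipping: if the combined L2 norm of all gradients
    exceeds max_norm, scale every gradient down by the same factor."""
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads
```

Clipping caps the size of any single update, which often stops the runaway feedback loop behind a loss explosion; lowering the learning rate or extending warmup are the usual next steps.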