-
Hi, I would like to pre-train the model myself to gain a better understanding of machine learning models.
Specifically, **could you provide the code that was used to pre-train the v2 500m multi-sp…
-
Hello, I am very curious about how long the pretraining will take. I ran the finetuning on two 4090s for 10w (100k) epochs, which took almost three days. What type of GPU do you use?
The 10w epoch finetu…
-
Hi, thank you for the amazing work! I'm interested in reproducing the pretraining process of SmolMLv1 (135M) on the SmolLM-Corpus. However, I noticed that the repository currently only includes fine-t…
-
Hi,
Thanks for the great work. It seems like the pretraining takes a long time. I would like to run the pretraining, but I cannot submit a job that runs for that long. I was wondering if it is possible…
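One common way around cluster job time limits (a generic sketch, not this repository's mechanism — the file name and step counts below are hypothetical) is to checkpoint progress periodically and resume on the next submission:

```python
import json
import os

STATE_FILE = "train_state.json"  # hypothetical progress file

def run_chunk(max_steps_per_job=1000, total_steps=5000):
    """Run at most max_steps_per_job training steps, resuming from the
    last saved step so pretraining can span many short job submissions."""
    step = 0
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            step = json.load(f)["step"]
    end = min(step + max_steps_per_job, total_steps)
    for step in range(step, end):
        pass  # one optimizer step (forward/backward/update) would go here
    # Persist progress so the next submission picks up where this one stopped.
    with open(STATE_FILE, "w") as f:
        json.dump({"step": end}, f)
    return end
```

In a real training loop you would save model and optimizer state alongside the step counter (e.g. with `torch.save`), but the resume pattern is the same.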
-
Hi!
I was trying to finetune the model on my dataset but I couldn't understand how I should structure my dataset.
I've performed all tasks mentioned in [data preparation](https://github.com/mbz…
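For reference, many finetuning pipelines expect a JSONL file with one example per line; the field names below ("prompt", "response") are hypothetical and vary by repository, so check the linked data-preparation docs for the exact schema:

```python
import json

# Hypothetical instruction-tuning examples; the required fields depend
# on the repository's data-preparation format.
examples = [
    {"prompt": "Translate to French: Hello", "response": "Bonjour"},
    {"prompt": "2 + 2 =", "response": "4"},
]

# Write one JSON object per line (the JSONL convention).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Reading it back, line by line:
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
```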
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
Hi, it's a wonderful repository. I have a doubt; I'm new to this. How did you pretrain the llama2 base model? Malayalam is not in the base model's training data, right? It's only trained on English to…
-
Hi,
Is it possible to also upload the training scripts and resulting network weights for the multimodal configuration? (Training on both Optical and Radar data with RandomSensorDrop)
-
Is it possible to share pretraining code?
-
I have been trying to get this repo working for several months, but my loss keeps exploding between 30k and 100k iterations.
I have tried many things:
Turning flash attention off (based on this i…
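Besides toggling flash attention, a common first remedy for exploding loss is global gradient-norm clipping. A minimal pure-Python sketch of the idea (real training code would use the framework's built-in clipping, e.g. `torch.nn.utils.clip_grad_norm_`):

```python
import math

def clip_grad_norm(grads, max_norm=1.0):
    """Global-norm clipping: if the combined L2 norm of all gradients
    exceeds max_norm, scale every gradient down by the same factor."""
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads
```

Clipping caps the size of any single update, which often stops the runaway feedback loop behind a loss explosion; lowering the learning rate or extending warmup are the usual next steps.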