-
Hi authors, thank you for this impressive work.
Is it possible to provide a pretraining script and a small sample of the processed data used for pretraining? I would like to try pretraining a model…
-
Hi,
I am looking for ImageNet-pretrained weights for the YOLOX backbone. I am specifically interested in the largest model, YOLOX-x. In a couple of other issues I've seen that the nano version can be train…
-
The paper claims that you trained for 5 epochs on 32 A100 GPUs. Do you have an estimate of how long that took? Also, were they 40 GB or 80 GB A100s?
-
Hello!
I am trying to perform continued pretraining on the mbart.cc.25 pretrained checkpoint using the multilingual denoising objective. However, I am not sure how to prepare and pre-process the da…
-
### 🚀 The feature, motivation and pitch
Often used in pretraining of LMs for stabilization, e.g. in the recent [Chameleon](https://arxiv.org/abs/2405.09818) & [PaLM](https://www.jmlr.org/papers/v24/22-1…
-
I tried to fine-tune a model on a custom dataset after continued pretraining on another custom dataset, but the following error occurs:
Is there any way to streamline these two pro…
-
I think there could be value in creating a separate dataset for pretraining. It would cover the same chemical space as the standard SPICE dataset, but have many more conformations and be computed at …
-
![image](https://github.com/user-attachments/assets/fc953993-f166-4d3c-a7c8-00369eb8da92)
![image](https://github.com/user-attachments/assets/2cd34450-b1b6-40bc-835b-a559bbd00ce5)
\[
\mathbf{h}^…
-
Hi, thanks for your work! Do you plan to release the pretraining code, as well as the training dataset?
-
Hello,
Thank you for providing these valuable recipes. I appreciate your work.
I'm interested in **further pre-training the Llama3.1-8B-base model rather than using the instruct version**. To ensure…