-
Hello, would it be possible to also release the pretraining dataset (used for TSmixup), and perhaps also mention a successful training recipe?
I would like to try to pretrain from scratch as well, …
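In case it is useful while the dataset is unreleased, here is a rough sketch of what a TSMixup-style augmentation could look like; the function name, the Dirichlet weighting, and all parameter values below are my own assumptions, not the authors' recipe:
```
# Hypothetical TSMixup-style augmentation (my own sketch, not the released recipe):
# draw a few series from a pool, mean-scale them, and take a convex combination.
import numpy as np

def tsmixup(series_pool, max_series=3, alpha=1.5, length=512, rng=None):
    """Mix up to `max_series` randomly chosen series into one synthetic series."""
    rng = rng or np.random.default_rng()
    k = int(rng.integers(1, max_series + 1))
    idx = rng.choice(len(series_pool), size=k, replace=False)
    weights = rng.dirichlet(alpha * np.ones(k))
    mixed = np.zeros(length)
    for w, i in zip(weights, idx):
        s = np.asarray(series_pool[i], dtype=float)[:length]
        s = (s - s.mean()) / (s.std() + 1e-8)  # scale so series are comparable
        mixed[: len(s)] += w * s
    return mixed
```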
-
Thank you for sharing a good paper.
Looking at the experimental results in Fig. 6 of the paper, there is a vision-only test result, and it is said that the performance would be improved…
-
In the Masked Pretraining section, there seems to be an issue with the way the CLIP model is loaded. In the `extract.ipynb` notebook, the code `model, _ = clip.load("ViT-B/16", device='cpu')` is used, but…
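For context, here is the standard usage of the OpenAI `clip` package's `clip.load`, which returns both the model and its matching preprocessing transform (the image path below is a placeholder, not something from the notebook):
```
import torch
import clip
from PIL import Image

device = "cpu"
# clip.load returns the model and the image preprocessing transform that matches it
model, preprocess = clip.load("ViT-B/16", device=device)
model.eval()

# Encode a single image (placeholder path)
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    image_features = model.encode_image(image)
```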
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
**Packages:**
llamafactory 0.8.1.dev0
Transformers 4.41.2
Pytorch 2.2.0+cu121
Datasets 2.19.…
-
In Tables 3 & 4, is the same dataset used during pre-training and fine-tuning? Or does the fine-tuning only happen on the ImageNet-1k dataset?
-
Hi,
For the paper https://arxiv.org/pdf/2310.01218.pdf, the following is mentioned in the pretraining section:
```
For efficiency, we first train SEED-LLaMA using LoRA [32] tuning and together o…
```
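For anyone trying to reproduce this stage, a rough illustration of LoRA tuning with the `peft` library might look like the following; the base checkpoint and the LoRA hyperparameters are placeholders I picked, not the values used for SEED-LLaMA:
```
# Rough illustration of LoRA tuning with peft (not SEED-LLaMA's actual training code).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update
    lora_alpha=32,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # which projections receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```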
-
I attached my training loss below. The data we are using follows LLM360's paper, but we use less StarCoder data.
For each training epoch our data contains arXiv 30B, Book 57B, C4 197.67B, Refined-Web 6…
-
**Replace:**
Pretraining data consists of thousands, or even millions, of individual documents, often web-scraped. Model knowledge and behavior will likely reflect a compression of this information…
-
I was following [DATA.md](https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/DATA.md) to download the pretraining dataset.
However, I cannot find `webvid_10m_train.json`, `cc12m_train.json…
-
The pretraining example with
```
litgpt pretrain \
--model_name pythia-14m \
--config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml
```
is do…