-
### Discussion
Thanks for your great work on LLaVA Lightning. I noticed that you used LAION/CC/SBU BLIP-Caption Concept-balanced 558K instead of the previously used CC-3M Concept-balanced 595K. Whi…
-
Hi, I have some questions about pre-training as follows:
1. I want to train my own model from scratch and produce the `vocab.txt` character by character. There are some low-frequency words; should low-frequenc…
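For the character-vocabulary question above, a common approach is to keep only characters that appear at least some minimum number of times and map everything rarer to an `[UNK]` token at tokenization time. A minimal sketch, where the threshold and the special-token set are assumptions rather than anything from the original setup:

```python
from collections import Counter

def build_char_vocab(corpus_lines, min_freq=5,
                     specials=("[PAD]", "[UNK]", "[CLS]", "[SEP]")):
    """Count characters across the corpus and keep those seen at least
    `min_freq` times; rarer characters are left out of the vocab and
    should be mapped to [UNK] during tokenization."""
    counts = Counter(ch for line in corpus_lines for ch in line.strip())
    kept = [ch for ch, n in counts.most_common() if n >= min_freq]
    return list(specials) + kept

# Example: 'w' occurs only once, so with min_freq=2 it is dropped.
vocab = build_char_vocab(["hello world", "hello there"], min_freq=2)
```

Writing `"\n".join(vocab)` to `vocab.txt` then gives a BERT-style one-token-per-line file.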
-
"Maximum input token count 4919 exceeds limit of 4096 for train data" in model-customization-job/amazon.titan-text-lite-v1:0:4k/nhjsh25oes0i in notebook 03_Model_customization/03_continued_pretraining…
-
## ❓ Questions and Help
I am trying to pretrain wav2vec2 on the Persian language using the Common Voice dataset. I did not modify anything but the dataset path in the configs. Here are plots of the training metrics…
-
When pretraining GPT with Triton Flash Attention, the loss blows up (from ~2 to 7) halfway into training and doesn't go down anymore. If I resume from a healthy ckpt without Flash Attention, the loss is…
-
Hello, thank you for your valuable work! I find that there isn't a `paper_train.csv` in the `data_csv.zip`. Is the paper path in this csv file the same as the PMC-Inline text json file from your hugg…
-
Hello!
Thank you so much for developing and releasing this model to the public. As a native Arabic speaker, I highly appreciate your efforts in enriching our beautiful language.
I have the follo…
-
How should we process our parallel data with the provided BPE codes? I ran the fastBPE tools and ran into some problems: the provided BPE codes have two columns, but fastBPE needs three columns. Could you give some advice…
-
Hi, thank you for your work.
I have been trying to fine-tune YOLO-World on my closed-set custom datasets: two totally different datasets, both with class_num > 25 and a scale of more than 10k. …
-
A relatively simple question that I couldn't quite clarify by looking through the tech report...
During your pretraining (report section 3.1) or instruction tuning phases (report section 3.2), any…