-
### Discussion
Thanks for your great work on LLaVA Lightning. I noticed that you used LAION/CC/SBU BLIP-Caption Concept-balanced 558K instead of the previously used CC-3M Concept-balanced 595K. Whi…
-
Hi, I have some questions about pre-training as follows:
1. I want to train my own model from scratch and produce the `vocab.txt` character by character. There are some low-frequency words; should low-frequenc…
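For the character-vocabulary question above, a common approach is to keep only characters that appear at least some minimum number of times and map everything rarer to an `[UNK]` token at tokenization time. A minimal sketch, where the threshold and the special-token set are assumptions rather than anything from the original setup:

```python
from collections import Counter

def build_char_vocab(corpus_lines, min_freq=5,
                     specials=("[PAD]", "[UNK]", "[CLS]", "[SEP]")):
    """Count characters across the corpus and keep those seen at least
    `min_freq` times; rarer characters are left out of the vocab and
    should be mapped to [UNK] during tokenization."""
    counts = Counter(ch for line in corpus_lines for ch in line.strip())
    kept = [ch for ch, n in counts.most_common() if n >= min_freq]
    return list(specials) + kept

# Example: 'w' occurs only once, so with min_freq=2 it is dropped.
vocab = build_char_vocab(["hello world", "hello there"], min_freq=2)
```

Writing `"\n".join(vocab)` to `vocab.txt` then gives a BERT-style one-token-per-line file.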
-
"Maximum input token count 4919 exceeds limit of 4096 for train data" in model-customization-job/amazon.titan-text-lite-v1:0:4k/nhjsh25oes0i in notebook 03_Model_customization/03_continued_pretraining…
-
## ❓ Questions and Help
I am trying to pretrain wav2vec2 on the Persian language using the Common Voice dataset. I did not modify anything but the dataset path in the configs. Here are plots of the training metrics…
-
When pretraining GPT with Triton Flash Attention, the loss blows up (from ~2 to 7) halfway into training and doesn't go down anymore. If I resume from a healthy ckpt without Flash Attention, the loss is…
-
Hello, thank you for your valuable work! I find that there isn't a `paper_train.csv` in the `data_csv.zip`. Is the paper path in this csv file the same as the PMC-Inline text json file from your hugg…
-
Hello!
Thank you so much for developing and releasing this model to the public. As a native Arabic speaker, I highly appreciate your efforts in enriching our beautiful language.
I have the follo…
-
How should we process our parallel data with the provided BPE codes? I ran the fastBPE tools and ran into some problems: the provided BPE codes have two columns, but fastBPE needs three columns. Could you give some advice…
-
Hi, thank you for your work.
I have been trying to fine-tune YOLO-World on my closed-set custom datasets: two totally different datasets, both with class_num > 25 and a scale of more than 10k. …
-
A relatively simple question that I couldn't quite clarify by looking through the tech report...
During your pretraining (report section 3.1) or instruction tuning phases (report section 3.2), any…