-
I tried to reproduce the results for VinVL+VIVO+SCST on NoCaps, but my results were off by a noticeable margin.
### Reported Results on NoCaps validation set
`"CIDEr": {"in-domain": 103.7, "near-domain…
-
Hi Yuan,
Thanks again for this great work; I have been using both this and the original AST model for some downstream tasks. I am currently looking into some other time-series data, and was wonder…
-
Hi authors!
Thank you for making the paper and code open source. It is very helpful.
I am trying to pretrain the GDT model on the kinetics400 dataset, but each epoch takes more than one day. I run …
-
Can I use training_multi-task to pretrain a model with the MLM task and a contrastive loss at the same time? My data consists entirely of sentence pairs.
Looking forward to your reply!
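For questions like the one above, the usual recipe is to compute each objective on its own head and sum the (optionally weighted) losses per batch. A minimal numpy sketch of the contrastive part (in-batch InfoNCE over sentence-pair embeddings) and the weighted combination is below; the function names and weights are hypothetical, not the repo's actual API, and in practice both losses would come from the same encoder's outputs:

```python
import numpy as np

def info_nce_loss(emb_a, emb_b, temperature=0.05):
    """In-batch InfoNCE: row i of emb_a should match row i of emb_b,
    with every other row in the batch serving as a negative."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    n = len(a)
    # positives sit on the diagonal: pair i matches pair i
    return -log_probs[np.arange(n), np.arange(n)].mean()

def multi_task_loss(mlm_loss, contrastive_loss, w_mlm=1.0, w_ctr=1.0):
    """Weighted sum of the two per-batch objectives (weights are a
    tuning choice, not values prescribed by the repo)."""
    return w_mlm * mlm_loss + w_ctr * contrastive_loss
```

With matched pairs the contrastive loss is near zero; shuffling one side of the batch makes it large, which is a quick sanity check when wiring this into a training loop.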
-
Hi Stefan,
When I use the Turkish model on an English dataset for classification, it works surprisingly well. So, I have two questions:
1) Does the training corpus contain English texts?
2) I…
-
Could someone explain to me what exactly this class does?
Is it possible to get the classification output without pretraining?
(It takes too long on a Colab GPU. I need something I can run there.)
-
Mistral-7b is a much better model (and perhaps a better teacher) than Llama-2-7b. Would you kindly release checkpoints for a distilled Mistral? It would be greatly appreciated!
ojus1 updated
2 months ago
-
Hi @henryzhongsc, thanks for your work on this repo.
I was wondering whether it would be possible to train a model other than a ResNet20 with the current state of the code.
I am trying to p…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
I saw that llama supports continual pretraining. Could chatglm support this kind of continued pretraining as well?
### Expected Behavior
_No respons…
zoepo updated
11 months ago
-