-
I'm curious whether there's any plan to support pretraining models from scratch?
-
When reproducing the experiment of pretraining on mag240m and evaluating on arxiv, we found that the Contrastive baseline achieves performance similar to Prodigy when the aux loss is applied (u…
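For concreteness, a minimal sketch of what a contrastive objective plus a weighted auxiliary loss could look like; the InfoNCE form, `lambda_aux`, `tau`, and all argument names are illustrative assumptions, not the repo's actual API:

```python
import torch
import torch.nn.functional as F

def combined_loss(query_emb, pos_emb, aux_logits, aux_labels,
                  lambda_aux=0.5, tau=0.07):
    """In-batch InfoNCE contrastive loss plus a weighted auxiliary loss.

    lambda_aux and tau are placeholder values, not taken from the paper.
    """
    # Cosine-similarity logits between queries and in-batch positives/negatives.
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_emb, dim=-1)
    logits = q @ p.t() / tau
    targets = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    contrastive = F.cross_entropy(logits, targets)

    # Auxiliary supervised objective (e.g., label classification on arxiv).
    aux = F.cross_entropy(aux_logits, aux_labels)
    return contrastive + lambda_aux * aux
```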
-
Hi, I am an undergraduate student studying this repository. I have several questions.
It is noted that stage 1 needs 8 GPUs and stage 2 needs 4 GPUs.
But it seems that stage 2 has more extended ar…
-
Thanks for your excellent work. May I ask what the pretraining accuracy on ImageNet-2012 was that is then used for fine-tuning?
-
**Suggestions:**
- make the pretraining on GM games data work
- don't try to achieve too much at once:
  - reduce the training dataset to a minimum (see the sketch after this list)
  - reduce the model to a minimum
  - normalize in…
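Along those lines, a hypothetical "overfit a tiny subset first" sketch; `tiny_debug_run` and the model/data interfaces are placeholders, not project code:

```python
import torch

def tiny_debug_run(model, dataset, steps=200, lr=1e-3, device="cpu"):
    """Train on a handful of samples until the loss collapses; if it never
    does, the model or data pipeline is broken before scale is the question."""
    small = torch.utils.data.Subset(dataset, range(min(32, len(dataset))))
    loader = torch.utils.data.DataLoader(small, batch_size=8, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for step in range(steps):
        for inputs, targets in loader:  # assumes (input, target) pairs
            inputs, targets = inputs.to(device), targets.to(device)
            opt.zero_grad()
            loss = model(inputs, targets)  # assumes the model returns its loss
            loss.backward()
            opt.step()
        if step % 20 == 0:
            print(f"step {step}: loss {loss.item():.4f}")
```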
-
### Describe the issue
Issue:
I first pretrained the projector using the CLIP + Gemma model and then fine-tuned the Gemma and projector, but no matter what, it is giving incorrect outputs, and the loss …
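For reference, a minimal LLaVA-style projector sketch to check the wiring against; the class name and dimensions (1024 for CLIP ViT-L/14 patch features, 2048 for Gemma-2B hidden size) are assumptions, not the code in question:

```python
import torch.nn as nn

class VisionProjector(nn.Module):
    """Two-layer MLP mapping vision-encoder features to the LLM embedding size.

    Dimensions are illustrative; substitute the values from your configs.
    """
    def __init__(self, vision_dim=1024, llm_dim=2048):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features):   # (batch, num_patches, vision_dim)
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)
```

If the loss stays flat during the projector stage, common first checks are that the LLM is frozen at that stage and that the projector's output dtype matches the LLM's embedding dtype.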
-
Hello,
I am pretraining RoBERTa from scratch on 64 × 16 GB GPUs on 330 GB of text (split into 128 partitions), but currently, at epoch 32, pretraining seems to be very slow. Is this behavior no…
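For comparison, a minimal from-scratch MLM pretraining sketch with Hugging Face Transformers; paths, sizes, and hyperparameters below are placeholders. The usual throughput levers at this scale are mixed precision, dataloader workers, and sequence length:

```python
from transformers import (RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")  # or your own vocab
model = RobertaForMaskedLM(RobertaConfig(vocab_size=tokenizer.vocab_size))

dataset = load_dataset("text", data_files={"train": "corpus/*.txt"})["train"]
dataset = dataset.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"], num_proc=8,  # parallel tokenization
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        max_steps=500_000,               # step-based rather than epoch-based
        per_device_train_batch_size=16,
        gradient_accumulation_steps=4,
        fp16=True,                       # mixed precision: large speedup on 16 GB cards
        dataloader_num_workers=4,        # keep the GPUs fed
        logging_steps=100,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```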
-
Thanks for your work. The paper mentions that the attention network is pretrained for 70,000 iterations to reach convergence; could you please tell me how to do that?
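In case it helps, a hypothetical iteration-based (rather than epoch-based) training loop for that kind of schedule; the model and loader interfaces here are assumptions:

```python
import itertools
import torch

def pretrain(model, loader, num_iters=70_000, lr=1e-4, device="cuda"):
    """Run a fixed number of optimizer steps, cycling the dataloader as needed."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    data = itertools.cycle(loader)  # restart the loader when it is exhausted
    for it in range(num_iters):
        batch = next(data)
        opt.zero_grad()
        loss = model(batch.to(device))  # assumes the model returns its own loss
        loss.backward()
        opt.step()
        if it % 1000 == 0:
            print(f"iter {it}: loss {loss.item():.4f}")
```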
-
I am trying to fine-tune the available InceptionResNet-v2 weights, but only the generator weights are available. Is there a way you can provide the full generator and discriminator weights, which can be directl…
-
I am currently trying to further pretrain a RoBERTa model on a custom dataset, initializing the model with `roberta-base` weights. I am using [this script](https://github.com/manueltonneau/academic-bu…
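For reference, a minimal continued-pretraining sketch (not the linked script) that warm-starts from `roberta-base` and keeps the MLM objective; file names and hyperparameters are placeholders:

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")  # warm start

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16, fp16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
```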