-
I have to ask: do you have any figures on how much pretraining takes, and on which GPU and which dataset? Please guide me if you can.
-
The README mentions this codebase can act as a "reference for enthusiasts keen on pretraining language models under 5 billion parameters". I'm wondering if you could give a brief guide on how to do so…
-
This issue intends to compare performances between the model trained from scratch on `dcm-zurich-lesions-*` (#1) vs. a model pretrained on `dcm-zurich` for detecting compression sites and using those …
-
When trying to pretrain t5-base, we are seeing that the pretraining loss starts at an enormous number (~160000).
Even when trying to pretrain smaller variants of T5, the initial pretraining loss alw…
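For comparison, here is the minimal sanity check I ran (a sketch assuming HuggingFace `transformers`, not our actual training script): a randomly initialized t5-base should start near the uniform-distribution cross-entropy, roughly ln(vocab_size) ≈ 10.4, which is why ~160000 looks off to us.

```python
# Sketch: sanity-check the initial loss of a randomly initialized t5-base.
# Assumes the HuggingFace transformers and torch packages; not our actual data pipeline.
import math
import torch
from transformers import T5Config, T5ForConditionalGeneration

config = T5Config.from_pretrained("t5-base")        # architecture/vocab only
model = T5ForConditionalGeneration(config)          # random weights, no checkpoint loaded

input_ids = torch.randint(0, config.vocab_size, (2, 16))  # dummy encoder input
labels = torch.randint(0, config.vocab_size, (2, 8))      # dummy decoder targets

loss = model(input_ids=input_ids, labels=labels).loss     # mean token cross-entropy
print(f"initial loss: {loss.item():.2f}  (ln(vocab_size) = {math.log(config.vocab_size):.2f})")
```

If the reported number is a sum over tokens rather than a mean, or includes some scaling factor, that might explain part of the gap, but we have not confirmed this.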
-
Hello, thanks for your contribution! I find that there isn't a `paper_train.csv` in `data_csv.zip`. Is the paper path in this CSV file the same as the PMC-Inline text JSON file from your hugg…
-
You mentioned that the backbone network is ResNet-50 pretrained on ImageNet.
https://github.com/thuml/Universal-Domain-Adaptation/blob/5d7caa95af7e3675305c542253c4e372801897d2/net.py#L37
Bu…
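For reference, this is what I understand by an ImageNet-pretrained ResNet-50 backbone (a torchvision sketch assuming torchvision ≥ 0.13; the repo's `net.py` may wrap this differently):

```python
# Sketch: load an ImageNet-pretrained ResNet-50 and keep only the feature extractor,
# so the 2048-d pooled feature can feed a downstream head. Not the repo's exact code.
import torch
import torchvision

backbone = torchvision.models.resnet50(
    weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1
)
backbone.fc = torch.nn.Identity()  # drop the ImageNet classification head

features = backbone(torch.randn(1, 3, 224, 224))
print(features.shape)  # torch.Size([1, 2048])
```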
-
## ❓ Questions and Help
#### What is your question?
I am trying to replicate HuBERT base pretraining iteration 1 on LibriSpeech 960h. However, the training curve seems odd, as the unmask co…
-
Hi all!
Since the initial training is based on MindSpore, I'm wondering if there are any training results for the first stage on Megatron.
-
## Paper Link
https://arxiv.org/abs/2002.01685
https://github.com/aghie/parsing-as-pretraining
## Upload
2020/2/5
## What is the paper about?
## Paper Contributions
## Key Points
## Va…
-
Hi,
I am trying to reproduce the pretraining of the mT5 model. When you modify the sentences as:
`Thank you to week => for inviting me your party last `
Then do you compute the loss on all to…
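To make the question concrete, here is how I am currently building the span-corruption pair (a sketch assuming HuggingFace `transformers` and the mT5 sentinel tokens; this may not match your exact preprocessing). My understanding is that the decoder target is the sentinel-delimited sequence of dropped spans, and the loss is the cross-entropy over all target tokens, sentinels included:

```python
# Sketch of T5-style span corruption with mT5 sentinel tokens (assumed setup, not the
# authors' exact code). Original sentence: "Thank you for inviting me to your party last week."
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# Encoder input: the corrupted sentence, with each dropped span replaced by a sentinel.
inputs = tokenizer("Thank you <extra_id_0> to <extra_id_1> week.", return_tensors="pt")

# Decoder target: the dropped spans, each preceded by its sentinel.
labels = tokenizer(
    "<extra_id_0> for inviting me <extra_id_1> your party last <extra_id_2>",
    return_tensors="pt",
).input_ids

# The returned loss is the mean cross-entropy over all label tokens (sentinels included).
loss = model(**inputs, labels=labels).loss
print(loss.item())
```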