microsoft DeBERTa issues

microsoft / DeBERTa

The implementation of DeBERTa

MIT License

1.97k stars, 224 forks

issues

RTD is not registed

#154 Eric-Chen-007 closed 3 weeks ago
1
Pretraining deberta-v3 with a larger context length.

#153 sherlcok314159 opened 2 months ago
2
Generator weights

#152 ir2718 opened 5 months ago
0
Fine-tune DeBERTa v3 language model, worthwhile endeavour?

#151 shensmobile opened 7 months ago
5
How can I evaluate COPA dataset?

#150 KwanghyeonLee opened 8 months ago
0
Reason for missing values in the table for the RoBERTa-base, MRPC entry

#149 Aradhye2002 opened 9 months ago
0
No assert: Training does not start when using a different tokenizer/tokenized data

#148 adriwitek opened 9 months ago
0
Inference gives different results when using multiple gpus (distributed mode) vs just one gpu (not distributed mode)

#147 ThuongTNguyen opened 9 months ago
0
Model is not initialized correctly when path to a pretrained model is provided via `pre_trained`

#146 ThuongTNguyen opened 9 months ago
0
Question regarding symmetric KL Loss

#145 skbaur opened 10 months ago
0
sklearn changed to scikit-learn to avoid pip install failure

#144 RajkumarGalaxy opened 1 year ago
0
Fix typo

#143 mrm8488 opened 1 year ago
0
fix: Corrected code instructions in README.md

#142 zzc0430 opened 1 year ago
0
fix: sklearn -> scikit-learn

#141 zzc0430 opened 1 year ago
0
Trying to initialize model "large"

#140 Saivaks opened 1 year ago
0
EOF error while running the rtd.sh script

#139 BartWesthoff opened 1 year ago
1
change: reduce memory consumption

#138 YongWookHa opened 1 year ago
2
Trying to run rtd_task.py on Windows

#137 Yuri-Albuquerque opened 1 year ago
1
When calculating Qr, why is the W of content used instead of the W of position?

#136 nebula303 opened 1 year ago
0
Eligibility for Commercial Use

#135 Hegelim closed 1 year ago
1
Install fails due to use of deprecated `sklearn` package

#134 benfogelson opened 1 year ago
0
Change from deprecated package name sklearn to correct scikit-learn

#133 benfogelson opened 1 year ago
0
Load deberta-v3-large but got deberta-v2 model

#132 ChengsongLu opened 1 year ago
2
Deberta-v3-base Generator model

#131 sharanyarc96 opened 1 year ago
2
n/a

#130 StephennFernandes closed 1 year ago
0
AssertionError: RTD is not registed.

#129 StephennFernandes closed 1 year ago
3
No module named 'torch._six'

#128 StephennFernandes closed 1 year ago
2
Error when running the example code for pretraining the rtd model.

#127 soonilbae opened 1 year ago
15
effectiveness of RTD

#126 martin-reczko opened 1 year ago
0
Info on Deberta-v2-xlarge training infra

#125 karthickgopalswamy opened 1 year ago
0
Microsoft

#124 omniteams opened 1 year ago
0
mDeBERTa Generator model

#123 dadelani closed 1 year ago
3
1. Add code for DeBERTaV3 pre-training; 2. Fix error in torch 1.11; 3…

#122 BigBird01 closed 1 year ago
0
Generator Model

#121 prajwal967 closed 1 year ago
1
Convert DeBERTa model to ONNX with mixed precision

#120 SergeyShk opened 1 year ago
0
Which version of torch?

#119 XuJianzhi closed 1 year ago
0
Fix: a few typos as I read through the README.md

#118 cpcdoy closed 1 year ago
0
Why are vocab.txt and tokenizer.json not in the pretrained model on Hugging Face?

#117 XuJianzhi opened 1 year ago
1
Code about deberta_v3

#116 BAOOOOOM closed 1 year ago
1
AssertionError: [] in Google Colab

#115 yupesh opened 1 year ago
0
Can you upload the code for fine-tuning on SQuAD 2.0? Thank you very much.

#114 junzai0215 opened 2 years ago
0
mDeBERTa large

#113 djstrong opened 2 years ago
0
Which token represents the overall sentence representation in the feature-extraction task: the first token or the last token?

#112 junzai0215 opened 2 years ago
0
Where is the Gradient-Disentangled Embedding Sharing (GDES) part in the code?

#111 Cakeyan closed 1 year ago
3
Can't run bash commands in /DeBERTa/experiments/glue/

#110 heya5 closed 2 years ago
0
out of memory

#109 Amazing-J opened 2 years ago
18
How to pretrain DeBERTa v3?

#108 BinhMinhs10 closed 1 year ago
2
Why does the size of DeBERTaV3 double on disk after finetuning?

#106 nadahlberg closed 2 years ago
2
Where is the ENHANCED MASK DECODER ACCOUNTS part in the code?

#105 tjshu closed 2 years ago
1
Evaluation hangs for distributed MLM task

#104 dannyel2511 opened 2 years ago
7