microsoft DeBERTa issues

The implementation of DeBERTa

MIT License

1.97k stars, 224 forks

issues

DeBERTa base different performance numbers

#53 DarshanPatel11 closed 3 years ago
1
where is the absolute position embeddings?

#52 ylwangy opened 3 years ago
0
Is this a bug? In disentangled_attention.py, pos_query_layer has 3 dimensions, so with p2p attention the line pos_query = pos_query_layer[:,:,att_span:,:] raises IndexError: too many indices for tensor of dimension 3

#51 hj-github1256 opened 3 years ago
0
missing import of unicodedata package

#50 alisafaya closed 3 years ago
0
How does it save computational cost in EMD?

#49 zhouxincheng opened 3 years ago
0
Penhe/merge ms

#48 BigBird01 closed 3 years ago
0
A question about the implementation of position-to-content attention

#47 SparkJiao closed 3 years ago
0
The exact English and Chinese pretraining data that match the BERT paper's pretraining data.

#46 guotong1988 opened 3 years ago
0
The amount of WSC training examples

#45 slowwavesleep closed 3 years ago
1
A question about the Attention map

#44 14H034160212 closed 3 years ago
1
[ONNX] Add symbolic function so XSoftmax can be exported to ONNX.

#43 fatcat-z closed 3 years ago
1
can't load v1 model

#42 li1117heex opened 3 years ago
2
Enhanced Masked Decoding

#41 joaogui1 closed 3 years ago
1
model key 'encoder.layer.0.attention.self.query_proj.weight' not found in base-mnli

#40 cedar33 opened 3 years ago
1
"deberta-v2-xxlarge"-Model not working!

#39 kinimod23 closed 3 years ago
2
V2 SentencePiece Tokenizer - training settings used?

#38 morganmcg1 opened 3 years ago
0
Update

#37 Dinuda closed 3 years ago
1
Speed of DeBERTa seems disappointing

#36 shenfe closed 2 years ago
1
Pretrained Model with 'RuntimeError: Error(s) in loading state_dict for DebertaModel'

#35 phenylalanin91 opened 3 years ago
0
Does DeBERTa have any plans to publish a Chinese pre-trained model?

#34 cornercf opened 3 years ago
0
RuntimeError: Index tensor must have the same number of dimensions as input tensor

#33 lgstd opened 3 years ago
5
DeBERTa v2: loading example code from huggingface: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

#32 youssefavx closed 3 years ago
2
Penhe/debertav2

#31 BigBird01 closed 3 years ago
0
Pre-training DeBERTa from Scratch

#30 aabayarea opened 3 years ago
1
Issues loading 1.5B model in huggingface and in deberta package

#29 chessgecko closed 3 years ago
4
Is SiFT included?

#28 applenob closed 2 years ago
1
want some examples for pre-training from scratch

#27 SusannaWull opened 3 years ago
0
bpe_encoder.bin file missing

#26 aabayarea closed 3 years ago
1
Deberta Xlarge

#25 aabayarea closed 3 years ago
2
T5 11B mode

#24 jam-ing closed 3 years ago
1
lunch, did you mean launch?

#23 jam-ing closed 3 years ago
1
Is there a Chinese version?

#22 MrRace closed 3 years ago
2
Missing PreLayerNorm code

#21 LiweiPeng closed 3 years ago
1
[bug] incomplete code

#20 shenfe opened 3 years ago
1
[bug-fix] Seems a typo

#19 shenfe closed 3 years ago
0
colab

#18 ak9250 closed 3 years ago
1
Pre-trained models are not accessible

#17 ashissamal closed 3 years ago
3
Typo

#16 nakosung closed 3 years ago
0
Training with K80 or GPUs with version less than 5.x

#15 huberemanuel closed 3 years ago
2
HTTP Error 403: Forbidden when downloading glue_tasks

#14 huberemanuel closed 3 years ago
4
how to evaluate DeBERTa on SUPER_GLUE benchmark?

#13 YoungTimmy opened 3 years ago
0
Is the STS-b fine-tuned model available somewhere to download?

#12 youssefabdelm closed 3 years ago
1
Is there any plan to release the pretraining code?

#11 RyanHuangNLP closed 2 years ago
4
Is the Decoder like the Transformer Decoder, or just a layer?

#10 hscspring closed 4 years ago
0
fix logger bug & update readme

#9 namisan closed 4 years ago
0
Update README.md

#8 DomHudson closed 4 years ago
1
Questions about the dataset?

#7 LauraSanchz closed 3 years ago
1
[WIP] ONNX conversion

#6 ganik opened 4 years ago
1
Fix interface of DeBERTa model to be consistent with HF transformers

#5 BigBird01 closed 4 years ago
0
Is there a plan for Chinese Model?

#4 hscspring closed 4 years ago
0
