microsoft/DeBERTa
The implementation of DeBERTa
MIT License
1.97k stars, 224 forks
issues
DeBERTa base different performance numbers
#53
DarshanPatel11
closed
3 years ago
1
where is the absolute position embeddings?
#52
ylwangy
opened
3 years ago
0
Is this a bug? In disentangled_attention.py, pos_query_layer has 3 dimensions, so with p2p attention the line pos_query = pos_query_layer[:,:,att_span:,:] raises "IndexError: too many indices for tensor of dimension 3"
#51
hj-github1256
opened
3 years ago
0
missing import of unicodedata package
#50
alisafaya
closed
3 years ago
0
How does it save computational cost in EMD?
#49
zhouxincheng
opened
3 years ago
0
Penhe/merge ms
#48
BigBird01
closed
3 years ago
0
A question about the implementation of position-to-content attention
#47
SparkJiao
closed
3 years ago
0
The exact English and Chinese pretraining data that are exactly the same as the BERT paper's pretraining data.
#46
guotong1988
opened
3 years ago
0
The amount of WSC training examples
#45
slowwavesleep
closed
3 years ago
1
A question about the Attention map
#44
14H034160212
closed
3 years ago
1
[ONNX] Add symbolic function so XSoftmax can be exported to ONNX.
#43
fatcat-z
closed
3 years ago
1
can't load v1 model
#42
li1117heex
opened
3 years ago
2
Enhanced Masked Decoding
#41
joaogui1
closed
3 years ago
1
model key 'encoder.layer.0.attention.self.query_proj.weight' not found in base-mnli
#40
cedar33
opened
3 years ago
1
"deberta-v2-xxlarge"-Model not working!
#39
kinimod23
closed
3 years ago
2
V2 SentencePiece Tokenizer - training settings used?
#38
morganmcg1
opened
3 years ago
0
Update
#37
Dinuda
closed
3 years ago
1
Speed of DeBERTa seems disappointing
#36
shenfe
closed
2 years ago
1
Pretrained Model with 'RuntimeError: Error(s) in loading state_dict for DebertaModel'
#35
phenylalanin91
opened
3 years ago
0
Does DeBERTa have any plans to publish a Chinese pre-trained model?
#34
cornercf
opened
3 years ago
0
RuntimeError: Index tensor must have the same number of dimensions as input tensor
#33
lgstd
opened
3 years ago
5
DeBERTa v2: loading example code from huggingface: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
#32
youssefavx
closed
3 years ago
2
Penhe/debertav2
#31
BigBird01
closed
3 years ago
0
Pre-training DeBERTa from Scratch
#30
aabayarea
opened
3 years ago
1
Issues loading 1.5B model in huggingface and in deberta package
#29
chessgecko
closed
3 years ago
4
Is SiFT included?
#28
applenob
closed
2 years ago
1
want some examples for pre-training from scratch
#27
SusannaWull
opened
3 years ago
0
bpe_encoder.bin file missing
#26
aabayarea
closed
3 years ago
1
Deberta Xlarge
#25
aabayarea
closed
3 years ago
2
T5 11B mode
#24
jam-ing
closed
3 years ago
1
lunch, did you mean launch?
#23
jam-ing
closed
3 years ago
1
Is there a Chinese version?
#22
MrRace
closed
3 years ago
2
Missing PreLayerNorm code
#21
LiweiPeng
closed
3 years ago
1
[bug] incomplete code
#20
shenfe
opened
3 years ago
1
[bug-fix] Seems a typo
#19
shenfe
closed
3 years ago
0
colab
#18
ak9250
closed
3 years ago
1
Pre-trained models are not accessible
#17
ashissamal
closed
3 years ago
3
Typo
#16
nakosung
closed
3 years ago
0
Training with K80 or GPUs with version less than 5.x
#15
huberemanuel
closed
3 years ago
2
HTTP Error 403: Forbidden when downloading glue_tasks
#14
huberemanuel
closed
3 years ago
4
how to evaluate DeBERTa on SUPER_GLUE benchmark?
#13
YoungTimmy
opened
3 years ago
0
Is the STS-b fine-tuned model available somewhere to download?
#12
youssefabdelm
closed
3 years ago
1
Is there any plan to release the pretraining code?
#11
RyanHuangNLP
closed
2 years ago
4
Is the Decoder like the Transformer Decoder, or just a layer?
#10
hscspring
closed
4 years ago
0
fix logger bug & update readme
#9
namisan
closed
4 years ago
0
Update README.md
#8
DomHudson
closed
4 years ago
1
Questions about the dataset?
#7
LauraSanchz
closed
3 years ago
1
[WIP] ONNX conversion
#6
ganik
opened
4 years ago
1
Fix interface of DeBERTa model to be consistent with HF transformers
#5
BigBird01
closed
4 years ago
0
Is there a plan for Chinese Model?
#4
hscspring
closed
4 years ago
0