zihangdai
/
xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Apache License 2.0
6.16k
stars
1.18k
forks
issues
CPU->GPU Memcpy failed when finetuning with STS-B
#297
xavinatalia
opened
2 months ago
0
Why are activation and dropout added after the classification layer?
#296
MrInouye
closed
5 months ago
0
XLNet, Transformer-XL attention score function problem
#295
wonjunchoi-arc
opened
7 months ago
0
Update data_utils.py
#294
ruxandrastancioi
closed
1 year ago
0
pre-train xlnet for French language
#293
karimmahalian
opened
1 year ago
0
XLNet Colab example error
#292
AlexTrinityBlock
opened
1 year ago
1
【Huawei】2012Lab-Project Cooperation&Exchange Invitation&Job Invitation-Zihang Dai
#291
HanLu1226
opened
1 year ago
0
run error about "InternalError (see above for traceback): Blas xGEMMBatched launch failed : a.shape=[12,512,64], b.shape=[12,64,512], m=512, n=512, k=64, batch_size=12"
#290
ccutyear
opened
1 year ago
2
Tokens and values
#289
Dhurim
opened
2 years ago
0
Update data_utils.py
#288
DLPerf
opened
2 years ago
1
Performance issue in data_utils.py (by P3)
#287
DLPerf
opened
2 years ago
1
Performance issues in the program
#286
DLPerf
opened
2 years ago
0
Performance issue in the program
#285
DLPerf
opened
2 years ago
1
TypeError: Fetch argument None has invalid type <class 'NoneType'> in train_gpu.py
#284
songhee-lee
opened
2 years ago
1
How to get the XLNet vocabulary from spiece.model file and store it to a .vocab file?
#283
SambhawDrag
opened
2 years ago
0
Feature/enhance predictions workflow
#282
agrudkow
closed
3 years ago
0
How to pretrain on multiple GPU?
#281
DHZBill
closed
3 years ago
0
checkpoint_management.py export info
#280
dll1314
opened
3 years ago
0
How are the positional encodings derived
#279
bnicholl
opened
3 years ago
0
specify tf version 1.x
#278
amrzv
opened
3 years ago
0
Why is the first layer of the query stream initialized with the same vector w rather than different vectors?
#277
Huakui-Zhang
opened
3 years ago
0
GPT vs BERT, under same computation and data resource, which one is better for downstream tasks like GLUE?
#276
guotong1988
opened
3 years ago
1
XLNet can't actually beat RoBERTa consistently, can it?
#275
guotong1988
closed
3 years ago
1
What is the function of _sample_mask method?
#274
guotong1988
closed
3 years ago
1
Removing mem-reuse will not decrease the pretraining model performance for short text?
#273
guotong1988
opened
3 years ago
0
The relation of reuse_len and mem_len?
#272
guotong1988
closed
3 years ago
1
reuse_len=0 means no mem? And no benefit for long text but not worse for short text?
#271
guotong1988
closed
3 years ago
1
Problem with generating predictions from fine tuned classification model
#270
abdullahkhilji
opened
3 years ago
0
Multi-gpu slower than single-gpu
#269
weiyx15
opened
3 years ago
1
OOM with least batch 2 in train_gpu.py
#268
eddatt
closed
4 years ago
0
Colab notebook cannot run under TensorFlow 2.0
#267
jlff
opened
4 years ago
0
_split_a_and_b
#266
FruVirus
closed
4 years ago
0
The special tokens of XLNet are different from BERT's
#265
lytum
opened
4 years ago
2
get_sequence_output is not contextualized
#264
maziyarpanahi
opened
4 years ago
1
Why the max_seq_length = 512 for XLNet?
#263
vr25
opened
4 years ago
4
Is Next Sentence Prediction implemented in the code?
#262
GhaliaRehawi
opened
4 years ago
0
How to use your pretrained model for question answering?
#261
Alla-Abdella
opened
4 years ago
2
ValueError when running ./gpu_squad_base.sh
#260
Omnis23
opened
4 years ago
3
OOM ERROR when using local batch size=128 on TPUv3-8
#259
GhaliaRehawi
opened
4 years ago
1
Is it possible to feed XLNet into a seq2seq encoder/decoder for NMT (for low-resource languages)?
#258
JohnasSolomon
opened
4 years ago
0
Can you upload the processor code (run_classifier.py) for the GLUE datasets (CoLA, QQP, SST-2, RTE, MRPC)?
#257
YJYJLee
opened
4 years ago
1
Number of training epochs in original publication
#256
jjedele
opened
4 years ago
0
Docker support
#255
sanjibnarzary
opened
4 years ago
0
[CLS] token / during training process
#254
cherepanovic
opened
4 years ago
0
Is real factorization?
#253
fangwch
opened
4 years ago
0
Python2 to Python3?
#252
hammad26
opened
4 years ago
1
Commands for training and testing on IMDB dataset.
#251
VikasRajashekar
opened
4 years ago
1
Changing Vocab size
#250
yusufani
opened
4 years ago
0
text classification on 3 classes
#249
VikasRajashekar
opened
4 years ago
2
Normalization by NFKC
#248
Ina299
closed
4 years ago
1