dhlee347
/
pytorchic-bert
Pytorch Implementation of Google BERT
Apache License 2.0
589
stars
181
forks
issues
What is the token-level MLM loss usually when BERT pre-training stops converging?
#34
MingLunHan
closed
2 years ago
0
Is GEGLU innovative, or is it derived from a certain paper?
#33
takfate
closed
3 years ago
0
some confusions
#32
leileilin
opened
3 years ago
0
Visualizing the attention weights
#31
ahof1704
opened
3 years ago
0
Why is there any need of max_pred in pretraining?
#30
wahab4114
opened
3 years ago
0
Does this support multi GPU training?
#29
abhisheksgumadi
closed
4 years ago
2
Running SQUAD
#28
ismaeel123
closed
4 years ago
3
More pytorchic using nn.GELU module
#27
guglie
opened
4 years ago
0
How can I get the replacement of 'books_large_all.txt'?
#26
dodoyeon
opened
4 years ago
0
Usage
#25
JingsenZhang
opened
4 years ago
1
Masked subword prediction problem
#24
akakakakakaa
opened
4 years ago
0
Question about running the pretrain.py
#23
littleflow3r
opened
4 years ago
2
update optim.py
#22
zihangJiang
opened
4 years ago
0
questions for loading the pretrained_model
#21
mingbocui
opened
4 years ago
2
the total number of trainable parameters in 12 layer BERT
#20
mingbocui
closed
4 years ago
1
Only 80% real masks, 10% random vocabs in n-gram MLM
#19
graykode
closed
4 years ago
0
Can you please provide books_large_all.txt? And also, the pretrained model uncased_L-12_H-768_A-12/bert_model.ckpt?
#18
AyanKumarBhunia
closed
4 years ago
1
Can you please provide books_large_all.txt?
#17
AyanKumarBhunia
closed
4 years ago
1
Remove duplicated lines
#16
kiddj
closed
4 years ago
0
Pretraining with checkpoints
#15
abhi060698
closed
5 years ago
1
Revert "edit segment indices and embedding numbers for padding"
#14
dhlee347
closed
5 years ago
0
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3793: ordinal not in range(128)
#13
likerainsun
closed
4 years ago
1
How can we use it on a test dataset?
#12
GeetDsa
closed
5 years ago
0
edit segment indices and embedding numbers for padding
#11
AppleHolic
closed
5 years ago
1
Padding bugs on data preprocess
#10
AppleHolic
closed
5 years ago
2
Question About fine-tuning
#9
graykode
closed
5 years ago
1
pretrain for chinese text
#8
Jason-kid
closed
5 years ago
1
any sample dataset for pre-training?
#7
SeekPoint
closed
5 years ago
1
Can you give me some details about files?
#6
hufflepoohpooh
closed
5 years ago
1
h = (scores @ v).transpose(1, 2).contiguous() RuntimeError: CUDA error: out of memory
#5
leerelive
closed
5 years ago
2
Nice work!
#4
thomwolf
closed
4 years ago
3
MIT license @ Benjamin Dong-Hyun Lee
#3
theSage21
closed
5 years ago
0
Could you add a license file?
#2
theSage21
closed
5 years ago
1
Pretraining data format and possible corner case of seek_random_offset()
#1
L0SG
closed
5 years ago
2