dhlee347 pytorchic-bert issues

dhlee347 / pytorchic-bert

Pytorch Implementation of Google BERT

Apache License 2.0

589 stars, 181 forks

issues

How much the token-level MLM loss usually is when the bert pre-training stops converging?

#34 MingLunHan closed 2 years ago
0
Is GEGLU innovative, or is it derived from a certain paper?

#33 takfate closed 3 years ago
0
some confusions

#32 leileilin opened 3 years ago
0
Visualizing the attention weights

#31 ahof1704 opened 3 years ago
0
Why is there any need of max_pred in pretraining?

#30 wahab4114 opened 3 years ago
0
Does this support multi GPU training?

#29 abhisheksgumadi closed 4 years ago
2
Running SQUAD

#28 ismaeel123 closed 4 years ago
3
More pytorchic using nn.GELU module

#27 guglie opened 4 years ago
0
How can I get the replacement of 'books_large_all.txt'?

#26 dodoyeon opened 4 years ago
0
Usage

#25 JingsenZhang opened 4 years ago
1
Masked subword prediction problem

#24 akakakakakaa opened 4 years ago
0
Question about running the pretrain.py

#23 littleflow3r opened 4 years ago
2
update optim.py

#22 zihangJiang opened 4 years ago
0
questions for loading the pretrained_model

#21 mingbocui opened 4 years ago
2
the total number of trainable parameters in 12 layer BERT

#20 mingbocui closed 4 years ago
1
Only 80% real masks, 10% random vocabs in n-gram MLM

#19 graykode closed 4 years ago
0
Can you please provide books_large_all.txt? And also, the pretrained model uncased_L-12_H-768_A-12/bert_model.ckpt?

#18 AyanKumarBhunia closed 4 years ago
1
Can you please provide books_large_all.txt?

#17 AyanKumarBhunia closed 4 years ago
1
Remove duplicated lines

#16 kiddj closed 4 years ago
0
Pretraining with checkpoints

#15 abhi060698 closed 5 years ago
1
Revert "edit segment indices and embedding numbers for padding"

#14 dhlee347 closed 5 years ago
0
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3793: ordinal not in range(128)

#13 likerainsun closed 4 years ago
1
How can we use it on test dataset?

#12 GeetDsa closed 5 years ago
0
edit segment indices and embedding numbers for padding

#11 AppleHolic closed 5 years ago
1
Padding bugs on data preprocess

#10 AppleHolic closed 5 years ago
2
Question About fine-tuning

#9 graykode closed 5 years ago
1
pretrain for chinese text

#8 Jason-kid closed 5 years ago
1
any sample dataset for pre-training?

#7 SeekPoint closed 5 years ago
1
Can you give me some details about files?

#6 hufflepoohpooh closed 5 years ago
1
h = (scores @ v).transpose(1, 2).contiguous() RuntimeError: CUDA error: out of memory

#5 leerelive closed 5 years ago
2
Nice work!

#4 thomwolf closed 4 years ago
3
MIT license @ Benjamin Dong-Hyun Lee

#3 theSage21 closed 5 years ago
0
Could you add a license file?

#2 theSage21 closed 5 years ago
1
Pretraining data format and possible corner case of seek_random_offset()

#1 L0SG closed 5 years ago
2