kwonmha/bert-vocab-builder
Builds a WordPiece (subword) vocabulary compatible with Google Research's BERT.
226 stars · 47 forks
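The corpus_filepattern and min_count flags named in the issue titles below suggest a command-line workflow; a sketch of a typical invocation might look like the following (the subword_builder.py entry-point name and the output_filename flag are assumptions on my part, not confirmed by this page):

    python subword_builder.py \
        --corpus_filepattern "data/*.txt" \
        --output_filename vocab.txt \
        --min_count 5

Raising min_count drops rarer subtokens and so presumably shrinks the resulting vocabulary, which is the trade-off issue #1 below weighs against matching the size of BERT's original vocab.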
Issues
#16 · I am getting an error while running the vocab builder · mshivasharan · opened 3 years ago · 2 comments
#15 · BERT trained on a custom corpus · anidiatm41 · opened 4 years ago · 1 comment
#14 · Splitting strategy in tokenize.py · mandalbiswadip · closed 4 years ago · 3 comments
#13 · Corpus preprocessing steps · LydiaXiaohongLi · opened 4 years ago · 5 comments
#12 · Is there a format for corpus_filepattern? · YuBeomGon · closed 4 years ago · 2 comments
#11 · Error when running ALBERT's create_pretraining_data.py · aravindchaluvadi · closed 4 years ago · 2 comments
#10 · Windows fatal exception: access violation · frank-lin-liu · closed 4 years ago · 0 comments
#9 · Issue with tf.gfile / tf.io.gfile · then4p · closed 4 years ago · 1 comment
#8 · AttributeError: module 'tensorflow.io' has no attribute 'gfile' · AmeeraMilibari · closed 5 years ago · 2 comments
#7 · Projects using this and evaluation results · NebelAI · opened 5 years ago · 2 comments
#6 · Should I match the vocabulary size with bert_config.json? · AnakTeka · closed 5 years ago · 2 comments
#5 · Inaccurate sub-words for German · maggieezzat · opened 5 years ago · 1 comment
#4 · Removed merge conflict markers · bhoomit · closed 5 years ago · 1 comment
#3 · Merge conflict markers are still there… · bhoomit · closed 5 years ago · 2 comments
#2 · Unable to understand the input format and the generated output · ayushjain1144 · closed 5 years ago · 4 comments
#1 · If I change the min_count flag to produce a vocab the same size as BERT's original, can I pretrain from a checkpoint with the new vocab, or do I have to train from scratch? · maggieezzat · closed 5 years ago · 8 comments
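Two of the closed issues (#8 and #9) stem from the tf.gfile / tf.io.gfile split: older TensorFlow 1.x releases only ship tf.gfile, while the file API lives at tf.io.gfile from roughly TF 1.14 onward and in all of TF 2.x. A minimal compatibility sketch, assuming the code only needs GFile for reading (a generic workaround, not necessarily the fix applied in this repo):

    import tensorflow as tf

    # tf.io.gfile.GFile exists on TF >= 1.14 and TF 2.x; older TF 1.x
    # raises AttributeError here and only provides tf.gfile.GFile.
    try:
        GFile = tf.io.gfile.GFile
    except AttributeError:
        GFile = tf.gfile.GFile

    with GFile("corpus.txt", "r") as f:
        print(f.readline())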