-
@cshanbo @lukaszkaiser
Hi, I've read all your discussion in #111, but I don't know your test results on segmented versus non-segmented Chinese datasets. I'm using t2t on en-zh translatio…
-
My F1 is normal when do_train, do_eval, and do_predict are all true at the same time; but when I run predict again after training has finished, F1 is 0.
My command line is as follows:
bert-base-ner-train --do_train=False --do_eval=False --do_predict=True --data_dir=data1/ --predict_batch_siz=16 --max_…
-
Hello, I pretrained a classical Chinese poetry model with UER's GPT-2 pretraining method, but when I run prediction the generated output looks like random text, and sometimes it even contains many [UNK] tokens. Could you help me understand why?
![2](https://user-images.githubusercontent.com/39848377/113375497-13235600-93a2-11eb-85db-42337d68ab4d.JPEG)
My input was “床前明月…
-
Hi, when I run TranslateEnzhWmt8k with the vocabulary size set to 8K, the generated Chinese vocabulary looks normal, but when I set it to 32K, the file contains lots of lines like this:
It seems…
-
Thanks for sharing:
In the README you say "Data: 200m chinese internet question answering pairs. Vocab: 52777, jieba CWS enhanced with forward maximum matching."
So your training process is just pairs of…
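For readers unfamiliar with the "forward maximum matching" part of that README line, here is a minimal, self-contained sketch of the matching step in isolation. The vocabulary and maximum word length below are toy values for illustration only, not the project's actual 52777-entry vocab, and the jieba side of the pipeline is not reproduced.

```python
# Forward maximum matching (FMM) segmentation over a fixed vocabulary.
# A minimal sketch, assuming the vocab is a plain set of words.
def fmm_segment(text, vocab, max_word_len=4):
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrink until a vocab hit or a single char.
        for j in range(min(max_word_len, len(text) - i), 0, -1):
            word = text[i:i + j]
            if j == 1 or word in vocab:
                tokens.append(word)
                i += j
                break
    return tokens

# Usage with a toy vocabulary (hypothetical entries, for illustration only).
vocab = {"北京", "天安门", "问答"}
print(fmm_segment("北京天安门问答", vocab))  # ['北京', '天安门', '问答']
```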
-
I just can't find vocab entries like 40001, 40002, 400xx. How can I solve this? Is it enough to generate a new vocab file myself?
-
Can you please describe what should be changed to use a non-Asian language?
-
I'm using the processed data you provided to reproduce the results for MathQA.
Following the instructions, I replaced the vocab.txt in the bert-base-uncased folder and used train_ft_monolingual-en.sh. Howe…
-
I modified three places:
1. model\chinese-bert_chinese_wwm_pytorch\config.json: change the value of vocab_size to 30522
2. code\sqlnet\model\sqlbert.py, around line 141: add three lines: sel_col_mask = sel_col_mask - 254; where_col_mask = where_col_m…
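For the first change, a minimal sketch of applying the vocab_size edit programmatically (this assumes a standard BERT-style config.json layout; only the path and the value 30522 come from the post, and the rest of the file is left untouched):

```python
import json

# Path taken from the post above (Windows-style); adjust as needed.
config_path = r"model\chinese-bert_chinese_wwm_pytorch\config.json"

# Load the existing BERT config, change only vocab_size, and write it back.
with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

config["vocab_size"] = 30522

with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)
```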