SuMarsss opened this issue 5 years ago
@SuMarsss great to see you have trained a Chinese XLNet model and built your own SentencePiece model.
To prepare your label.vocab (which is different from your SentencePiece control_symbols), you can use the following one:
<pad>
O
X
<cls>
<sep>
B-AnatomyPart
I-AnatomyPart
B-Diagnosis
I-Diagnosis
B-Drug
I-Drug
B-Lab
I-Lab
B-Procedure
I-Procedure
B-Radiology
I-Radiology
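For reference, label.vocab is just a plain one-label-per-line file whose line order defines the label ids. A minimal loading sketch (the helper name and file path are illustrative, not the exact code in run_ner.py):

```python
# Minimal sketch: load label.vocab into label <-> id mappings.
# One label per line; line order defines the label ids.
def load_label_vocab(vocab_file):
    with open(vocab_file, "r", encoding="utf-8") as f:
        labels = [line.strip() for line in f if line.strip()]
    label2id = {label: idx for idx, label in enumerate(labels)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label

label2id, id2label = load_label_vocab("label.vocab")
print(label2id["B-Drug"])  # integer id used as the NER tag class
```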
You should also make sure the special_vocab_list in run_ner.py aligns with your SentencePiece control_symbols:
self.special_vocab_list = ["<unk>", "<s>", "</s>", "<cls>", "<sep>", "<pad>", "<mask>", "<eod>", "<eop>"]
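A quick way to verify the alignment is to compare the first ids of your trained SentencePiece model against this list. A minimal sketch, assuming the special pieces occupy the first vocabulary ids and the model file is named spiece.model:

```python
# Minimal sketch: check that the SentencePiece model's special pieces
# match the special_vocab_list expected by run_ner.py.
import sentencepiece as spm

special_vocab_list = ["<unk>", "<s>", "</s>", "<cls>", "<sep>",
                      "<pad>", "<mask>", "<eod>", "<eop>"]

sp = spm.SentencePieceProcessor()
sp.Load("spiece.model")  # path to your trained SentencePiece model

for idx, expected in enumerate(special_vocab_list):
    actual = sp.IdToPiece(idx)
    if actual != expected:
        print("mismatch at id %d: expected %s, got %s" % (idx, expected, actual))
```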
When I tried the label.vocab as you said, another error occurred.
InvalidArgumentError (see above for traceback): Found Inf or NaN global norm. : Tensor had NaN values [[node VerifyFinite/CheckNumerics (defined at xlnet/model_utils.py:147) ]] [[node replica_1/loss/truediv (defined at run_ner.py:608) ]]
xlnet/model_utils.py:147:
clipped, gnorm = tf.clip_by_global_norm(gradients, FLAGS.clip)
run_ner.py:608:
loss = tf.reduce_sum(cross_entropy * label_mask) / tf.reduce_sum(tf.reduce_max(label_mask, axis=-1))
Looks like a gradient exploding issue. Could you provide more details (e.g. the full vocab list, hyperparams, SentencePiece model, etc.) for debugging?
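One thing worth ruling out first: if a batch's label_mask is all zeros, the denominator at run_ner.py:608 is zero, the loss becomes NaN, and that NaN then trips the global-norm check at model_utils.py:147. A minimal sketch of a common epsilon guard (an illustration, not the repo's actual code):

```python
import tensorflow as tf

def masked_mean_loss(cross_entropy, label_mask, epsilon=1e-8):
    # Guarded variant of the loss at run_ner.py:608: if a batch has no
    # valid (unmasked) tokens the denominator would be zero and the loss
    # NaN; tf.maximum keeps the division finite.
    denominator = tf.reduce_sum(tf.reduce_max(label_mask, axis=-1))
    return tf.reduce_sum(cross_entropy * label_mask) / tf.maximum(denominator, epsilon)
```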
I have fixed the bug, but I want to output the F1 score and precision.
@SuMarsss, you can run the following commands to get precision/recall/F1 scores:
python tool/convert_token.py \
--input_file=${OUTPUTDIR}/data/predict.${PREDICTTAG}.json \
--output_file=${OUTPUTDIR}/data/predict.${PREDICTTAG}.txt
python tool/eval_token.py \
< ${OUTPUTDIR}/data/predict.${PREDICTTAG}.txt \
> ${OUTPUTDIR}/data/predict.${PREDICTTAG}.token
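If you'd rather compute the scores directly in Python, here is a self-contained sketch of entity-level precision/recall/F1 over BIO tag sequences (a simplification for illustration, not the repo's eval_token.py; stray I- tags without a preceding B- are simply ignored):

```python
def extract_entities(tags):
    """Collect (start, end, type) spans from one BIO tag sequence."""
    entities, start, etype = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                entities.append((start, i, etype))
            start, etype = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # entity continues
        else:
            if start is not None:
                entities.append((start, i, etype))
            start, etype = None, None
    if start is not None:
        entities.append((start, len(tags), etype))
    return entities

def prf1(gold_seqs, pred_seqs):
    """Entity-level precision/recall/F1 over parallel lists of sentences."""
    gold, pred = set(), set()
    for si, (g, p) in enumerate(zip(gold_seqs, pred_seqs)):
        gold |= {(si,) + e for e in extract_entities(g)}
        pred |= {(si,) + e for e in extract_entities(p)}
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [["B-Drug", "I-Drug", "O"], ["B-Lab", "O", "O"]]
pred = [["B-Drug", "I-Drug", "O"], ["O", "O", "O"]]
print(prf1(gold, pred))  # (1.0, 0.5, 0.666...)
```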
Sorry, I thought I had fixed the gradient exploding issue, but it occurred again.
2019-07-11 10:06:26.659641: E tensorflow/core/kernels/check_numerics_op.cc:185] abnormal_detected_host @0x7f65eb46c500 = {1, 0} Found Inf or NaN global norm.
I think there are some problems with my SentencePiece model or Chinese tokenizer. Here is my tokenization result.
I think the result "缘" "于" is wrong, which splits "▁" and "缘"; the correct result should probably be "▁缘" "▁于", since the English tokenization result is "▁EU" "▁reject".
Finally, I don't know how to provide the full vocab list, which is a very large txt file, or the SentencePiece model, which is a binary file. I can only provide a sample like this.
A sample of the full vocab list:
<unk> 0
<s> 0
</s> 0
<cls> 0
<sep> 0
<pad> 0
<mask> 0
<eod> 0
<eop> 0
。 0
, -3.29251
▁ -3.45567
的 -3.76215
1 -4.30766
0 -4.54219
年 -4.64991
2 -4.74569
、 -4.8037
一 -4.90536
在 -4.91364
为 -4.94451
是 -5.03084
中 -5.04317
9 -5.05516
国 -5.06382
) -5.0947
( -5.09492
人 -5.09874
于 -5.26198
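To see exactly where the "▁" word-boundary marker lands, a minimal SentencePiece round-trip sketch (the model path and sample sentence are placeholders):

```python
# Minimal sketch: inspect how the trained SentencePiece model splits a
# sample sentence, to check where the "▁" marker is attached.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("spiece.model")  # path to your trained SentencePiece model

pieces = sp.EncodeAsPieces("缘于患者病情变化")  # sample Chinese text
print(pieces)
# A result like ['▁', '缘', '于', ...] means "▁" is emitted as a
# standalone piece instead of being fused with the first character.
```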
@SuMarsss, yes, I think it should be "▁于" instead of "▁" and "于". I have never trained a Chinese SentencePiece model before; maybe you can refer to this post for more insight.
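For reference, a typical Chinese SentencePiece training setup might look like the sketch below (the input path and vocab size are placeholders; character_coverage is conventionally raised for Chinese, and <unk>/<s>/</s> are built in, so only the remaining symbols go in control_symbols):

```python
import sentencepiece as spm

# Hypothetical training setup; input path and vocab_size are placeholders.
spm.SentencePieceTrainer.Train(
    "--input=zh_corpus.txt "
    "--model_prefix=spiece "
    "--model_type=unigram "
    "--vocab_size=32000 "
    "--character_coverage=0.9995 "
    # <unk>, <s>, </s> are built in; the rest are added as control symbols
    # so the ids line up with the special_vocab_list in run_ner.py.
    "--control_symbols=<cls>,<sep>,<pad>,<mask>,<eod>,<eop>"
)
```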
how did you fix this problem?
@charlesXu86 actually I couldn't reproduce this issue, no clue how to resolve it
Have you fixed this issue already or not? I got this problem too.
@youbingchenyoubing no fix is applied yet, since I couldn't reproduce this issue. Could you provide more details for your problem?
File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1323, in call_without_tpu return self._call_model_fn(features, labels, is_export_mode=is_export_mode) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1593, in _call_model_fn estimator_spec = self._model_fn(features=features, *kwargs) File "/home/chenyoubing/nlp/resume_entity/entity_model/build_model/xlnet_model.py", line 135, in model_fn trainop, , _ = model_utils.get_train_op(self.args, loss) File "/home/chenyoubing/nlp/resume_entity/entity_model/xlnet/model_utils.py", line 147, in get_train_op clipped, gnorm = tf.clip_by_global_norm(gradients, FLAGS.clip) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/ops/clip_ops.py", line 271, in clip_by_global_norm "Found Inf or NaN global norm.") File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/ops/numerics.py", line 44, in verify_tensor_all_finite return verify_tensor_all_finite_v2(t, msg, name) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/ops/numerics.py", line 62, in verify_tensor_all_finite_v2 verify_input = array_ops.check_numerics(x, message=message) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 919, in check_numerics "CheckNumerics", tensor=tensor, message=message, name=name) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper op_def=op_def) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func return func(args, **kwargs) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op op_def=op_def) File "/home/chenyoubing/virtualplace/xlnet/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in init self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Found Inf or NaN global norm. : Tensor had NaN values [[node VerifyFinite/CheckNumerics (defined at /home/chenyoubing/nlp/resume_entity/entity_model/xlnet/model_utils.py:147) ]]
@youbingchenyoubing Sorry, based on the error message, I can't figure out how run_ner.py is used by your pipeline. BTW, which dataset does this experiment run with? English or Chinese?
Chinese resume NER is used in my experiment.
Can XLNet support non-fixed (variable-length) context?
@SuMarsss / @charlesXu86 / @youbingchenyoubing, sorry, I still can't repro this issue on CoNLL2003 dataset and I think I'll not support Chinese NER in the near future
awesome, thx
I have pretrained XLNet on a large Chinese corpus, but how do I run run_ner.py, and what is label.vocab? Here are the parameters I used to train the SentencePiece model.
This is my pretraining result.
So the label.vocab should be like this?