dolphin-Jia commented 3 years ago

I modified this code to run on the SQuAD 2.0 dataset and got the following error:

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /tf/NOC-QA/output_ch/model.ckpt-1166
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Processing example: 0
INFO:tensorflow:Processing example: 1000
INFO:tensorflow:Processing example: 2000
INFO:tensorflow:Processing example: 3000
INFO:tensorflow:Processing example: 4000
INFO:tensorflow:Processing example: 5000
INFO:tensorflow:prediction_loop marked as finished
INFO:tensorflow:prediction_loop marked as finished
INFO:tensorflow:Writing predictions to: /tf/NOC-QA/output_ch/dev_predictions.json
INFO:tensorflow:Writing nbest to: /tf/NOC-QA/output_ch/dev_nbest_predictions.json
Traceback (most recent call last):
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 1448, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 1377, in main
    output_nbest_file, output_null_log_odds_file)
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 962, in write_predictions
    end_logit=null_end_logit)) # In very rare edge cases we could have no valid predictions. So we
TypeError: __new__() missing 2 required positional arguments: 'start_index' and 'end_index'

ymcui commented 3 years ago

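Fixed. The error occurred because no candidate in the n-best list satisfied the conditions. The corresponding code now passes the start_index and end_index fields (defaulting to 0). https://github.com/ymcui/cmrc2018/blob/master/baseline/run_cmrc2018_drcd_baseline.py#L900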

dolphin-Jia commented 3 years ago

Thank you very much for your reply. Does the revised code work with the SQuAD 2.0 dataset as is, or does it need further changes?

dolphin-Jia commented 3 years ago

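Also, have you tested the model on text that mixes Chinese and English? It runs fine on my side, but I am not sure whether this baseline is suitable for mixed-language data, so I would appreciate your advice. Thanks.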

ymcui commented 3 years ago

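If you only need to run SQuAD 2.0, I recommend using the original BERT code: https://github.com/google-research/bert/blob/master/run_squad.py
The Chinese BERT vocabulary contains some common English words, so the code here can handle mixed Chinese-English data.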

dolphin-Jia commented 3 years ago

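Following up on your second point: if I extend vocab.txt myself, would that give better support for mixed Chinese-English data? Since my data follows the SQuAD 2.0 format, I need to modify your code to fit it. Also, if I use a pre-trained model such as BERT-wwm-ext, does it handle mixed Chinese-English data well, or do you recommend using it only on Chinese data?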

ymcui commented 3 years ago

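A small amount of mixed Chinese and English is fine, because the Chinese pre-training corpus itself contains some English expressions. As long as English makes up only a small share of your text, it is not a problem.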
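One rough way to judge whether English makes up "a small share" that the vocabulary can handle is to measure how many word pieces fall back to [UNK]. The sketch below is a simplified greedy longest-match WordPiece split, not the tokenizer shipped with BERT. The `TOY_VOCAB` set and the pre-split `tokens` list are hypothetical; a real check would load the model's vocab.txt and reuse its own tokenizer.

```python
def wordpiece(token, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one token into vocab pieces."""
    pieces, start = [], 0
    while start < len(token):
        end = len(token)
        match = None
        while start < end:
            piece = token[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole token is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def unk_ratio(tokens, vocab):
    """Fraction of produced pieces that are [UNK]."""
    pieces = [p for t in tokens for p in wordpiece(t, vocab)]
    return sum(p == "[UNK]" for p in pieces) / len(pieces) if pieces else 0.0


# Hypothetical toy vocabulary and pre-split, lowercased tokens.
TOY_VOCAB = {"模", "型", "bert", "model", "##s", "fine"}
tokens = ["bert", "模", "型", "models", "finetune"]
ratio = unk_ratio(tokens, TOY_VOCAB)
```

A high ratio on a sample of your own data would suggest that the English portion is poorly covered by the vocabulary.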

dolphin-Jia commented 3 years ago

nbest.append(
    _NbestPrediction(
        text=final_text,
        start_logit=pred.start_logit,
        end_logit=pred.end_logit,
        start_index=pred.start_index,
        end_index=pred.end_index))

# just create a nonce prediction in this case to avoid failure.
if not nbest:
  nbest.append(
      _NbestPrediction(text="empty", start_logit=0.0, end_logit=0.0, start_index=0, end_index=0))

# If we didn't include the empty option in the n-best, include it.

if FLAGS.version_2_with_negative:
  if "" not in seen_predictions:
    nbest.append(
        _NbestPrediction(
            text="", start_logit=null_start_logit,
            end_logit=null_end_logit, start_index=0, end_index=0))    # In very rare edge cases we could have no valid predictions. So we

assert len(nbest) >= 1

total_scores = []
best_non_null_entry = None
for entry in nbest:
  total_scores.append(entry.start_logit + entry.end_logit)
  if not best_non_null_entry:
    if entry.text:
      best_non_null_entry = entry

probs = _compute_softmax(total_scores)

nbest_json = []

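Hello, I made the following change so that the code supports the version 2 format (with unanswerable questions). Why does it still fail with: __new__() missing 2 required positional arguments: 'start_index' and 'end_index'? The part I modified is as follows: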

# If we didn't include the empty option in the n-best, include it.

if FLAGS.version_2_with_negative:
  if "" not in seen_predictions:
    nbest.append(
        _NbestPrediction(
            text="", start_logit=null_start_logit,
            end_logit=null_end_logit, start_index=0, end_index=0))

ymcui commented 3 years ago

Which line raises the error?

dolphin-Jia commented 3 years ago

Traceback (most recent call last):
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 1449, in <module>
    flags.mark_flag_as_required("bert_config_file")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 1378, in main
    FLAGS.n_best_size, FLAGS.max_answer_length,
  File "/tf/NOC-QA/baseline/run_cmrc2018_drcd_baseline.py", line 963, in write_predictions
    if FLAGS.version_2_with_negative:
TypeError: __new__() missing 2 required positional arguments: 'start_index' and 'end_index'

ymcui commented 3 years ago

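That is strange. Does the definition of _NbestPrediction include these two arguments? Google's original run_squad.py does not have them. Try debugging around these two parameters, or skip this handling altogether and simply write an empty string at the end when no answer is found, then see whether the error persists.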
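The check suggested above can be sketched with the `_fields` attribute that every namedtuple class exposes. The `make_record` helper and the two example classes are hypothetical; they only illustrate how one call site could tolerate both the three-field and the five-field definition.

```python
import collections


def make_record(record_cls, **values):
    """Build a namedtuple, dropping keyword arguments it does not define."""
    accepted = {k: v for k, v in values.items() if k in record_cls._fields}
    return record_cls(**accepted)


# Three-field layout (no index fields), as described for the original script.
ThreeField = collections.namedtuple(
    "NbestPrediction", ["text", "start_logit", "end_logit"])

# Five-field layout with the two index fields discussed in this thread.
FiveField = collections.namedtuple(
    "NbestPrediction",
    ["text", "start_logit", "end_logit", "start_index", "end_index"])

candidate = dict(text="", start_logit=0.0, end_logit=0.0,
                 start_index=0, end_index=0)

short = make_record(ThreeField, **candidate)  # index fields are dropped
full = make_record(FiveField, **candidate)    # index fields are kept
```

Printing `_NbestPrediction._fields` in the script that actually runs is a quick way to confirm which definition is in effect.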

dolphin-Jia commented 3 years ago

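Thank you. I built my changes on top of your code, so I kept its start_index and end_index. I will try dropping these two arguments and see whether that works. Why did you add them originally? Are they necessary?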

ymcui commented 3 years ago

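You can do without the two indices. They were kept so that the index information could be written to the output file. If you don't need them, delete all the downstream code that refers to them.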

dolphin-Jia commented 3 years ago

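Thanks, I removed them and it now runs normally. By the way, have you open-sourced the MacBERT-large-extData-v2 model? It currently tops the cmrc2018 leaderboard (EM: 80.409 | F1: 93.768). I could not find it at https://github.com/ymcui/MacBERT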

ymcui commented 3 years ago

Look carefully at the organization listed on the leaderboard; it is not our submission.

dolphin-Jia commented 3 years ago

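Hello, I have three questions:
1. During training I noticed that the custom tokenization seems to have a problem. For example, the final prediction turns "2020-09-26 00:00-06:00" into "2020-09-2600:00-06:00". Have you run into this in your tests?
2. My dataset contains questions with multiple answers. For training, I repeated each such question once per answer, so that every copy of the question maps to exactly one answer. At validation time, however, all copies of the question receive the same answer. Is there a way to address this?
3. Your evaluate.py targets SQuAD 1.0. For SQuAD 2.0, the F1 score computation has to change. Could you share how you would approach the modification?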
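On the third question, one possible direction is the convention used by the official SQuAD 2.0 evaluation script: when either the reference or the prediction is empty (no answer), F1 is 1 if both are empty and 0 otherwise. The sketch below is not the repository's evaluate.py. The function names are hypothetical, and character-level tokenization (`tokenize=list`) is an assumption made for Chinese text.

```python
import collections


def f1_with_no_answer(gold_tokens, pred_tokens):
    """Token-overlap F1 that scores empty answers as exact agreement."""
    if not gold_tokens or not pred_tokens:
        # No-answer case: full credit only if both sides are empty.
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(gold_answers, prediction, tokenize=list):
    """Max F1 over references; an empty reference list means unanswerable."""
    references = gold_answers or [""]
    pred_tokens = tokenize(prediction)
    return max(f1_with_no_answer(tokenize(g), pred_tokens) for g in references)
```

With this convention, `best_f1([], "")` gives full credit for correctly abstaining, while any non-empty prediction on an unanswerable question scores 0.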

ymcui / cmrc2018

__new__() missing 2 required positional arguments: 'start_index' and 'end_index' #12
