mandarjoshi90 / coref

BERT for Coreference Resolution
Apache License 2.0

Unable to check Evaluation results #53

Closed Hafsa-Masroor closed 4 years ago

Hafsa-Masroor commented 4 years ago

I want to test and evaluate the performance of this model on the GAP dataset. I'm using the Google Colab notebook below to avoid machine-dependency issues. https://colab.research.google.com/drive/1SlERO9Uc9541qv6yH26LJz5IM9j7YVra#scrollTo=H0xPknceFORt

How can I view the results of evaluation metrics as shown in the mentioned research paper?

Secondly, I tried running the command ! GPU=0 python evaluate.py $CHOSEN_MODEL in Colab, assuming it would generate the evaluation results, but I'm getting the error below:

..
..
..
W0518 20:39:24.163360 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W0518 20:39:24.184890 140409457641344 deprecation_wrapper.py:119] From /content/coref/optimization.py:64: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

bert:task 199 27
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-05-18 20:39:34.733225: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-05-18 20:39:34.735705: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-05-18 20:39:34.735782: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (fcef0d45f3e2): /proc/driver/nvidia/version does not exist
2020-05-18 20:39:34.736239: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-18 20:39:34.750671: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200160000 Hz
2020-05-18 20:39:34.751024: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x78bf9c0 executing computations on platform Host. Devices:
2020-05-18 20:39:34.751074: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
Restoring from ./spanbert_base/model.max.ckpt
2020-05-18 20:39:40.322311: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
W0518 20:39:47.165590 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "evaluate.py", line 26, in <module>
    model.evaluate(session, official_stdout=True, eval_mode=True)
  File "/content/coref/independent.py", line 538, in evaluate
    self.load_eval_data()
  File "/content/coref/independent.py", line 532, in load_eval_data
    with open(self.config["eval_path"]) as f:
FileNotFoundError: [Errno 2] No such file or directory: './dev.english.384.jsonlines'

Any idea what the possible reason could be? I am new to this area/environment and am following the Colab code for now.

I'm looking for suggestions / steps to complete the evaluation part.

Thanks in advance!

Hafsa-Masroor commented 4 years ago

@mandarjoshi90

Hafsa-Masroor commented 4 years ago

UPDATE:

Now using WSL on my machine, I ran the above command GPU=0 python evaluate.py spanbert_base and am again getting the same error: FileNotFoundError: [Errno 2] No such file or directory: './dev.english.384.jsonlines'

So my question is: how can I get this missing jsonlines file to perform evaluation? The file path is mentioned in experiments.conf, but I'm unable to locate the actual file.

I am just using the pre-trained spanbert_base model to test on my own data. The output was generated successfully in another text file after running GPU=0 python predict.py spanbert_base input_data.jsonlines output_data.txt, and now I want to evaluate the results. How can I do so, given the issue above?

@mandarjoshi90 Please advise. Thanks!

mandarjoshi90 commented 4 years ago

That's the processed version of the OntoNotes dev set. You can get OntoNotes here: https://catalog.ldc.upenn.edu/LDC2013T19

The README has instructions on processing OntoNotes; see the section called Setup for training.

If you have your own data that you want to make predictions on, please replace the path with your file (and follow the format in the README).
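For the path part, the relevant keys live in experiments.conf (HOCON-style config). A hedged sketch of pointing the eval at your own file might look like this; the key names come up in this thread, but the values here are placeholders:

```
# experiments.conf (HOCON) -- inside your experiment block, e.g. spanbert_base
eval_path = my_data.384.jsonlines   # replaces ./dev.english.384.jsonlines
conll_eval_path = my_data.conll     # only needed for CoNLL-style F1 scoring
max_segment_len = 384               # segments in eval_path must fit this length
```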

{
  "clusters": [], # leave this blank
  "doc_key": "nw", # key closest to your domain. "nw" is newswire. See the OntoNotes documentation.
  "sentences": [["[CLS]", "subword1", "##subword1", ".", "[SEP]"]], # list of BERT tokenized segments. Each segment should be less than the max_segment_len in your config
  "speakers": [["[SPL]", "-", "-", "-", "[SPL]"]], # speaker information for each subword in sentences
  "sentence_map": [0, 0, 0, 0, 0], # flat list where each element is the sentence index of the subwords
  "subtoken_map": [0, 0, 0, 1, 1]  # flat list containing original word index for each subword. [CLS]  and the first word share the same index
}

Hafsa-Masroor commented 4 years ago

@mandarjoshi90

The OntoNotes data is not available to me right now, so I'm just using my own data to test with the pretrained model (without explicitly training it). Is that possible in this case? I'm assuming the results on my data can be evaluated with your pre-trained model.

Secondly, what is conll_eval_path, and what should it be replaced with? The same error occurs for this path. The README says: When running on test, change the eval_path and conll_eval_path from dev to test. But right now I only have a single file, input_data.jsonlines, which contains the input data in the required format.

Hafsa-Masroor commented 4 years ago

Hi @ereday No, I haven't been able to do the evaluation yet. Since the OntoNotes dataset is not publicly available, I am using the GAP dataset from https://github.com/google-research-datasets/gap-coreference. I was able to run prediction on its gap-test data, but could not evaluate the model due to the missing-file error I shared above. Please let me know if you have any idea or workaround for this.

Thanks!

mandarjoshi90 commented 4 years ago

Sorry, I'm only getting to this now. GAP uses accuracy for evaluation, as opposed to F1 for OntoNotes. If you've been able to make predictions, you can convert them to TSV files and call the GAP scorer script.

python to_gap_tsv.py <prediction_jsonline_file> <tsv_output_file> should do the trick.

Hafsa-Masroor commented 4 years ago

Hi @mandarjoshi90 Which files should these actually be, and what data/format do they contain? I tried running the command with <prediction_jsonline_file> replaced by the file generated by predict.py <experiment> <input_file> <output_file>, and <tsv_output_file> replaced by the gap-test.tsv file from https://github.com/google-research-datasets/gap-coreference, which contains the input sentences. But I'm getting a JSON parsing KeyError: 'nw', which suggests the files I used are somehow incorrect, since they do not contain the doc_key 'nw'. Please clarify exactly which two files I should pass to to_gap_tsv.py if not these.

Error for your reference:

Traceback (most recent call last):
  File "to_gap_tsv.py", line 74, in <module>
    convert(json_file, tsv_file)
  File "to_gap_tsv.py", line 48, in convert
    print(list(enumerate(tsv[key])))
KeyError: 'nw'

mandarjoshi90 commented 4 years ago

Sorry, please try: python to_gap_tsv.py <prediction_jsonline_file>

Hafsa-Masroor commented 4 years ago

Hi @mandarjoshi90 Now this gives me the error:

Traceback (most recent call last):
  File "to_gap_tsv.py", line 74, in <module>
    convert(json_file, tsv_file)
  File "to_gap_tsv.py", line 54, in convert
    pronoun_cluster = find_pronoun_cluster(prediction, prediction['pronoun_subtoken_span'])
KeyError: 'pronoun_subtoken_span'

which clearly indicates that the required JSON fields are not present in the file I already shared with you via email. It seems that either this command or the contents of this JSON file are incorrect. To repeat, I'm using exactly the file generated by GPU=0 python predict.py <experiment> <input_file> <output_file>. So, is that okay, or does it require some additional step or amendment (which I highly doubt)?

Furthermore:

  1. Please advise what might be causing this error (an incorrect file, or something missing) and how to fix it.
  2. There is another script, gap_to_jsonlines.py, in this project. Could you explain in detail the purpose of both gap_to_jsonlines.py and to_gap_tsv.py, and when/how to use them with the gap-coreference project?

Looking forward to your quick response on these queries. Thanks in advance!

mandarjoshi90 commented 4 years ago

I'm not sure I understand. How did you generate the input file for predict.py? Did you not use gap_to_jsonlines.py? If not, that error makes sense. Here's the full pipeline in case it wasn't clear:

#!/bin/bash
gap_file_prefix=$1
vocab_file=$2
python gap_to_jsonlines.py $gap_file_prefix.tsv $vocab_file
GPU=0 python predict.py bert_base $gap_file_prefix.jsonlines $gap_file_prefix.output.jsonlines
python to_gap_tsv.py $gap_file_prefix.output.jsonlines
python2 ../gap-coreference/gap_scorer.py --gold_tsv $gap_file_prefix.tsv --system_tsv $gap_file_prefix.output.tsv

$1 / $gap_file_prefix points to the path of the original GAP file without the .tsv extension.

Hafsa-Masroor commented 4 years ago

@mandarjoshi90 Thanks a lot for clarifying these steps. No, I used the Colab notebook (linked from this project) to generate the input JSON file instead of gap_to_jsonlines.py. I understand the pipeline now and will try again following it.

liyaoshigehaoren commented 4 years ago

There was an error when I ran the file "download_pretrained.sh". Could you please send me a copy of the download? Thanks a lot.

Hafsa-Masroor commented 4 years ago

@liyaoshigehaoren What kind of error are you seeing when downloading it? An alternative is to download the tar file directly from the URL given in that file (http://nlp.cs.washington.edu/pair2vec/[model_name].tar.gz) and then extract it with download_pretrained.sh after removing the wget statement, though data_dir and the command parameters need to be set appropriately.

liyaoshigehaoren commented 4 years ago

When I opened the link, it looked like this, so I couldn't download it. I don't know what this is about.


liyaoshigehaoren commented 4 years ago

Is your server down? The link in the file "setup_training" cannot be opened; see this screenshot.

I don't know what the situation is. Can you help me?


mandarjoshi90 commented 4 years ago

I don't see a screenshot in your post. But ./download_pretrained.sh <model_name> is working for me. <model_name> here is one of bert_base, spanbert_base, bert_large, spanbert_large.