lavis-nlp / jerex

PyTorch code for JEREX: Joint Entity-Level Relation Extractor
MIT License

What's the best way to use the joint model to infer entities, and relations among them, in a fresh new phrase? #7

Closed raphael10-collab closed 3 years ago

raphael10-collab commented 3 years ago

What's the best way to use the joint model to infer entities, and relations among them, in a fresh new phrase, for example for this phrase:

 " I-B is one of the sub-committees that advises the Scientific Advisory Group for Emergencies (Sage), led by Sir Patrick 
Vallance, the chief scientific adviser"

?

markus-eberts commented 3 years ago

Currently, there is no dedicated inference mode for JEREX. If you just want to detect entity mentions and relations in sentences, you could use SpERT, which has a built-in inference mode. For JEREX, there is a workaround: You can just use the test mode ('python ./jerex_test.py --config-path ...') instead. The predictions are then saved to disk. Of course you need to tokenize your sentences first (e.g. with SpaCy) and then convert your tokenized sentences to the DocRED JSON input format (e.g. see 'test_joint.json', just leave the 'vertexSet' and 'labels' list empty).

raphael10-collab commented 3 years ago

@markus-eberts
Based on this template https://github.com/thunlp/DocRED/tree/master/data I wrote this fromSentenceToDocREDinputFormat.py file:

import json
import spacy
from spacy.lang.en import English
nlp = spacy.load("en_core_web_sm")
text = " I-B is one of the sub-committees that advises the Scientific Advisory Group for Emergencies (Sage), led by Sir Patrick Vallance, the chief scientific adviser. This is another sentence"
doc = nlp(text)
assert doc.has_annotation("SENT_START")

sents_array = []
for sent in doc.sents:
    #print(sent.text)
    sents_array.append(sent.text)

tokenizer = nlp.tokenizer

stopwords = ["(", ")", ","]

sents_in_tokens = [ ]

for sent in sents_array:
    # tokenizer.explain returns (rule, token_text) pairs for each token
    tokenized_sent = tokenizer.explain(sent)
    tokens_array = []
    for _, word in tokenized_sent:
        # skip the punctuation tokens listed in stopwords
        if word in stopwords:
            continue
        tokens_array.append(word)

    sents_in_tokens.append(tokens_array)

docred_data = {}
sents = []
vertexSet = []
labels = []

with open('/home/raphy/jerex/data/datasets/docred_joint/DocREDinputFormat.json', 'w') as json_file:
    try:
        docred_data['title'] = 'title'
        docred_data['sents'] = sents
        docred_data['vertexSet'] = vertexSet
        docred_data['labels'] = labels
        for tokenized_sent in sents_in_tokens:
          docred_data['sents'].append(tokenized_sent)
        json.dump(docred_data, json_file)
    except (KeyError, FileNotFoundError):
        raise EnvironmentError(
            "Your json_file is either missing, or incomplete. "
        )

The resulting DocREDinputFormat.json is the following: https://codebeautify.org/online-json-editor/cbec9560

test.yaml :

dataset:
  #test_path: ./data/datasets/docred_joint/test_joint.json
  test_path: ./data/datasets/docred_joint/DocREDinputFormat.json

model:
  model_path: ./data/models/docred_joint/joint_multi_instance/model.ckpt
  tokenizer_path: ./data/models/docred_joint/joint_multi_instance
  encoder_config_path: ./data/models/docred_joint/joint_multi_instance

inference:
  test_batch_size: 1

distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false

hydra:
  run:
    dir: ./data/runs/${now:%Y-%m-%d}/${now:%H-%M-%S}
  output_subdir: run_config

I get this error:

(base) raphy@pc:~/jerex$ python ./jerex_test.py --config-path configs/docred_joint
./jerex_test.py:28: UserWarning: 
'test' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  train()
dataset:
  test_path: ./data/datasets/docred_joint/DocREDinputFormat.json
model:
  model_path: ./data/models/docred_joint/joint_multi_instance/model.ckpt
  tokenizer_path: ./data/models/docred_joint/joint_multi_instance
  encoder_config_path: ./data/models/docred_joint/joint_multi_instance
  mention_threshold: null
  coref_threshold: null
  rel_threshold: null
inference:
  valid_batch_size: 1
  test_batch_size: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false
misc:
  store_predictions: false
  store_examples: false
  flush_logs_every_n_steps: 1000
  log_every_n_steps: 1000
  deterministic: false
  seed: null
  cache_path: null
  precision: 32
  profiler: null
  final_valid_evaluate: false

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
Parse dataset '/home/raphy/jerex/data/datasets/docred_joint/DocREDinputFormat.json':   0%|                                                                                            | 0/4 [00:00<?, ?it/s]
Error executing job with overrides: []
Traceback (most recent call last):
  File "./jerex_test.py", line 28, in <module>
    train()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/main.py", line 49, in decorated_main
    _run_hydra(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
    run_and_report(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
    raise ex
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 368, in <lambda>
    lambda: hydra.run(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 110, in run
    _ = ret.return_value
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "./jerex_test.py", line 24, in train
    model.test(cfg)
  File "/home/raphy/jerex/jerex/model.py", line 388, in test
    data_module.setup('test')
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 384, in wrapped_fn
    return fn(*args, **kwargs)
  File "/home/raphy/jerex/jerex/data_module.py", line 103, in setup
    self._test_dataset = DocREDDataset(dataset_path=self._test_path,
  File "/home/raphy/jerex/jerex/datasets.py", line 50, in __init__
    self._parse_dataset(dataset_path)
  File "/home/raphy/jerex/jerex/datasets.py", line 61, in _parse_dataset
    self._parse_document(document)
  File "/home/raphy/jerex/jerex/datasets.py", line 64, in _parse_document
    title = doc['title']
TypeError: string indices must be integers
(base) raphy@pc:~/jerex$ 

How to fix the building of the json file?

markus-eberts commented 3 years ago

Hi @raphael10-collab

Your JSON file must contain a list of documents (as in the original DocRED datasets). So just encapsulate your docred_data in a list (e.g. json.dump([docred_data], json_file)) and you should be fine.
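As a minimal sketch of that fix, assuming the sentences are already tokenized (e.g. with spaCy as in the script above): the helper names `build_docred_document` and `dump_docred_dataset` are hypothetical and not part of JEREX. The traceback suggests the parser iterates over the top-level JSON value, so a bare dict hands it string keys instead of documents.

```python
import json


def build_docred_document(tokenized_sents, title="untitled"):
    """Return one DocRED-style document with empty annotation lists."""
    return {
        "title": title,
        "sents": [list(sent) for sent in tokenized_sents],
        "vertexSet": [],  # left empty: no gold entities for new text
        "labels": [],     # left empty: no gold relations for new text
    }


def dump_docred_dataset(documents, path):
    """Write a list of documents, i.e. a JSON array at the top level."""
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(list(documents), json_file)


sents = [
    ["Sage", "is", "led", "by", "Sir", "Patrick", "Vallance", "."],
    ["This", "is", "another", "sentence", "."],
]
doc = build_docred_document(sents, title="example")

# Iterating over a bare dict yields its string keys, which is consistent
# with doc['title'] failing with "string indices must be integers".
assert all(isinstance(item, str) for item in doc)

# Wrapped in a list, every item the parser sees is a document dict.
dataset = [doc]
assert all(isinstance(item, dict) for item in dataset)
```

Calling `dump_docred_dataset([doc], path)` with the path from your test.yaml then writes a file whose top level is a list with a single document.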

raphael10-collab commented 3 years ago

After encapsulating docred_data within a list, I get these errors:

(base) raphy@pc:~/jerex$ python ./jerex_test.py --config-path configs/docred_joint
./jerex_test.py:28: UserWarning: 
'test' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  train()
dataset:
  test_path: ./data/datasets/docred_joint/DocREDinputFormat.json
model:
  model_path: ./data/models/docred_joint/joint_multi_instance/model.ckpt
  tokenizer_path: ./data/models/docred_joint/joint_multi_instance
  encoder_config_path: ./data/models/docred_joint/joint_multi_instance
  mention_threshold: null
  coref_threshold: null
  rel_threshold: null
inference:
  valid_batch_size: 1
  test_batch_size: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false
misc:
  store_predictions: false
  store_examples: false
  flush_logs_every_n_steps: 1000
  log_every_n_steps: 1000
  deterministic: false
  seed: null
  cache_path: null
  precision: 32
  profiler: null
  final_valid_evaluate: false

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
Parse dataset '/home/raphy/jerex/data/datasets/docred_joint/DocREDinputFormat.json': 100%|█| 1/1 [00
Testing: 0it [00:00, ?it/s]/home/raphy/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
Testing: 100%|████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.65it/s]/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/deprecated_api.py:81: LightningDeprecationWarning: Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.
  rank_zero_deprecation("Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.")
Evaluation

--- Entity Mentions ---

/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Clusters (Coreference Resolution) ---

/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Entities ---

/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 due to no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/raphy/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
                type    precision       recall     f1-score      support
                 PER         0.00         0.00         0.00          0.0
                 ORG         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Relations ---
Without entity type

Error executing job with overrides: []
Traceback (most recent call last):
  File "./jerex_test.py", line 28, in <module>
    train()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/main.py", line 49, in decorated_main
    _run_hydra(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
    run_and_report(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
    raise ex
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/utils.py", line 368, in <lambda>
    lambda: hydra.run(
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 110, in run
    _ = ret.return_value
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "./jerex_test.py", line 24, in train
    model.test(cfg)
  File "/home/raphy/jerex/jerex/model.py", line 389, in test
    trainer.test(model, datamodule=data_module)
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 581, in test
    results = self._run(model)
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
    self.dispatch()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 795, in dispatch
    self.accelerator.start_evaluating(self)
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 99, in start_evaluating
    self.training_type_plugin.start_evaluating(trainer)
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 148, in start_evaluating
    self._results = trainer.run_stage()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 806, in run_stage
    return self.run_evaluate()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1049, in run_evaluate
    eval_loop_results = self.run_evaluation()
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 993, in run_evaluation
    self.evaluation_loop.evaluation_epoch_end(outputs)
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 208, in evaluation_epoch_end
    model.test_epoch_end(outputs)
  File "/home/raphy/jerex/jerex/model.py", line 155, in test_epoch_end
    metrics = self._evaluator.compute_metrics(self._eval_test_gt, predictions)
  File "/home/raphy/jerex/jerex/evaluation/joint_evaluator.py", line 97, in compute_metrics
    rel_eval = scoring.score(gt_relations, pred_relations, type_idx=2, print_results=True)
  File "/home/raphy/jerex/jerex/evaluation/scoring.py", line 47, in score
    labels, labels_str = zip(*[(l.index, l.short_name) for l in labels])
ValueError: not enough values to unpack (expected 2, got 0)
Exception ignored in: <function tqdm.__del__ at 0x7fc4a7bc4af0>
Traceback (most recent call last):
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1145, in __del__
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1299, in close
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1492, in display
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1148, in __str__
  File "/home/raphy/anaconda3/lib/python3.8/site-packages/tqdm/std.py", line 1450, in format_dict
TypeError: cannot unpack non-iterable NoneType object
(base) raphy@pc:~/jerex$

markus-eberts commented 3 years ago

Thanks for noticing. I added handling for this corner case (no predictions and no ground truth) in commit 569456298b1099a51225fc31085754e3ee7cf787. Does it work for you?
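For illustration only, here is a sketch of why the unpacking in scoring.py failed on an empty label list and one way to guard against it. This is not the code from the commit; `Label` and `split_labels` are hypothetical stand-ins.

```python
from collections import namedtuple

# Hypothetical stand-in for the label objects used during scoring.
Label = namedtuple("Label", ["index", "short_name"])


def split_labels(labels):
    """Return (indices, names); an empty input yields two empty tuples."""
    pairs = [(label.index, label.short_name) for label in labels]
    if not pairs:
        # zip(*[]) yields nothing, so unpacking it into two names fails.
        return (), ()
    indices, names = zip(*pairs)
    return indices, names


# Reproduce the original failure mode with an empty list.
message = None
try:
    first, second = zip(*[])
except ValueError as err:
    message = str(err)

assert message is not None
assert split_labels([]) == ((), ())
assert split_labels([Label(0, "PER"), Label(1, "ORG")]) == ((0, 1), ("PER", "ORG"))
```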

raphael10-collab commented 3 years ago

This is the output:

(base) raphy@pc:~/jerex$ python ./jerex_test.py --config-path configs/docred_joint
./jerex_test.py:24: UserWarning: 
'test' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  train()
dataset:
  test_path: ./data/datasets/docred_joint/DocREDinputFormat.json
model:
  model_path: ./data/models/docred_joint/joint_multi_instance/model.ckpt
  tokenizer_path: ./data/models/docred_joint/joint_multi_instance
  encoder_config_path: ./data/models/docred_joint/joint_multi_instance
  mention_threshold: null
  coref_threshold: null
  rel_threshold: null
inference:
  valid_batch_size: 1
  test_batch_size: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false
misc:
  store_predictions: false
  store_examples: false
  flush_logs_every_n_steps: 1000
  log_every_n_steps: 1000
  deterministic: false
  seed: null
  cache_path: null
  precision: 32
  profiler: null
  final_valid_evaluate: false

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
Parse dataset '/home/raphy/jerex/data/datasets/docred_joint/DocREDinputFormat.json': 100%|█| 1/1 [00:00<00:00, 483.72it/s]
Testing: 0it [00:00, ?it/s]/home/raphy/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
Testing: 100%|██████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  8.42it/s]/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/deprecated_api.py:81: LightningDeprecationWarning: Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.
  rank_zero_deprecation("Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.")
Evaluation

--- Entity Mentions ---

                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Clusters (Coreference Resolution) ---

                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Entities ---

                type    precision       recall     f1-score      support
                 PER         0.00         0.00         0.00          0.0
                 ORG         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Relations ---
Without entity type

                type    precision       recall     f1-score      support
                None         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

With entity type

                type    precision       recall     f1-score      support
                None         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0
/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/deprecated_api.py:81: LightningDeprecationWarning: Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.
  rank_zero_deprecation("Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.")
Testing: 100%|██████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.48it/s]
TEST Profiler Report

Action                              |  Mean duration (s)    |Num calls          |  Total time (s)   |  Percentage %     |
---------------------------------------------------------------------------------------------------------------------------------------
Total                               |  -                |_                  |  0.28121          |  100 %            |
---------------------------------------------------------------------------------------------------------------------------------------
evaluation_step_and_end             |  0.076942         |1                  |  0.076942         |  27.361           |
test_step                           |  0.076789         |1                  |  0.076789         |  27.306           |
cache_result                        |  3.7033e-05       |11                 |  0.00040736       |  0.14486          |
on_test_batch_end                   |  0.00038818       |1                  |  0.00038818       |  0.13804          |
on_test_end                         |  0.00018897       |1                  |  0.00018897       |  0.067198         |
on_test_start                       |  0.00015455       |1                  |  0.00015455       |  0.054957         |
on_test_batch_start                 |  5.7719e-05       |1                  |  5.7719e-05       |  0.020525         |
test_step_end                       |  1.4702e-05       |1                  |  1.4702e-05       |  0.005228         |
on_test_epoch_end                   |  1.4433e-05       |1                  |  1.4433e-05       |  0.0051324        |
on_epoch_start                      |  8.963e-06        |1                  |  8.963e-06        |  0.0031872        |
on_epoch_end                        |  7.756e-06        |1                  |  7.756e-06        |  0.002758         |
on_test_epoch_start                 |  6.67e-06         |1                  |  6.67e-06         |  0.0023719        |
on_test_dataloader                  |  5.963e-06        |1                  |  5.963e-06        |  0.0021204        |
on_before_accelerator_backend_setup |  5.951e-06        |1                  |  5.951e-06        |  0.0021162        |

--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'coref_f1_macro': 0.0,
 'coref_f1_micro': 0.0,
 'coref_prec_macro': 0.0,
 'coref_prec_micro': 0.0,
 'coref_rec_macro': 0.0,
 'coref_rec_micro': 0.0,
 'entity_f1_macro': 0.0,
 'entity_f1_micro': 0.0,
 'entity_prec_macro': 0.0,
 'entity_prec_micro': 0.0,
 'entity_rec_macro': 0.0,
 'entity_rec_micro': 0.0,
 'mention_f1_macro': 0.0,
 'mention_f1_micro': 0.0,
 'mention_prec_macro': 0.0,
 'mention_prec_micro': 0.0,
 'mention_rec_macro': 0.0,
 'mention_rec_micro': 0.0,
 'rel_f1_macro': 0.0,
 'rel_f1_micro': 0.0,
 'rel_nec_f1_macro': 0.0,
 'rel_nec_f1_micro': 0.0,
 'rel_nec_prec_macro': 0.0,
 'rel_nec_prec_micro': 0.0,
 'rel_nec_rec_macro': 0.0,
 'rel_nec_rec_micro': 0.0,
 'rel_prec_macro': 0.0,
 'rel_prec_micro': 0.0,
 'rel_rec_macro': 0.0,
 'rel_rec_micro': 0.0}
--------------------------------------------------------------------------------

I wonder why no relations are extracted at all.

Even taking into account a whole text, the relations are not extracted:

(base) raphy@pc:~/jerex$ python jerex_test.py --config-path configs/docred_joint
jerex_test.py:24: UserWarning: 
'test' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  train()
dataset:
  test_path: ./data/datasets/docred_joint/DocREDinputFormat.json
model:
  model_path: ./data/models/docred_joint/joint_multi_instance/model.ckpt
  tokenizer_path: ./data/models/docred_joint/joint_multi_instance
  encoder_config_path: ./data/models/docred_joint/joint_multi_instance
  mention_threshold: null
  coref_threshold: null
  rel_threshold: null
inference:
  valid_batch_size: 1
  test_batch_size: 1
  max_spans: null
  max_coref_pairs: null
  max_rel_pairs: null
distribution:
  gpus: []
  accelerator: ''
  prepare_data_per_node: false
misc:
  store_predictions: false
  store_examples: false
  flush_logs_every_n_steps: 1000
  log_every_n_steps: 1000
  deterministic: false
  seed: null
  cache_path: null
  precision: 32
  profiler: null
  final_valid_evaluate: false

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
Parse dataset '/home/raphy/jerex/data/datasets/docred_joint/DocREDinputFormat.json': 100%|██████████████████████████████████| 1/1 [00:00<00:00, 39.11it/s]
Testing: 0it [00:00, ?it/s]/home/raphy/anaconda3/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
Testing: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.68s/it]/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/deprecated_api.py:81: LightningDeprecationWarning: Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.
  rank_zero_deprecation("Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.")
Evaluation

--- Entity Mentions ---

                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Clusters (Coreference Resolution) ---

                type    precision       recall     f1-score      support
              Binary         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Entities ---

                type    precision       recall     f1-score      support
                TIME         0.00         0.00         0.00          0.0
                 PER         0.00         0.00         0.00          0.0
                 ORG         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

--- Relations ---
Without entity type

                type    precision       recall     f1-score      support
                None         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0

With entity type

                type    precision       recall     f1-score      support
                None         0.00         0.00         0.00          0.0

               micro         0.00         0.00         0.00          0.0
               macro         0.00         0.00         0.00          0.0
/home/raphy/anaconda3/lib/python3.8/site-packages/pytorch_lightning/trainer/deprecated_api.py:81: LightningDeprecationWarning: Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.
  rank_zero_deprecation("Internal: `use_ddp` is deprecated in v1.2 and will be removed in v1.4.")
Testing: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.72s/it]
TEST Profiler Report

Action                              |  Mean duration (s)    |Num calls          |  Total time (s)   |  Percentage %     |
---------------------------------------------------------------------------------------------------------------------------------------
Total                               |  -                |_                  |  1.8697           |  100 %            |
---------------------------------------------------------------------------------------------------------------------------------------
evaluation_step_and_end             |  1.6065           |1                  |  1.6065           |  85.921           |
test_step                           |  1.6063           |1                  |  1.6063           |  85.913           |
cache_result                        |  4.0043e-05       |11                 |  0.00044048       |  0.023558         |
on_test_batch_end                   |  0.00033471       |1                  |  0.00033471       |  0.017902         |
on_test_end                         |  0.00020894       |1                  |  0.00020894       |  0.011175         |
on_test_start                       |  0.00016495       |1                  |  0.00016495       |  0.0088221        |
on_test_batch_start                 |  6.7178e-05       |1                  |  6.7178e-05       |  0.003593         |
on_epoch_start                      |  1.7505e-05       |1                  |  1.7505e-05       |  0.00093624       |
test_step_end                       |  1.6042e-05       |1                  |  1.6042e-05       |  0.00085799       |
on_test_epoch_end                   |  1.5931e-05       |1                  |  1.5931e-05       |  0.00085206       |
on_epoch_end                        |  1.0722e-05       |1                  |  1.0722e-05       |  0.00057346       |
on_before_accelerator_backend_setup |  7.64e-06         |1                  |  7.64e-06         |  0.00040862       |
on_test_epoch_start                 |  7.052e-06        |1                  |  7.052e-06        |  0.00037717       |
on_test_dataloader                  |  6.074e-06        |1                  |  6.074e-06        |  0.00032486       |

--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'coref_f1_macro': 0.0,
 'coref_f1_micro': 0.0,
 'coref_prec_macro': 0.0,
 'coref_prec_micro': 0.0,
 'coref_rec_macro': 0.0,
 'coref_rec_micro': 0.0,
 'entity_f1_macro': 0.0,
 'entity_f1_micro': 0.0,
 'entity_prec_macro': 0.0,
 'entity_prec_micro': 0.0,
 'entity_rec_macro': 0.0,
 'entity_rec_micro': 0.0,
 'mention_f1_macro': 0.0,
 'mention_f1_micro': 0.0,
 'mention_prec_macro': 0.0,
 'mention_prec_micro': 0.0,
 'mention_rec_macro': 0.0,
 'mention_rec_micro': 0.0,
 'rel_f1_macro': 0.0,
 'rel_f1_micro': 0.0,
 'rel_nec_f1_macro': 0.0,
 'rel_nec_f1_micro': 0.0,
 'rel_nec_prec_macro': 0.0,
 'rel_nec_prec_micro': 0.0,
 'rel_nec_rec_macro': 0.0,
 'rel_nec_rec_micro': 0.0,
 'rel_prec_macro': 0.0,
 'rel_prec_micro': 0.0,
 'rel_rec_macro': 0.0,
 'rel_rec_micro': 0.0}
--------------------------------------------------------------------------------
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/raphy/anaconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
(base) raphy@pc:~/jerex$ 

How could I tune the model in order to extract the relations as well?

markus-eberts commented 3 years ago

It's hard to tell why the model did not predict any relations in this case. Generally, relations such as 'employer' (which would be suitable in your example) have a pretty low recall on the DocRED dataset (at least for JEREX). This can have many causes, e.g. an imbalanced training dataset, noisy annotations, an insufficient model architecture, or hyperparameter settings. In general, the performance of current state-of-the-art models on the DocRED dataset is quite limited (around 60% F1 score for BERT-based models). You may achieve better performance by labeling a dataset with the subset of relations that you are interested in (but this is obviously very time-consuming).