ArneBinder / pytorch-ie

PyTorch-IE: State-of-the-art Information Extraction in PyTorch
MIT License

Error calling example RE pipeline: AssertionError: No argument markers available, was `prepare` already called? #259

Closed slbayer closed 1 year ago

slbayer commented 1 year ago

Version 0.13.0. Sample code is essentially identical to https://huggingface.co/spaces/pie/Joint-NER-and-Relation-Extraction/blob/main/app.py:

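# Imports and document type are assumed to match the linked app.py.
from dataclasses import dataclass

from pytorch_ie.annotations import BinaryRelation, LabeledSpan
from pytorch_ie.auto import AutoPipeline
from pytorch_ie.core import AnnotationList, annotation_field
from pytorch_ie.documents import TextDocument


@dataclass
class ExampleDocument(TextDocument):
    entities: AnnotationList[LabeledSpan] = annotation_field(target="text")
    relations: AnnotationList[BinaryRelation] = annotation_field(target="entities")


def main(text):
    ner_model_name_or_path = "pie/example-ner-spanclf-conll03"
    re_model_name_or_path = "pie/example-re-textclf-tacred"

    # device=-1 runs both pipelines on the CPU.
    ner_pipeline = AutoPipeline.from_pretrained(ner_model_name_or_path, device=-1)
    re_pipeline = AutoPipeline.from_pretrained(re_model_name_or_path, device=-1)

    document = ExampleDocument(text)

    # Step 1: predict entities.
    ner_pipeline(document)

    # Step 2: promote predicted entities to regular annotations so the RE model can use them.
    while len(document.entities.predictions) > 0:
        entity = document.entities.predictions.pop(0)
        print(entity)
        document.entities.append(entity)

    # Step 3: predict relations between the entities (this call raises the error).
    re_pipeline(document)
    for relation in document.relations.predictions:
        print(relation.head, relation.tail, relation.label)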

Running this on a simple text document gives me an error in re_pipeline():

Traceback (most recent call last):
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/test.py", line 53, in <module>
    main(txt)
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/test.py", line 37, in main
    re_pipeline(document)
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/pipeline.py", line 342, in __call__
    model_inputs = self.preprocess(documents, **preprocess_params)
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/pipeline.py", line 216, in preprocess
    encodings = self.taskmodule.encode(
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/core/taskmodule.py", line 339, in encode
    cur_task_encodings, cur_documents_in_order = self.batch_encode(
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/core/taskmodule.py", line 247, in batch_encode
    task_encodings, documents_in_order = self.encode_inputs(
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/core/taskmodule.py", line 378, in encode_inputs
    possible_task_encodings = self.encode_input(document, is_training)
  File "/Users/sam/Projects/KBP-FtMeade-22/extractors/pytorch-ie/pytorch-ie-venv/lib/python3.9/site-packages/pytorch_ie/taskmodules/transformer_re_text_classification.py", line 259, in encode_input
    assert (
AssertionError: No argument markers available, was `prepare` already called?

My setup is Python 3.9 on macOS, in a clean venv in which I ran:

pip install pytorch-ie

I then encountered an error when importing pytorch-ie in my code:

ImportError: cannot import name 'CommitOperationAdd' from 'huggingface_hub' (/[deleted]/pytorch-ie-venv/lib/python3.9/site-packages/huggingface_hub/__init__.py)

The installed huggingface_hub version was 0.5.1. I ran pip install --upgrade huggingface-hub, which warned me that pytorch-ie expects nothing higher than 0.5.1. The upgrade got me past the import error, only to hit the assertion error above.
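For anyone reproducing this, it may help to record which versions actually ended up in the venv after the forced upgrade. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for illustration, and the two distribution names are the ones mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not part of pytorch-ie.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used with pip in this report.
    for name in ("pytorch-ie", "huggingface-hub"):
        print(name, installed_version(name) or "not installed")
```

Running it before and after the upgrade shows whether the pinned huggingface-hub version was really replaced.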

Thanks in advance.

slbayer commented 1 year ago

I dug into this some more. The argument markers are set in the task module's _post_prepare() method, which is called by prepare(). When I call prepare() directly, it complains that "The taskmodule is already prepared, do not prepare again." It nevertheless goes on to run _post_prepare(), and after that the pipeline runs without error. I think something is wrong in the prepare logic of this task module.
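To make the observed behaviour concrete, here is a toy model of it. This is not the pytorch-ie code: the class ToyTaskModule, the prepared flag, and the marker strings are all hypothetical, and the idea that a loaded module is flagged as prepared without its derived state being rebuilt is only an assumption that would explain the symptoms.

```python
import warnings


class ToyTaskModule:
    """Hypothetical stand-in that mimics the symptoms; not the real task module."""

    def __init__(self, prepared=False):
        # Assumption: a module restored from a saved config is flagged as
        # prepared, but state derived in _post_prepare() was never rebuilt.
        self._prepared = prepared
        self.argument_markers = None

    def _post_prepare(self):
        # Placeholder marker strings, chosen only for illustration.
        self.argument_markers = ["[H]", "[/H]", "[T]", "[/T]"]

    def prepare(self):
        if self._prepared:
            # Warns, but still falls through to _post_prepare() below.
            warnings.warn("The taskmodule is already prepared, do not prepare again.")
        else:
            self._prepared = True
        self._post_prepare()

    def encode_input(self, text):
        assert self.argument_markers is not None, (
            "No argument markers available, was `prepare` already called?"
        )
        return f"{self.argument_markers[0]} {text}"
```

With prepared=True, calling encode_input() first raises the AssertionError, while calling prepare() beforehand emits the warning and then lets encode_input() succeed, which matches what I see.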

ArneBinder commented 1 year ago

@slbayer Thanks a lot for reporting this! It is indeed a bug caused by a recent refactor. It should be fixed by #260. Can you try installing this branch and check if it works for you?