Error running Pipeline with BasicReferenceRecognizer

xesaad commented 2 years ago

Hi there! I am a new and frequent user of this great package, which also comes with a few inevitable GitHub issues 😅

When I initialize the pipeline as follows:

name = "absa/classifier-rest-0.2"
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = BertTokenizer.from_pretrained(name)
reference_recognizer = absa.aux_models.BasicReferenceRecognizer()
professor = absa.Professor(reference_recognizer) 
nlp = absa.Pipeline(model=model, tokenizer=tokenizer, professor=professor)

I receive the following error:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_514/72277120.py in <module>
      2 model = absa.BertABSClassifier.from_pretrained(name)
      3 tokenizer = BertTokenizer.from_pretrained(name)
----> 4 reference_recognizer = absa.aux_models.BasicReferenceRecognizer()
      5 professor = absa.Professor(reference_recognizer)
      6 nlp = absa.Pipeline(model=model, tokenizer=tokenizer, professor=professor)

TypeError: __init__() missing 1 required positional argument: 'weights'

I realise this is because the BasicReferenceRecognizer needs to be trained in order to select weights. This leads me to two questions/issues:

1. The BasicReferenceRecognizer class has no train method. Is there another way to train it, or a way to load a pretrained model from the package? In the unit tests for the BasicReferenceRecognizer I found two pre-trained models, 'absa/basic_reference_recognizer-rest-0.1' and 'absa/basic_reference_recognizer-lapt-0.1', but on trying to initialize with these I received an ImportError.

2. I also tried directly initializing the BasicReferenceRecognizer with weights=(-0.025, 44) as is done in this line. However, upon making predictions I get an error in the Pipeline at the postprocess step:


TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_514/3162923628.py in <module>
      3 for row in df.itertuples():
      4     print(row)
----> 5     prediction = predict(row.Review, row.Aspect)
      6     sentiment = get_sentiment(prediction)
      7     certainty_score = get_certainty_score(prediction)

/tmp/ipykernel_514/1002360698.py in predict(text, aspect)
     16 output_batch = nlp.predict(input_batch)
     17 predictions = nlp.review(tokenized_examples, output_batch)
---> 18 completed_task = nlp.postprocess(task, predictions)
     19 completed_subtask = completed_task.subtasks[aspect]
     20 return completed_subtask

/pyenv/versions/3.8.5/envs/seo-advice-page/lib/python3.8/site-packages/aspect_based_sentiment_analysis/pipelines.py in postprocess(task, batch_examples)
    301 aspect, = {e.aspect for e in examples}
    302 scores = np.max([e.scores for e in examples], axis=0)
--> 303 scores /= np.linalg.norm(scores, ord=1)
    304 sentiment_id = np.argmax(scores).astype(int)
    305 aspect_document = CompletedSubTask(

TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''


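I believe that this error comes from a dtype mismatch between `int` and `float`: the in-place division on line 303 produces floats, which cannot be written back into an integer `scores` array. If instead I initialize with `weights = (1,1)`, for example, I receive no error.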
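
The casting failure can be reproduced with NumPy alone, outside the pipeline. The following is a minimal sketch, not code from the package: `normalize_in_place` is a hypothetical helper that mirrors the in-place division in `postprocess`, and the score values are made up for illustration.

```python
import numpy as np


def normalize_in_place(scores):
    """Hypothetical helper mirroring `scores /= np.linalg.norm(scores, ord=1)`."""
    scores /= np.linalg.norm(scores, ord=1)
    return scores


# Integer inputs give an integer array, so the float result of the
# division cannot be stored back in place.
int_scores = np.max([[0, 1, 2], [3, 0, 1]], axis=0)
try:
    normalize_in_place(int_scores)
    raised = False
except TypeError:
    raised = True

# The same values as floats normalise without error.
float_scores = np.max([[0.0, 1.0, 2.0], [3.0, 0.0, 1.0]], axis=0)
normalized = normalize_in_place(float_scores)
```

Here `raised` ends up `True` for the integer array, while the float array is normalised so that its entries sum to 1.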

I wanted to flag these issues for your awareness. Thank you very much for any advice you can provide 😄

xesaad commented 2 years ago

Update: I believe that this issue is due to the following: if the BasicReferenceRecognizer does not detect an aspect, the professor component sets scores = [0,0,0], which is a list of integers. When scores is then normalised in place by dividing by its norm, the error is raised because the float result cannot be stored back into an integer array (of course, since the norm of an all-zero scores array is 0, there is also a division by zero lurking here!)

Suggestion:

In this line, redefine scores = [0.0, 0.0, 0.0].
For extra caution, in this line define scores = np.max([e.scores for e in examples], axis=0).astype(float).
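
The two suggestions above could be combined roughly as follows. This is only a sketch under assumptions, not the maintainers' fix: `normalize_scores` is a hypothetical helper, and leaving an all-zero score vector unnormalised when the norm is 0 is an assumption about the desired behaviour.

```python
import numpy as np


def normalize_scores(example_scores):
    """Hypothetical helper: element-wise max over examples, then L1-normalise.

    `example_scores` is a sequence of per-example score vectors.
    """
    # Cast to float so the in-place division below is always allowed.
    scores = np.max(example_scores, axis=0).astype(float)
    norm = np.linalg.norm(scores, ord=1)
    # Assumption: skip the division when every score is zero.
    if norm > 0:
        scores /= norm
    return scores


all_zero = normalize_scores([[0, 0, 0]])
mixed = normalize_scores([[1, 1, 2], [0, 3, 0]])
sentiment_id = int(np.argmax(mixed))
```

With the cast in place, integer inputs such as `[0, 0, 0]` no longer trigger the casting error, and the guard avoids dividing by a zero norm.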

I tried to open a PR implementing these suggestions myself, but unfortunately I don't have permission to push to this repository. I hope that they help with resolving this issue!

abaveja313 commented 1 year ago

Thank you for the suggestion! Fixed my problem @xesaad

ScalaConsultants / Aspect-Based-Sentiment-Analysis

Error running Pipeline with BasicReferenceRecognizer #60