IntelLabs / nlp-architect

A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
https://intellabs.github.io/nlp-architect
Apache License 2.0
2.94k stars, 448 forks

bug: ABSA: Passing spacy_model="fr_core_news_sm" to SentimentInference gives an error #201

Closed: shadiakiki1986 closed this issue 3 years ago

shadiakiki1986 commented 3 years ago

Describe the bug
When using ABSA, passing spacy_model="fr_core_news_sm" to SentimentInference gives an error.

Model/procedure: aspect-based sentiment analysis

To Reproduce
Steps to reproduce the behavior:

# in terminal
python -m spacy download fr_core_news_sm

# in python
from nlp_architect.models.absa.inference.inference import SentimentInference
inference = SentimentInference("aspects.csv", "opinions.csv", spacy_model="fr_core_news_sm")
inference.run(doc="je suis")

Expected behavior
Results are returned just as they are when spacy_model is omitted (the default is en_core_web_sm).

Environment setup:

Additional context
I'm trying to run ABSA on French text.

Update: error output is:

ValueError                                Traceback (most recent call last)
<ipython-input-9-0dbb8e611f69> in <module>()
----> 1 inference.run(doc="je suis")

5 frames
/usr/local/lib/python3.6/dist-packages/nlp_architect/models/absa/inference/inference.py in run(self, doc, parsed_doc)
    110             if not self.parser:
    111                 raise RuntimeError("Parser not initialized (try parse=True at init)")
--> 112             parsed_doc = self.parser.parse([doc])[0]
    113 
    114         sentiment_doc = None

/usr/local/lib/python3.6/dist-packages/nlp_architect/utils/text.py in parse(self, texts, output_dir)
    243         """
    244         if self.n_jobs == 1:
--> 245             return self.process_batch(texts, output_dir)
    246         partitions = minibatch(texts, size=self.batch_size)
    247         executor = Parallel(n_jobs=self.n_jobs, backend="multiprocessing", prefer="processes")

/usr/local/lib/python3.6/dist-packages/nlp_architect/utils/text.py in process_batch(self, texts, output_dir, batch_id)
    256                 doc
    257                 if self.spacy_doc
--> 258                 else CoreNLPDoc.from_spacy(doc, self.show_tok, self.show_doc, self.ptb_pos)
    259             )
    260             parsed_docs.append(parsed_doc)

/usr/local/lib/python3.6/dist-packages/nlp_architect/common/core_nlp_doc.py in from_spacy(spacy_doc, show_tok, show_doc, ptb_pos)
    240             cur_sent = []
    241             for tok in spacy_sent:
--> 242                 pos = _spacy_pos_to_ptb(tok.tag_, tok.text) if ptb_pos else tok.tag_
    243                 core_tok = {
    244                     "start": tok.idx,

/usr/local/lib/python3.6/dist-packages/nlp_architect/common/core_nlp_doc.py in _spacy_pos_to_ptb(pos, text)
     64         ptb_tag (str): Standard PTB POS tag.
     65     """
---> 66     validate((pos, str, 0, 30), (text, str, 0, 1000))
     67     ptb_tag = pos
     68     if text in ["...", "—"]:

/usr/local/lib/python3.6/dist-packages/nlp_architect/utils/io.py in validate(*args)
    175                 raise ValueError("{} {} must be greater or equal to {}".format(val, name, arg_min))
    176             if arg_max is not None and num >= arg_max:
--> 177                 raise ValueError("{} {} must be less than {}".format(val, name, arg_max))
    178 
    179 

ValueError: Length  must be less than 30
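
Reading the traceback, the exception appears to come from the length check applied to tok.tag_ in _spacy_pos_to_ptb, which requires the tag string to be shorter than 30 characters. The following is a minimal sketch of that check, not the library's actual code: validate_str_length is a hypothetical stand-in reconstructed from the traceback lines, and both tag strings are assumed examples (a short PTB-style tag, and a long morphology-style tag of the kind a non-English spaCy model might emit).

```python
def validate_str_length(value, arg_min, arg_max):
    """Hypothetical stand-in for the string-length check seen in the traceback."""
    if not isinstance(value, str):
        raise TypeError("expected a str")
    num = len(value)
    if arg_min is not None and num < arg_min:
        raise ValueError("Length must be greater or equal to {}".format(arg_min))
    if arg_max is not None and num >= arg_max:
        raise ValueError("Length must be less than {}".format(arg_max))


def check(tag):
    """Return "ok" if the tag passes the (0, 30) bounds, else the error message."""
    try:
        validate_str_length(tag, 0, 30)
        return "ok"
    except ValueError as err:
        return str(err)


# Assumed example values, not taken from the issue:
english_tag = "VBP"
french_tag = "VERB__Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin"

results = {tag: check(tag) for tag in (english_tag, french_tag)}
for tag, outcome in results.items():
    print(len(tag), outcome)
```

If this reading is right, any model whose tag_ values are 30 characters or longer would hit the same ValueError, regardless of the input text.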
shadiakiki1986 commented 3 years ago

bump

danielkorat commented 3 years ago

Hi @shadiakiki1986, I'm sorry, but this ABSA implementation was not built to support non-English languages.