PrithivirajDamodaran / Parrot_Paraphraser

A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Apache License 2.0

TypeError: Descriptors cannot not be created directly. #45

Closed by ct2034 1 year ago

ct2034 commented 1 year ago

When running the Quickstart example from the readme, I get:

Traceback (most recent call last):
  File "$HOME/src/try-parrot/demo.py", line 19, in <module>
    parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_gpu=False)
  File "$HOME/.local/lib/python3.8/site-packages/parrot/parrot.py", line 10, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(model_tag, use_auth_token=False)
  File "$HOME/.local/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 659, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "$HOME/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1801, in from_pretrained
    return cls._from_pretrained(
  File "$HOME/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1956, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "$HOME/.local/lib/python3.8/site-packages/transformers/models/t5/tokenization_t5_fast.py", line 133, in __init__
    super().__init__(
  File "$HOME/.local/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 114, in __init__
    fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
  File "$HOME/.local/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py", line 1162, in convert_slow_tokenizer
    return converter_class(transformer_tokenizer).converted()
  File "$HOME/.local/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py", line 438, in __init__
    from .utils import sentencepiece_model_pb2 as model_pb2
  File "$HOME/.local/lib/python3.8/site-packages/transformers/utils/sentencepiece_model_pb2.py", line 92, in <module>
    _descriptor.EnumValueDescriptor(
  File "$HOME/.local/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 755, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
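Workaround 2 from the error message can be tried without reinstalling anything. A minimal sketch, assuming the variable is set before anything imports google.protobuf (the implementation is chosen at import time); the commented-out lines are placeholders for the Quickstart code from the traceback, and, as the message warns, pure-Python parsing will be slower:

```python
import os

# Must run before transformers / parrot (and therefore google.protobuf) are imported.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

# Placeholder for the Quickstart code; uncomment in a real script:
# from parrot import Parrot
# parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_gpu=False)
```

Setting the same variable in the shell before launching Python should have the same effect.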

ct2034 commented 1 year ago

The first option worked for me: pip install protobuf==3.20.0
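To check whether an environment is affected before running the demo, something like the following might help. It is only a sketch: the helper names is_too_new and installed_protobuf_version are made up for illustration, and the (3, 20) cutoff is taken from the "3.20.x or lower" hint in the error message above.

```python
from importlib import metadata
from typing import Optional


def is_too_new(version: str) -> bool:
    """Return True if `version` is newer than the 3.20.x series."""
    # Compare only major.minor, e.g. "4.21.1" -> (4, 21).
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


def installed_protobuf_version() -> Optional[str]:
    """Return the installed protobuf version, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


found = installed_protobuf_version()
if found is None:
    print("protobuf is not installed")
elif is_too_new(found):
    print(f"protobuf {found} found; try: pip install protobuf==3.20.0")
else:
    print(f"protobuf {found} should be fine")
```

This only reads package metadata, so it does not import google.protobuf and cannot trigger the TypeError itself.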

PrithivirajDamodaran commented 1 year ago

I added a new demo notebook, and everything works fine there (see the link in the README).