PrithivirajDamodaran / Parrot_Paraphraser

A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Apache License 2.0
866 stars, 141 forks

added use auth token according to new hugging face #37

Open LEAGUEDORA opened 1 year ago

LEAGUEDORA commented 1 year ago

Since Hugging Face updated their API, you cannot access the Parrot models without an auth token. You may run into this error:

...is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'....

To solve this I added a new parameter called use_auth_token.

How to get a token from Hugging Face:
1) Open Hugging Face and register or log in with your credentials.
2) Navigate to the Token settings page and create an access token with write permission.
3) Copy the token and pass it as a parameter to the Parrot class when initializing it.

So the updated code will be

from parrot import Parrot
import torch
import warnings
warnings.filterwarnings("ignore")

''' 
uncomment to get reproducible paraphrase generations
def random_state(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

random_state(1234)
'''

#Init models (make sure you init ONLY once if you integrate this to your code)
parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_auth_token = "<Your Hugging Face token>")

phrases = ["Can you recommend some upscale restaurants in Newyork?",
           "What are the famous places we should not miss in Russia?"
]

for phrase in phrases:
  print("-"*100)
  print("Input_phrase: ", phrase)
  print("-"*100)
  para_phrases = parrot.augment(input_phrase=phrase, use_gpu=False)
  # augment may return None when no paraphrase is produced, so guard the loop
  for para_phrase in para_phrases or []:
    print(para_phrase)
chriszhuada commented 1 year ago

Hi there, any updates on when this PR will be merged? TY!

LEAGUEDORA commented 1 year ago

Hi there, any updates on when this PR will be merged? TY!

No. I don't know when it is going to happen.

PrithivirajDamodaran commented 1 year ago

I have rolled back the need for auth tokens, so you should be fine now. Try the demo code as-is.

chriszhuada commented 1 year ago

In my code, I have

    self.parrot = Parrot(
        use_gpu=torch.cuda.is_available(),
    )

and I get the error

    self.parrot = Parrot(
  File "/usr/local/lib/python3.9/dist-packages/parrot/parrot.py", line 10, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(model_tag, use_auth_token=True)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/auto/tokenization_auto.py", line 560, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/auto/tokenization_auto.py", line 412, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.9/dist-packages/transformers/utils/hub.py", line 409, in cached_file
    resolved_file = hf_hub_download(
  File "/usr/local/lib/python3.9/dist-packages/huggingface_hub/utils/_validators.py", line 124, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/huggingface_hub/file_download.py", line 1054, in hf_hub_download
    headers = build_hf_headers(
  File "/usr/local/lib/python3.9/dist-packages/huggingface_hub/utils/_validators.py", line 124, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/huggingface_hub/utils/_headers.py", line 117, in build_hf_headers
    token_to_send = get_token_to_send(token)
  File "/usr/local/lib/python3.9/dist-packages/huggingface_hub/utils/_headers.py", line 149, in get_token_to_send
    raise EnvironmentError(
OSError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
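
The last frames explain the failure: `parrot.py` passes `use_auth_token=True`, which asks `huggingface_hub` for a stored login token, and none was found on this machine. A minimal sketch of one way around it, assuming the token (if any) is kept in an environment variable named `HUGGINGFACE_TOKEN`. The helper names `auth_kwargs` and `load_tokenizer` are hypothetical and not part of Parrot:

```python
import os


def auth_kwargs(token):
    """Build keyword arguments for from_pretrained from an optional token.

    Returns an empty dict when no token is given, so the call does not
    demand a stored login the way use_auth_token=True does.
    """
    token = (token or "").strip()
    return {"use_auth_token": token} if token else {}


def load_tokenizer(model_tag="prithivida/parrot_paraphraser_on_T5"):
    """Load the tokenizer, passing a token only if one is configured."""
    # Imported lazily so auth_kwargs can be used without transformers installed.
    from transformers import AutoTokenizer

    # HUGGINGFACE_TOKEN is an assumed variable name, not one Parrot reads itself.
    token = os.environ.get("HUGGINGFACE_TOKEN")
    return AutoTokenizer.from_pretrained(model_tag, **auth_kwargs(token))
```

Recent transformers releases deprecate `use_auth_token` in favor of `token`, so the keyword may need adjusting for your installed version.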
chriszhuada commented 1 year ago

Not sure if it matters, but I'm using Poetry for my project, with this dependency entry:

parrot = { git = "https://github.com/PrithivirajDamodaran/Parrot_Paraphraser.git", branch = "main" }

chriszhuada commented 1 year ago

I was able to get it to work by adding

from huggingface_hub import login

login(token=HUGGINGFACE_TOKEN)

to the file that imports Parrot!

ucas010 commented 1 year ago

NameError: name 'HUGGINGFACE_TOKEN' is not defined

LEAGUEDORA commented 1 year ago

NameError: name 'HUGGINGFACE_TOKEN' is not defined

Define a constant, like this:

HUGGINGFACE_TOKEN = 'YOUR_TOKEN'
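
If you would rather not hard-code the token in source, here is a minimal sketch that reads it from the environment and fails with a clear message when it is missing. It assumes an environment variable named `HUGGINGFACE_TOKEN`; the helpers `read_hf_token` and `login_from_env` are hypothetical names, and only `login(token=...)` comes from `huggingface_hub`:

```python
import os


def read_hf_token(env=None, name="HUGGINGFACE_TOKEN"):
    """Return the token stored under `name`, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {name} environment variable to your Hugging Face access token."
        )
    return token


def login_from_env():
    """Log in to Hugging Face once, before any Parrot object is created."""
    # Imported lazily so read_hf_token works even without huggingface_hub installed.
    from huggingface_hub import login

    login(token=read_hf_token())
```

Call `login_from_env()` once in the file that imports Parrot, before constructing `Parrot(...)`.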