CodedotAl / gpt-code-clippy

Full description can be found here: https://discuss.huggingface.co/t/pretrain-gpt-neo-for-open-source-github-copilot-model/7678?u=ncoop57
Apache License 2.0

Huggingface example #79

Closed Penguin-jpg closed 2 years ago

Penguin-jpg commented 2 years ago

Hello, I found this amazing repository today. I tried to run the example found on Hugging Face in Google Colab, but it didn't output anything except "Setting pad_token_id to eos_token_id:50256 for open-end generation.". I would like to know if I did anything wrong. Thanks! (Sorry for my poor English.)

This is the example: image

This is the code I ran on Colab (I changed the variable device to "cpu"): image

arampacha commented 2 years ago

Hi, this is just an informational message, nothing to worry about. The generate method uses pad_token_id to "understand" that the output is complete and stop generating. If it is not defined for the tokenizer, eos_token_id is used instead. You should be able to avoid this notification either by setting it manually:

tokenizer.pad_token_id = tokenizer.eos_token_id

or by passing it to generate call:

model.generate(..., pad_token_id=tokenizer.eos_token_id)
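
To make the fallback described above concrete, here is a minimal, library-free sketch. The helper `resolve_pad_token_id` is hypothetical and is not the actual transformers implementation. It also assumes the message is raised through Python's `warnings` module purely to keep the example self-contained. It only illustrates the idea that an undefined pad_token_id falls back to eos_token_id with a notice, while an explicit value is used silently.

```python
import warnings


def resolve_pad_token_id(pad_token_id, eos_token_id):
    """Hypothetical helper: pick the pad token id the way the reply describes."""
    if pad_token_id is None and eos_token_id is not None:
        # Nothing was set explicitly, so fall back to eos_token_id and say so.
        warnings.warn(
            f"Setting pad_token_id to eos_token_id:{eos_token_id} "
            "for open-end generation."
        )
        return eos_token_id
    # An explicit pad_token_id is used as-is, with no notification.
    return pad_token_id


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = resolve_pad_token_id(None, 50256)   # falls back and notifies
    explicit = resolve_pad_token_id(50256, 50256)  # explicit value, no notice

print(implicit, explicit, len(caught))
```

In this sketch both calls end up with the same id. The only difference is whether the notice is emitted, which is why passing pad_token_id explicitly is enough to silence it.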