mgrankin / ru_transformers

Apache License 2.0

Beginner question about the models #15

Closed SugariuClaudiu closed 4 years ago

SugariuClaudiu commented 4 years ago

Hello and thank you for the great work.

This might be a rather stupid question, but I'm just beginning with NLP. Apologies in advance. Could you give me a brief introduction to the files generated in the models directory?

I am asking because, when I try to load the head model class and the tokenizer class, I get the following error: "We assumed '../russian_models/gpt2/m_checkpoint-3364613' was a path or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url."

Of course, those files are not present there, but I'm not sure where to start at the moment. And a second question, if possible: are you licensing this project under MIT by any chance?

Thank you in advance.

mgrankin commented 4 years ago

Hello, how did you get this awesome error message, and what are you trying to achieve?

SugariuClaudiu commented 4 years ago

I'm trying to generate text based on a prompt input.

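import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer  # assumes the Hugging Face transformers package

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # use a GPU when one is available
MODEL_CLASSES = {'gpt2': (GPT2LMHeadModel, GPT2Tokenizer)}
model_class, tokenizer_class = MODEL_CLASSES['gpt2']
tokenizer = tokenizer_class.from_pretrained('gpt2')
model = model_class.from_pretrained('gpt2')
model.to(device)
model.eval()  # inference mode: disables dropout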

And then sample a sequence. I've used this as inspiration: https://github.com/gabrielelanaro/ml-prototypes/blob/master/prototypes/styletransfer/huggingface/huggingface.py

mgrankin commented 4 years ago

I didn't use the default tokeniser, but your code is the default code with the default tokeniser, and that is what causes the error.

To understand how to generate text you should start by looking at rest.py.
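The error quoted earlier in the thread names two files the default GPT-2 tokeniser expects, vocab.json and merges.txt. As a minimal sketch (the function name missing_vocab_files is hypothetical and not part of this repository), one could check a checkpoint directory for those files before calling from_pretrained on it, which makes it obvious when a model was trained with a different tokeniser:

```python
from pathlib import Path

# File names taken from the error message quoted in this thread.
REQUIRED_VOCAB_FILES = ("vocab.json", "merges.txt")


def missing_vocab_files(checkpoint_dir, required=REQUIRED_VOCAB_FILES):
    """Return the required file names that are absent from checkpoint_dir.

    Hypothetical helper for illustration only; it just inspects the
    directory and does not load any model or tokeniser.
    """
    directory = Path(checkpoint_dir)
    return [name for name in required if not (directory / name).is_file()]


if __name__ == "__main__":
    import sys

    # Directory to inspect; defaults to the current directory.
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_vocab_files(path)
    if missing:
        print(f"{path} lacks default GPT-2 tokeniser files: {missing}")
    else:
        print(f"{path} contains the default GPT-2 tokeniser files")
```

If the list is non-empty, the default GPT2Tokenizer cannot be loaded from that directory, and the tokeniser the model was actually trained with has to be loaded instead, as rest.py presumably does.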

SugariuClaudiu commented 4 years ago

Thank you for this. I had a look. However, this approach is too computationally heavy for me. I am trying to run it on an EC2 instance in a reasonable amount of time so that I can serve it as an API call.

Any suggestions? And, again, the question about the license: under what terms can I use your models?

mgrankin commented 4 years ago

Have you tried to run it on a GPU instance?

Apache 2.0
