dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

ERROR:pytorch_pretrained_bert.modeling:Model name 'bert-large-uncased' was not found in model name list #9

Closed Arvedek closed 4 years ago

Arvedek commented 5 years ago

Hmm... any idea how to solve this? The full error is here:

ERROR:pytorch_pretrained_bert.modeling:Model name 'bert-large-uncased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased.tar.gz' was a path or url but couldn't find any file associated to this path or url.
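The error message hints at how the old `pytorch_pretrained_bert` package resolves a model name: known shortcut names map to S3 archive URLs, and anything else is treated as a local path or URL. A minimal stdlib-only sketch of that resolution logic (the map entries below are taken from the error message above; the function name is illustrative, not the library's actual API):

```python
# Sketch of pytorch_pretrained_bert's model-name resolution (illustrative).
# Known shortcut names map to hosted archive URLs; any other string is
# assumed to be a local path or URL the user supplies themselves.
PRETRAINED_MODEL_ARCHIVE_MAP = {
    "bert-base-uncased": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz",
    "bert-large-uncased": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased.tar.gz",
}

def resolve_model(name_or_path):
    """Return ('url', ...) for a known shortcut name, else ('path', ...)."""
    if name_or_path in PRETRAINED_MODEL_ARCHIVE_MAP:
        return ("url", PRETRAINED_MODEL_ARCHIVE_MAP[name_or_path])
    # Fallback: assume the caller passed a locally downloaded model.
    return ("path", name_or_path)
```

So if the S3 download fails (as in this issue), passing a path to a manually downloaded archive instead of the shortcut name takes the second branch and skips the download entirely.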

Arvedek commented 5 years ago

Even after changing all the models to 'bert-base-uncased', I am still getting this error.

dmmiller612 commented 5 years ago

This might be due to a change on the huggingface side. I will look at updating the API tonight.

Arvedek commented 5 years ago

I guess it was an internet issue... so I downloaded the model myself and got rid of that error. But then I hit another one: IndexError: index -2 is out of bounds for dimension 0 with size 1. Changing the hidden value to 2 makes it work, though I'm not sure why. Btw, thanks for the reply.
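The IndexError above is consistent with indexing the stack of encoder hidden states at position -2 when the loaded model only returns a single layer. A plain-Python analogue (no torch; the layer counts and helper name are illustrative, and a Python list raises "list index out of range" rather than torch's dimension message):

```python
# Plain-Python analogue of indexing the second-to-last hidden-state layer.
def second_to_last_layer(hidden_states):
    return hidden_states[-2]

# A full BERT-base forward pass exposes 13 layers (12 encoder layers
# plus the embedding layer), so index -2 is valid.
all_layers = [[0.0] * 4 for _ in range(13)]
picked = second_to_last_layer(all_layers)

# But if the model only returns the final layer (a stack of size 1),
# index -2 is out of bounds, mirroring the IndexError in this thread.
final_only = [[0.0] * 4]
try:
    second_to_last_layer(final_only)
except IndexError:
    pass  # IndexError: list index out of range
```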

dmmiller612 commented 4 years ago

This should be resolved in the newest PyPI version.

FBruzzesi commented 4 years ago

@dmmiller612 Using transformers 2.8.0, spacy 2.2.3, and neuralcoref 4.0.0, I am getting the IndexError above when using a downloaded model, and I cannot manage to work around it. Could it be a version issue?

dmmiller612 commented 4 years ago

What version of bert-extractive-summarizer are you using?

FBruzzesi commented 4 years ago

I am using version 0.4.2.

dmmiller612 commented 4 years ago

So the original issue above seems to be resolved now. As for your model @FBruzzesi, do you have a link to the model you are trying to use? Long story short, the current implementation takes the second-to-last decoding layer as the embedding output. If the model flattens the decoding layers, it would throw an error there.
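Given the maintainer's explanation, a defensive version of the layer selection would fall back to whatever layer is available instead of raising. A minimal sketch (the function name and fallback policy are assumptions, not the library's actual code):

```python
# Defensive layer selection: use the requested hidden layer when the model
# exposes enough layers, otherwise fall back to the last available one
# (e.g. when the model flattens its output to a single decoding layer).
def pick_embedding_layer(hidden_states, hidden=-2):
    if len(hidden_states) < abs(hidden):
        return hidden_states[-1]  # flattened output: only one layer to use
    return hidden_states[hidden]
```

Usage: with a full 13-layer stack this returns layer -2 as the current implementation does; with a single flattened layer it returns that layer rather than raising the IndexError seen in this thread.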