dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License
1.38k stars 305 forks

Is there an offline way to download summarizer? #19

Open zysNLP opened 4 years ago

zysNLP commented 4 years ago

When I execute `from summarizer import Summarizer`, the download is too slow. Could you provide a URL so I can download this content offline and save it somewhere?

zysNLP commented 4 years ago

After `from summarizer import Summarizer` finally finished, the next step, `model = Summarizer()`, got stuck as well. This part really could not download. I want to give up...

dmmiller612 commented 4 years ago

You will have to download it at least once from the Hugging Face S3 repo. Once it is downloaded, it is cached and can be used offline. You can also download/train it separately and pass it in as a pretrained model.
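The "use it as a pretrained model" route can be sketched like this, assuming you have already saved the weights and tokenizer to a local directory (e.g. via `save_pretrained` on a machine with internet access) and that you pass them in through `Summarizer`'s `custom_model`/`custom_tokenizer` parameters. The directory path below is a placeholder:

```python
def load_offline_summarizer(model_dir):
    """Build a Summarizer from a local directory of pretrained files,
    so no network access is needed at runtime."""
    # Imports kept inside the function so the sketch can be read
    # without the (large) dependencies installed.
    from transformers import AutoConfig, AutoModel, AutoTokenizer
    from summarizer import Summarizer

    config = AutoConfig.from_pretrained(model_dir)
    config.output_hidden_states = True  # the summarizer reads hidden states
    model = AutoModel.from_pretrained(model_dir, config=config)
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    return Summarizer(custom_model=model, custom_tokenizer=tokenizer)
```

Usage would then be `summarizer = load_offline_summarizer("/path/to/bert-large-uncased")` followed by `summarizer(text)`.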

zysNLP commented 4 years ago

But how to download it separately?

lapplislazuli commented 4 years ago

@zysNLP It seems to be using "bert-large-uncased", so maybe you can just put your pre-downloaded version at the location where the automatically downloaded one ends up?

The download requests https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin, which I can also download manually.

zysNLP commented 4 years ago

@Twonki Thank you very much, but where should I put the downloaded files?

lapplislazuli commented 4 years ago

@zysNLP I found some cached torch files under C:\Users\XXX\.cache\torch\transformers

Their names are rather cryptic hashes, but one is about 1 GB, which is about right for BERT, and its timestamp matches. In total there are 6 hash-named files. I'm not using torch for anything else on this machine.

The big one is named: '54da47087cc86ce75324e4dc9bbb5f66c6e83a7c6bd23baea8b489acc8d09aa4.4d5343a4b979c4beeaadef17a0453d1bb183dd9b084f58b84c7cc781df343ae6'. If you want, I can list the others too.
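For what it's worth, those two dot-separated hex parts look like the scheme older versions of `transformers` use for cache filenames: the SHA-256 of the download URL, followed by a dot and the SHA-256 of the HTTP ETag. A small sketch under that assumption (the `etag` value below is made up; the real one comes from the server's response headers):

```python
import hashlib


def cache_filename(url, etag=None):
    """Reproduce the hash-style cache filename scheme (assumed):
    sha256(url), optionally followed by '.' + sha256(etag)."""
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if etag is not None:
        name += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return name


url = "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin"
# Hypothetical ETag for illustration only:
name = cache_filename(url, etag='"0123456789abcdef"')
```

If the scheme holds, you could download the file manually and rename it to the hash the library expects, but checking the library's own `file_utils` source for your installed version is safer.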

lapplislazuli commented 4 years ago

@zysNLP I'm not 100% sure, but I think it's just the normal spacy install of the model.

You could maybe try the spacy documentation on manual installation.

@dmmiller612 can you give a short comment on torch vs. spacy, or say whether I'm pointing in completely the wrong direction?

zysNLP commented 4 years ago

@Twonki Thank you! But I use Ubuntu, and I found nothing like .cache/torch/transformers. Maybe downloading via the command is fine after all, even though it takes a lot of time.

hohlb commented 3 years ago

You can change the cache directory by setting the TRANSFORMERS_CACHE or TORCH_HOME environment variables to the path you want.
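A minimal sketch of that approach: set the variables to a directory you control *before* importing the libraries, since they typically read these variables at import time. The directory name below is arbitrary:

```python
import os
import tempfile

# Pick a cache directory you control (placeholder path for illustration).
cache_dir = os.path.join(tempfile.gettempdir(), "my_transformers_cache")

# Must be set before `import transformers` / `from summarizer import Summarizer`.
os.environ["TRANSFORMERS_CACHE"] = cache_dir
os.environ["TORCH_HOME"] = cache_dir
os.makedirs(cache_dir, exist_ok=True)

# Subsequent imports will download into (or read from) cache_dir,
# so you can pre-populate it on a machine with internet access
# and copy it to the offline machine.
```

Which variable is honored depends on the versions of transformers/torch installed, so checking both is the safe option.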