dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

It takes long for the summary to be created #16

Closed (hdatteln closed this issue 4 years ago)

hdatteln commented 4 years ago

Hi, when I try the out-of-the-box summarizer on an article of about 7,000 words, it takes over 4 minutes to get a summary. (I am running the example on my Mac, with no GPU.) I just want to check whether that's normal.

Thanks, Heidi

dmmiller612 commented 4 years ago

It could be a few things. When you first use the worker, a download is issued for the pretrained model. Depending on network speed, that can take a few minutes. The good news is that once it is downloaded, it should be cached on your system.

The next thing could be the model you select. DistilBERT runs much faster than BERT-large-uncased because it has fewer parameters.

GPU is obviously preferable with the size of these models, but can be costly.

hdatteln commented 4 years ago

Thank you, Derek. The download is not the issue here (the slowdown happens on every run, not only the first). I tried DistilBERT with model = Summarizer(model='distilbert-base-uncased'). It is indeed much faster: CPU times: user 1min 24s, sys: 13.5 s, total: 1min 37s
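
For anyone who wants to repeat this comparison outside a notebook, here is a minimal sketch of how the timing could be measured. Only the Summarizer(model='distilbert-base-uncased') call comes from this thread; time_summary, build_summarizer, and demo are hypothetical helper names made up for illustration. The sketch assumes the summarizer object can be called directly on the article text, and it reports wall-clock time, which is not the same thing as the CPU times quoted above.

```python
import time
from typing import Callable, Tuple


def time_summary(summarize: Callable[[str], str], text: str) -> Tuple[str, float]:
    """Run summarize(text) once and return (summary, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    summary = summarize(text)
    elapsed = time.perf_counter() - start
    return summary, elapsed


def build_summarizer(model_name: str = "distilbert-base-uncased"):
    """Hypothetical helper: create the summarizer once so it can be reused.

    The import is done lazily so that time_summary can be used (and tested)
    even where the bert-extractive-summarizer package is not installed.
    """
    from summarizer import Summarizer  # provided by bert-extractive-summarizer

    return Summarizer(model=model_name)


def demo(article: str) -> None:
    """Hypothetical usage: time one summary of `article` with the smaller model."""
    model = build_summarizer()  # model loading is deliberately left out of the timing
    summary, seconds = time_summary(model, article)
    print(f"Summary length: {len(summary)} characters, took {seconds:.1f} s")
```

Keeping the model creation out of the timed section separates the one-time loading cost from the per-article cost, which is the distinction discussed earlier in the thread.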