dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

Yep, underneath, this uses the hugging face transformers library. So you will have access to all of the pretrained models there. #56

Closed: pratikghanwat7 closed this issue 4 years ago

pratikghanwat7 commented 4 years ago

Yep, underneath, this uses the hugging face transformers library. So you will have access to all of the pretrained models there.

from summarizer import Summarizer
from transformers import DistilBertModel, DistilBertTokenizer

# Load a multilingual DistilBERT model and its matching tokenizer
d_tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')

# Pass both into the summarizer in place of the default BERT model
model = Summarizer(custom_model=d_model, custom_tokenizer=d_tokenizer)

Originally posted by @dmmiller612 in https://github.com/dmmiller612/bert-extractive-summarizer/issues/54#issuecomment-632690548

pratikghanwat7 commented 4 years ago

When I tried to feed text into this multilingual model for a summary, I ended up with errors.

Code:

result = model(body, ratio=0.2)  # same call as in the traceback below
full = ''.join(result)
print(full)

Errors:

ValueError                                Traceback (most recent call last)
<ipython-input-59-19e45898d086> in <module>()
     24 Once the competitor could rise no higher, the spire of the Chrysler building was raised into view, giving it the title.
     25 '''
---> 26 result = model(body,ratio=0.2)
     27 full = ''.join(result)
     28 print(full)

/usr/local/lib/python3.6/dist-packages/summarizer/bert_parent.py in extract_embeddings(self, text, hidden, reduce_option)
     82 
     83         tokens_tensor = self.tokenize_input(text)
---> 84         pooled, hidden_states = self.model(tokens_tensor)[-2:]
     85 
     86         if -1 > hidden > -12:

ValueError: not enough values to unpack (expected 2, got 1)
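
For context on why this fails: with the transformers versions current when this issue was filed (which returned plain tuples), DistilBertModel yields a 1-tuple containing only the last hidden state unless output_hidden_states=True is set, so the summarizer's self.model(tokens_tensor)[-2:] has only one element and the two-way unpacking raises ValueError. A minimal sketch reproducing this, assuming a tuple-returning transformers version:

import torch
from transformers import DistilBertModel, DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')

# Without output_hidden_states=True the model returns only the last
# hidden state, so slicing off the final two outputs yields one value.
tokens_tensor = torch.tensor([tokenizer.encode('Hello world')])
outputs = model(tokens_tensor)
print(len(outputs))  # 1 -> "not enough values to unpack (expected 2, got 1)"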

pratikghanwat7 commented 4 years ago

Adding one more parameter to d_model resolved the issue:

d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased', output_hidden_states=True)
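
Putting the pieces from this thread together, a complete working snippet would look like the sketch below (here `body` stands in for the text you want summarized):

from summarizer import Summarizer
from transformers import DistilBertModel, DistilBertTokenizer

d_tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
# output_hidden_states=True makes the model return every layer's hidden
# states, which the summarizer's embedding extraction relies on
d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased', output_hidden_states=True)

model = Summarizer(custom_model=d_model, custom_tokenizer=d_tokenizer)

result = model(body, ratio=0.2)  # keep roughly 20% of the sentences
full = ''.join(result)
print(full)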