dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

Yep, underneath, this uses the hugging face transformers library. So you will have access to all of the pretrained models there. #56

Closed: pratikghanwat7 closed this issue 4 years ago

pratikghanwat7 commented 4 years ago

Yep, underneath, this uses the hugging face transformers library. So you will have access to all of the pretrained models there.

from summarizer import Summarizer
from transformers import DistilBertModel, DistilBertTokenizer

# Load a multilingual DistilBERT model and its matching tokenizer
d_tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')

# Pass both into the summarizer in place of the default BERT model
model = Summarizer(custom_model=d_model, custom_tokenizer=d_tokenizer)

Originally posted by @dmmiller612 in https://github.com/dmmiller612/bert-extractive-summarizer/issues/54#issuecomment-632690548

pratikghanwat7 commented 4 years ago

When I tried to feed text into this multilingual model for a summary, I ended up with errors.

Code:

result = model(body, ratio=0.2)  # same call as in the traceback below
full = ''.join(result)
print(full)

Errors:

ValueError                                Traceback (most recent call last)
<ipython-input-59-19e45898d086> in <module>()
     24 Once the competitor could rise no higher, the spire of the Chrysler building was raised into view, giving it the title.
     25 '''
---> 26 result = model(body,ratio=0.2)
     27 full = ''.join(result)
     28 print(full)

/usr/local/lib/python3.6/dist-packages/summarizer/bert_parent.py in extract_embeddings(self, text, hidden, reduce_option)
     82 
     83         tokens_tensor = self.tokenize_input(text)
---> 84         pooled, hidden_states = self.model(tokens_tensor)[-2:]
     85 
     86         if -1 > hidden > -12:

ValueError: not enough values to unpack (expected 2, got 1)
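
For context on why this fails: with the transformers versions current when this issue was filed (which returned plain tuples), DistilBertModel yields a 1-tuple containing only the last hidden state unless output_hidden_states=True is set, so the summarizer's self.model(tokens_tensor)[-2:] has only one element and the two-way unpacking raises ValueError. A minimal sketch reproducing this, assuming a tuple-returning transformers version:

import torch
from transformers import DistilBertModel, DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased')

# Without output_hidden_states=True the model returns only the last
# hidden state, so slicing off the final two outputs yields one value.
tokens_tensor = torch.tensor([tokenizer.encode('Hello world')])
outputs = model(tokens_tensor)
print(len(outputs))  # 1 -> "not enough values to unpack (expected 2, got 1)"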

pratikghanwat7 commented 4 years ago

Adding one more parameter to d_model resolved the issue:

d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased', output_hidden_states=True)
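
Putting the pieces from this thread together, a complete working snippet would look like the sketch below (here `body` stands in for the text you want summarized):

from summarizer import Summarizer
from transformers import DistilBertModel, DistilBertTokenizer

d_tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
# output_hidden_states=True makes the model return every layer's hidden
# states, which the summarizer's embedding extraction relies on
d_model = DistilBertModel.from_pretrained('distilbert-base-multilingual-cased', output_hidden_states=True)

model = Summarizer(custom_model=d_model, custom_tokenizer=d_tokenizer)

result = model(body, ratio=0.2)  # keep roughly 20% of the sentences
full = ''.join(result)
print(full)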