dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

Empty summary on Chinese texts #108

Closed by tuzcsap 3 years ago

tuzcsap commented 3 years ago

Hello! I'm trying to apply this model to Chinese texts. I followed the steps from issue:

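# Switch the sentence splitter's spaCy language to Chinese
from spacy.lang.zh import Chinese
def __init__(self, language=Chinese):
    ...

# Load Chinese BERT with hidden states exposed
from transformers import AutoConfig, AutoModel, AutoTokenizer

custom_config = AutoConfig.from_pretrained('bert-base-chinese')
custom_config.output_hidden_states = True
custom_tokenizer = AutoTokenizer.from_pretrained('bert-base-chinese')
custom_model = AutoModel.from_pretrained('bert-base-chinese', config=custom_config)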

but the model generates an empty string as output.

Could you please tell me which steps I missed?

Daibao1120 commented 3 years ago

This might help. https://geek.digiasset.org/pages/nlp/nlpinfo/bert-text-summarizer-chinese/

tuzcsap commented 3 years ago

@Daibao1120 Thank you for this link! The code worked fine in a Colab environment, but one additional prerequisite was required: pip install sentencepiece
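
For anyone else who gets an empty summary on Chinese text: one possible cause, not confirmed in this thread, is that sentences are dropped by a character-length filter before they reach the model. As far as I know, the summarizer's `min_length` parameter defaults to 40 characters, and Chinese sentences are often shorter than that. The sketch below only illustrates that effect using the standard library. `split_sentences` and `filter_by_length` are hypothetical helpers written for this example, not the library's code, and the sample text is made up.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Hypothetical helper: split after Chinese sentence-ending punctuation."""
    parts = re.split(r"(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]


def filter_by_length(sentences: List[str], min_length: int = 40,
                     max_length: int = 600) -> List[str]:
    """Hypothetical stand-in for a character-length filter on sentences."""
    return [s for s in sentences if min_length < len(s) < max_length]


text = "今天天气很好。我们去公园散步吧！你觉得怎么样？"
sentences = split_sentences(text)

# With a 40-character minimum (assumed default), every short sentence is dropped.
kept_default = filter_by_length(sentences)

# With a smaller minimum, the same sentences survive the filter.
kept_relaxed = filter_by_length(sentences, min_length=3)

print(len(sentences), len(kept_default), len(kept_relaxed))
```

If this turns out to be the cause, passing a smaller `min_length` when calling the summarizer (for example `model(text, min_length=5)`, where the value 5 is only illustrative) might be worth trying.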