dmmiller612 / bert-extractive-summarizer

Easy to use extractive text summarization with BERT
MIT License

How do you calculate the number of cluster for KMeans? #53

Closed bignyap closed 4 years ago

bignyap commented 4 years ago

I can see that the number of clusters is defined like this: `k = 1 if ratio * len(self.features) < 1 else int(len(self.features) * ratio)`

Can anyone explain why? What is the role of ratio here?

dmmiller612 commented 4 years ago

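The ratio is supposed to be the fraction of sentences from the body of text that you want to keep in the summary. Multiplying it by the number of sentences gives the number of clusters, with a minimum of 1 so that very short texts still get one cluster. The motivation came from the gensim summarization package, which also uses ratios.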
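
To make the arithmetic concrete, here is a minimal sketch of the same expression as a standalone function. The name `num_clusters` and its parameters are hypothetical and are not part of the library; `n_sentences` stands in for `len(self.features)`.

```python
def num_clusters(n_sentences: int, ratio: float) -> int:
    """Hypothetical helper mirroring the expression quoted in the question.

    n_sentences stands in for len(self.features); ratio is the fraction
    of sentences to keep in the summary.
    """
    # If the product is below 1, fall back to a single cluster.
    if ratio * n_sentences < 1:
        return 1
    # Otherwise truncate the product toward zero.
    return int(n_sentences * ratio)


if __name__ == "__main__":
    for n, r in [(10, 0.2), (3, 0.2), (100, 0.25)]:
        print(f"{n} sentences, ratio {r}: k = {num_clusters(n, r)}")
```

With 10 sentences and a ratio of 0.2 this gives k = 2. With 3 sentences and the same ratio the product is below 1, so k falls back to 1.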