code-kern-ai / bricks

Open-source natural language enrichments at your fingertips.
Apache License 2.0
448 stars 22 forks

[MODULE] - Text summarization #183

Open jhoetter opened 1 year ago

jhoetter commented 1 year ago

Please describe the module you would like to add to bricks: It would be super helpful to summarize the core messages of a longer text (e.g. in bullet points).

Do you already have an implementation? No, but I think Hugging Face could offer something for that, both as open-source models and as hosted endpoints.

Additional context -

LeonardPuettmann commented 1 year ago

Hugging Face has a summarization pipeline that's easy to use. I've already used it in this notebook and could implement it as a bricks module: https://github.com/LeonardPuettmann/text-summarizer/blob/master/summarizer.ipynb
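For reference, a minimal sketch of what calling the Hugging Face summarization pipeline could look like. This is not the code from the linked notebook; the helper names `build_summarizer` and `summarize` and the length defaults are assumptions made for illustration, and it assumes a `transformers` version that ships the `"summarization"` pipeline task.

```python
from typing import Callable, Optional


def build_summarizer(model_name: Optional[str] = None) -> Callable:
    """Create a Hugging Face summarization pipeline (hypothetical helper).

    The import is done lazily so that the rest of this module can be used
    (and tested) without transformers installed or a model downloaded.
    """
    from transformers import pipeline

    if model_name is None:
        # Let transformers pick its default summarization model.
        return pipeline("summarization")
    return pipeline("summarization", model=model_name)


def summarize(
    text: str,
    summarizer: Callable,
    max_length: int = 60,
    min_length: int = 10,
) -> str:
    """Summarize `text` with a pipeline-like callable.

    `summarizer` is expected to return a list of dicts that contain a
    "summary_text" key, which is the output shape of the summarization
    pipeline. The length limits are illustrative defaults, not tuned values.
    """
    if not text.strip():
        return ""
    result = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        do_sample=False,
    )
    return result[0]["summary_text"].strip()


if __name__ == "__main__":
    # Downloads a model on first run, so this only executes when the file
    # is run directly.
    long_text = (
        "Bricks is an open-source collection of natural language enrichment "
        "modules. A summarization module would condense a longer text into "
        "its core messages so that annotators can review documents faster."
    )
    print(summarize(long_text, build_summarizer()))
```

Passing the pipeline in as an argument keeps model loading separate from the summarization call, so the model is created once and reused across records.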

divyanshukatiyar commented 1 year ago

On a smaller scale, even spaCy can provide good summarizations. I just built a function that generates the summary using spaCy. It makes use of the heap queue algorithm (heaps are binary trees in which every parent node has a value less than or equal to any of its children... doesn't matter haha XD). We can discuss this together, @LeonardPuettmann :)
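A rough sketch of how a frequency-based extractive summary with a heap queue might work. This is not the function mentioned above: to keep it self-contained, sentence splitting and stop-word filtering use plain regular expressions and a tiny hand-written stop-word set as stand-ins for what spaCy would provide, and the name `summarize_extractive` is made up for illustration.

```python
import heapq
import re
from collections import Counter

# Tiny illustrative stop-word list; spaCy ships a far more complete one.
STOPWORDS = {
    "a", "an", "and", "are", "at", "in", "is", "it", "of", "on",
    "that", "the", "to", "was", "were", "with",
}


def split_sentences(text: str) -> list:
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text: str) -> list:
    """Lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def summarize_extractive(text: str, n_sentences: int = 2) -> str:
    """Return the n highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    words = [w for w in tokenize(text) if w not in STOPWORDS]
    if not sentences or not words:
        return ""

    # Normalise word counts by the count of the most frequent word.
    freq = Counter(words)
    max_freq = max(freq.values())

    # A sentence's score is the sum of its normalised word frequencies.
    scores = {}
    for idx, sentence in enumerate(sentences):
        scores[idx] = sum(
            freq[t] / max_freq for t in tokenize(sentence) if t in freq
        )

    # heapq.nlargest selects the top-scoring sentence indices using a heap.
    top = heapq.nlargest(n_sentences, scores, key=scores.get)
    return " ".join(sentences[i] for i in sorted(top))


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats hunt mice at night. The weather was mild."
    print(summarize_extractive(sample, n_sentences=1))
```

Sorting the selected indices before joining keeps the chosen sentences in document order, which usually reads better than ordering them by score.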

SvenjaKern commented 1 year ago

I found a small issue: common: NameError: name 'smalltalk_truncation' is not defined (change to text_summarization)