manisnesan / fastchai

Repository capturing deep learning & nlp experiments using fastai & pytorch
Apache License 2.0
2 stars 0 forks source link

Meeting Summarization Use Case #76

Open manisnesan opened 7 months ago

manisnesan commented 7 months ago
          [From rasbt post](https://x.com/rasbt/status/1754516687896887449?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ) - Flan T5 is a great go to model for text classification. 

Tiny titans - Can smaller LLM models punch above their weight for meeting summarization

Originally posted by @manisnesan in https://github.com/manisnesan/fastchai/issues/47#issuecomment-1928762586

Questions

manisnesan commented 7 months ago

Meeting Summarization

Meeting summarization is the process of creating a concise overview of the key points, decisions, and action items discussed during a meeting[1]. It serves to keep stakeholders informed, facilitate decision-making, encourage accountability, and enhance communication[1].

There are several proven ways to summarize a meeting effectively:

  1. Take concise notes during the meeting, focusing on the most important information[1].

  2. Use a clear and organized format in the summary, such as including the date, time, location, attendees, agenda items, discussion points, decisions, action items, and next steps[1].

  3. Follow and fill out the meeting agenda when creating the summary notes[1].

  4. Summarize the meeting over email to all participants after the fact[1].

  5. Use AI tools to automatically generate meeting summaries from transcripts[1][2].

Challenges in meeting summarization include the difficulty of collecting confidential meeting data, the labor-intensive process of annotating summaries, and the need to capture key issues while excluding irrelevant discussions[4][5]. Recent research has focused on creating benchmark datasets[3][4][5] and developing advanced summarization models[2][3].

In summary, meeting summarization is a crucial skill for keeping teams aligned and productive, with various manual and automated techniques available to create high-quality summaries efficiently.

Citations: [1] https://fireflies.ai/blog/summarize-a-meeting [2] https://github.com/topics/meeting-summarization [3] https://paperswithcode.com/task/meeting-summarization [4] https://arxiv.org/abs/2305.17529 [5] https://aclanthology.org/2023.acl-long.906.pdf

manisnesan commented 7 months ago

Diverse Summarization Dataset

From Pegasus - Paper

news_email_bills_science_tech

manisnesan commented 7 months ago

From Abstractive Meeting Summarization

Customer Service Calls could be multi-party conversation but only two party speak in a given time span. Also the format of the meeting in customer service is problem solving in nature.

Eg: Customer Rep - Agent 1 ---> Customer Rep - Agent 2 ----> Customer Rep -- Agent 3

Related: Abstractive Dialogue summarization, Abstractive Text Summarization, Meeting Summariziation, text Generation

Stages in abstractive

manisnesan commented 7 months ago

Differences from traditional summarization

manisnesan commented 7 months ago

From Call Summarization: why it is important and what it is possible today and in a near future

"AUTOMATIC SUMMARIZATION OF CALL-CENTER CONVERSATION" by E. Stepanov, B. Favre, F. Alam, S. Chowdhury, K. Singla, J. Trione, F. Be ́chet, G. Riccardi. offers a hybrid approach using both extractive/abstractive.

See

manisnesan commented 7 months ago

From Generating Abstractive Summaries from Meeting Transcripts

image

manisnesan commented 7 months ago

Challenges involved

Nature of meeting-style speech :

Preference for abstractive summarization

Heterogeneous meeting formats

Subjectivity

manisnesan commented 6 months ago

See the example case study from Orca paper on Meeting Transcript processing

Example from the paper

System

You are a teacher. Given a task, you explain in simple steps what the task is asking, any guidelines it provides, and how to use those guidelines it provides to find the answer.

User

You will read a meeting transcript, then extract the relevant segments to answer the following question

Question: How does Steven feel about selling?

$Meeting_Transcript

Please answer the following question Question: How does Steven feel about selling?

Extract from transcript the most relevant segments for the answer, then answer the question.

manisnesan commented 6 months ago

https://www.reddit.com/r/LocalLLaMA/s/xeSFTXwa5q

manisnesan commented 6 months ago

https://community.openai.com/t/how-to-summarize-large-research-articles/142730

manisnesan commented 6 months ago

Five levels of summarizing Youtube

Usecase

YouTube Videos - Auto Chapter Generation Podcasts - Extract structured information Meeting Notes - Send topic summaries to participants Town Hall Meetings - Structured information Earnings Report Calls - Sell structured data to investment groups Legal Documents - Quickly summarize by topic Movie Scripts - Quick bullet points for production recaps Books - Auto generate table of contents

manisnesan commented 4 days ago

PYDATA - NYC 2024 The Art of Compression: Crafting Insightful Summaries with LLMs

As Large Language Models continue to advance, their application in text summarization presents both powerful opportunities and specific challenges. This talk will focus on practical strategies to overcome the limitations posed by context windows—a critical factor when dealing with extensive texts. The talk will also demonstrate how fine-tuning can improve summarization tasks for domain specific private datasets and when to use what. Attendees will learn how to build an end-to-end summarization workflow, with a focus on effective data chunking, prompt optimization, and advanced evaluation methods to ensure accurate and meaningful summaries. The session will cover three key summarization techniques—stuff, refine, and map-reduce—explaining when and how to use each approach. In addition, we’ll explore the latest in evaluation metrics, demonstrating how to leverage more sophisticated models as judges to refine and assess the quality of summaries.

Outline:

Background Knowledge Required:

Presentation - https://github.com/aartij22/Pydata-NYC-2024