martinopiaggi / summarize

Video transcript summarization from multiple sources (YouTube, Dropbox, Google Drive, local files) using multiple LLM endpoints (OpenAI, Groq, custom model).
https://colab.research.google.com/drive/16sLs1fJ7inP1wKw90zgk7Q_88N4sFU1v

Groq API exact steps for google colab notebook #2

Closed timtensor closed 5 months ago

timtensor commented 5 months ago

Hi, what exactly are the steps for using the notebook for summarization? I have a valid Groq API key, but it showed an error message saying the API key was invalid.

Also, if we use transcription and the video is in a non-English language, what should be modified?

martinopiaggi commented 5 months ago

Hello, firstly, please be aware that it may take up to 10 minutes for a newly created Groq API key to become active. Anyway, make sure to select "groq" as the API endpoint. If you're using a YouTube video, the process involves:
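
Independently of those steps, one quick way to sanity-check a Groq key outside the notebook is to hit Groq's OpenAI-compatible endpoint directly. This is just a minimal sketch, not the notebook's own code, and the model name is only an example of one available on Groq:

```python
# Minimal key check against Groq's OpenAI-compatible endpoint (sketch, not the notebook's code)
from openai import OpenAI

client = OpenAI(
    api_key="gsk_...",  # your Groq API key
    base_url="https://api.groq.com/openai/v1",
)

resp = client.chat.completions.create(
    model="llama3-8b-8192",  # example model id; use any model currently offered by Groq
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(resp.choices[0].message.content)
```

If this call succeeds, the key itself is fine and the problem is in the notebook configuration.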

Regarding your second question about non-English language videos:
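
For what it's worth, openai-whisper can translate non-English audio into English while transcribing. A minimal sketch, assuming openai-whisper is used directly (the model size and file path are placeholders, not necessarily how the notebook wires it up):

```python
# Sketch: Whisper transcription with translation to English
import whisper

model = whisper.load_model("medium")  # placeholder model size
# task="translate" makes Whisper output English regardless of the source language
result = model.transcribe("audio.mp3", task="translate")
print(result["text"])
```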

I hope this clarifies your doubts.

timtensor commented 5 months ago

Thank you for the information. I will give it a try. Are Google's transcriptions better, or is Whisper better? I guess you're not sure about that, right?

But thanks a lot for explaining. I will try it out to get a gist of the summaries. Edit: it seems to work well on transcription. Even though the transcription was in German, it was automatically translated into English.

About the Whisper model for transcribing, could this be a good alternative? https://github.com/huggingface/distil-whisper

martinopiaggi commented 5 months ago

No problem! You're welcome :)

Regarding Whisper vs. YouTube auto-generated captions: Whisper is actually better for transcription. If you look at the Whisper documentation, you'll see that you can use a more accurate variant like "medium" or "large" by simply changing a word in this Python notebook. Just keep in mind that more accuracy means slower transcription.
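
Concretely, assuming the notebook calls openai-whisper directly, the change amounts to swapping the model name passed to `load_model`:

```python
import whisper

# More accurate variants are slower: tiny < base < small < medium < large
model = whisper.load_model("medium")  # e.g. swap "base" for "medium" or "large"
```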

The alternative you're considering seems valid, and I think you could use it without changing too much code. At the start of this project, I was using "faster-whisper," which gives a big performance boost. However, I had to drop it because it's not compatible with the most recent CUDA API, which I have installed locally on my desktop.
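
If you experiment with distil-whisper, the model card suggests it drops in through the Hugging Face transformers ASR pipeline. A rough sketch under those assumptions (model id, file path, and batching parameters are just examples, not code from this notebook):

```python
# Sketch: distil-whisper via the transformers automatic-speech-recognition pipeline
import torch
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",  # example distil-whisper checkpoint
    torch_dtype=torch.float16,
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# chunk_length_s and batch_size control long-form transcription throughput
result = pipe("audio.mp3", chunk_length_s=30, batch_size=8)
print(result["text"])
```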

timtensor commented 5 months ago

@martinopiaggi - sorry to ask again, but I had another question. There are podcasts on YouTube for which this summarization is a very good use case. Some podcasts have show notes or chapters listed out. Wouldn't it be possible to get chapter-like summaries as well? I'm wondering whether that is already handled, or whether it currently just chunks with size 4096.

martinopiaggi commented 5 months ago

Currently, it's chunking at 4096, yes. Honestly, I'm satisfied with this approach because it's general: it works with any kind of video. Your use case is specific (if I understood correctly, you want the final summary to follow the structure of the chapters in the video's description) and can present multiple challenges.
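
To illustrate the general approach, here is a simplified, character-based sketch of fixed-size chunking (the notebook itself may count tokens and add overlap between chunks):

```python
def chunk_transcript(text: str, chunk_size: int = 4096) -> list[str]:
    """Split a transcript into fixed-size chunks; each chunk is summarized separately."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

A chapter-aware variant would instead split the transcript at the timestamps listed in the video description, which is exactly where the extra parsing and error handling come in.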

timtensor commented 4 months ago

Yes, in a nutshell. I haven't experimented much, so I'm not sure if it's better. But considering that chapters are a logical break between contexts, it could be helpful.

I am curious about this chunking: are you basically building a small RAG system? If so, could this concept or your implementation be used for Q&A systems as well?