Marksdo / Whisper

Batch Local Transcribe Audio/Movie To Text With Whisper AI Model. Keep Privacy Safe!
https://whisper.marksdo.com

DeepGram shows blank transcription for video longer than an hour. #12

Closed by holanino 5 months ago

holanino commented 7 months ago

For some reason, when I select DeepGram Whisper, it shows a blank transcription when finished for videos longer than an hour. It does not work with either the medium or large Whisper model; none of the models work.

Also it says "No Logs. Please Start Process Task First (Old Project Before V5.0 No Logs)"

I do not know what is going on. It does work with DeepGram Enhanced and with videos shorter than one hour.

(I can't add screenshots here for some reason. I tried two browsers. Is GitHub having trouble?)

Marksdo commented 7 months ago

I just tested a video longer than 1 hour, and it got a result from Deepgram.

Can you send me the file that you can't transcribe with the Deepgram API? If it fails, I will debug it to see what is happening. Every file I have tested so far has worked, and the response has always been a success. So I need an example file that fails, so that I can capture the error response and display it.

"No Logs. Please Start Process Task First (Old Project Before V5.0 No Logs)": this message refers to logs from the local Whisper model process. When you use Deepgram remote transcription, there are no log details.

Marksdo commented 7 months ago

I tested another 2-hour video. It can also be transcribed with Deepgram. Can you send me the file you can't transcribe so I can test it and track down the issue?

holanino commented 7 months ago

This is a link to a video that does not work in Whisper Mate. The result is empty/blank.

https://www.youtube.com/watch?v=DQhNHo6Ioa8

You can use this to download - https://ssyoutube.com/

Try it, and if not, I can let you use my API key. I tried in Whisper Mate 5.4.3 and 5.4.4, and it does not work in either.

I cut the video in half, and both part 1 and part 2 worked.

Marksdo commented 7 months ago

I used Whisper Mate's built-in video download plugin to download the video, and transcription with Deepgram succeeded. (Maybe something is wrong with the file you downloaded from ssyoutube.com?)

BTW: Whisper Mate itself has a plugin for downloading video from websites. Just enable it in settings. It can download media directly from well-known video websites.

Marksdo commented 7 months ago

Preference > Plugin > Download. Remember to set a download folder first. Because Whisper Mate runs in sandbox mode, you need to grant it permission first.

About uploading images: I ran into the same issue you mentioned and couldn't upload images to GitHub with Chrome. I switched to Safari, and uploading works now.

Marksdo commented 7 months ago

Tip: When the download plugin is enabled, you can also paste a URL directly into Whisper Mate while it is the frontmost app. There is no need to click the add URL button in the main toolbar. 😀

holanino commented 7 months ago

Thank you very much for your time. Whisper Mate downloaded a video that just finished from a livestream and it would not transcribe.

I tried Audio WEBM, Audio MP4, and Video MP4; none of them worked. I downloaded the video with another app and converted it to MP3, and that transcription worked.

For a new livestream that has just finished, I guess I will have to convert to MP3 first.
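
A minimal sketch of that conversion step, for anyone who wants to script it. The thread does not say which app did the conversion, so the use of ffmpeg (assumed to be installed and on PATH), the helper names, and the 128k bitrate are all assumptions here:

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_mp3_command(source: str, bitrate: str = "128k") -> list[str]:
    """Return an ffmpeg command that extracts the audio track as MP3.

    The output file sits next to the input, with the suffix replaced by .mp3.
    """
    target = str(Path(source).with_suffix(".mp3"))
    return [
        "ffmpeg",
        "-i", source,           # input file (WEBM, MP4, ...)
        "-vn",                  # drop the video stream
        "-c:a", "libmp3lame",   # re-encode the audio as MP3
        "-b:a", bitrate,        # audio bitrate (assumed default)
        target,
    ]


def convert_to_mp3(source: str) -> str:
    """Run the conversion and return the path of the MP3 file."""
    command = build_mp3_command(source)
    subprocess.run(command, check=True)  # raises if ffmpeg exits with an error
    return command[-1]
```

Calling `convert_to_mp3("livestream.webm")` would write `livestream.mp3` next to the original. Re-encoding also rewrites the container, which may be why the converted file transcribes when the raw download does not, but that is a guess.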

I will make a new post about why I am using the DeepGram Whisper API instead of local Whisper, which comes down to the difference in transcriptions.

Marksdo commented 6 months ago

Hi, sorry, I just saw your response. Can you send me the downloaded livestream file?

I found that some reported issues involve failing to get audio from a video. Your file may be one of these; I think a livestream may produce a media file with mismatched timestamps. Whisper Mate may be able to fix this automatically, so I need an example to verify that the fix really works. Can you send me the URL or the file?

holanino commented 5 months ago

It is doing it again, and I am not sure why. When I upload a video to be transcribed by Deepgram, it does not work. It fails for videos over 15 minutes and works for videos under 15 minutes.

I paste the video link into Whisper Mate and choose audio MP4 or MP3, and it does not work. The transcription is empty.

Example Video:

https://www.youtube.com/watch?v=sv3-A1H-hes

[Screenshot 2024-01-18 at 7:59:59 PM]

holanino commented 5 months ago

Looks like Whisper Cloud has a 20-minute processing limit now? Is this why the Whisper transcriptions are failing?

https://developers.deepgram.com/docs/deepgram-whisper-cloud

Also, can we get the Nova 2 model?

Marksdo commented 5 months ago

According to the Deepgram documentation, it is likely that the processing time for a single file is limited. Can you try splitting the file into multiple projects (use Whisper Mate Quick Cut) to see whether that works?

https://developers.deepgram.com/docs/getting-started-with-pre-recorded-audio#maximum-processing-time

Maximum Processing Time

Nova, Base, and Enhanced provide extremely fast transcription. Deepgram limits the maximum processing time to 10 minutes for these models.

Whisper is much slower than the other models, and the maximum processing time is 20 minutes for Whisper.

If a request takes longer than the maximum processing time to complete, the request is cancelled and a 504: Gateway Timeout error is returned.
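
As a rough illustration of the splitting workaround, here is a sketch that plans equal-length cut points for a long file. Note that the documented limit applies to processing time, not media duration, so the 10-minute default chunk length below is only an assumed safe value, and `plan_chunks` is a hypothetical helper, not part of Whisper Mate or the Deepgram API:

```python
from __future__ import annotations

import math


def plan_chunks(
    total_seconds: float,
    max_chunk_seconds: float = 600.0,  # assumed safe length, not a Deepgram value
) -> list[tuple[float, float]]:
    """Split a duration into equal (start, end) ranges no longer than the maximum."""
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")
    if total_seconds <= 0:
        return []

    count = math.ceil(total_seconds / max_chunk_seconds)
    length = total_seconds / count

    chunks = []
    for index in range(count):
        start = index * length
        # Pin the final end to the exact total to avoid floating-point drift.
        end = total_seconds if index == count - 1 else (index + 1) * length
        chunks.append((start, end))
    return chunks
```

For a 2-hour file, `plan_chunks(7200)` yields twelve 10-minute ranges. Each range can then be cut into its own project and transcribed separately, and the subtitle timestamps of each part offset by that part's start time when merging.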

holanino commented 5 months ago

The reason I use DeepGram Whisper is that it does not group subtitles into paragraphs or groups of words as much as local Whisper does.

I wish the local Whisper Mate app would time subtitles to match when the speaker is actually speaking.

The current Whisper medium and large models group words into blocks, and then there is an empty/silent stretch even though the speaker is still speaking.

There are no pauses to match the timing of the speaker.
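
To make the request concrete, here is a sketch of the kind of regrouping I mean. It assumes word-level timestamps are available from the model, which may not be the case in every setup; the `Word` and `Cue` types, the 0.6-second gap, and the 8-word cap are all made up for illustration:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


def _to_cue(words: list[Word]) -> Cue:
    """Join words into one subtitle cue spanning the first start to the last end."""
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def group_by_pause(
    words: list[Word],
    max_gap: float = 0.6,  # assumed pause threshold in seconds
    max_words: int = 8,    # assumed cap on words per cue
) -> list[Cue]:
    """Start a new cue whenever the speaker pauses or the cue gets too long."""
    cues: list[Cue] = []
    current: list[Word] = []
    for word in words:
        if current:
            gap = word.start - current[-1].end
            if gap > max_gap or len(current) >= max_words:
                cues.append(_to_cue(current))
                current = []
        current.append(word)
    if current:
        cues.append(_to_cue(current))
    return cues
```

With this approach each cue starts and ends on actual speech, so a pause in the audio shows up as a gap between subtitles instead of a long block followed by silence.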

Marksdo commented 5 months ago

It seems that DeepGram does some pre-processing of the audio.

I think it may first split the audio into chunks by speaker and then transcribe them.

holanino commented 5 months ago

Can we get Nova 2 as an option? The app only has Nova (1).

Marksdo commented 5 months ago

Thanks for the info. I will add it in V5.5.0.

holanino commented 5 months ago

Thank you