ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Add support for decoding input with ffmpeg (Linux) #2133

WilliamTambellini closed 5 months ago

WilliamTambellini commented 5 months ago

WIP: for early review only. Do not merge.

WilliamTambellini commented 5 months ago

@ggerganov @slaren could you please do an early review before I move on? Best, WT.

ggerganov commented 5 months ago

We can add the FindFFmpeg.cmake script in the cmake folder and use it to find ffmpeg libs

Probably the conversion functionality should be implemented in common.cpp so that it can be reused by all examples, not just main

WilliamTambellini commented 5 months ago

> We can add the FindFFmpeg.cmake script in the cmake folder and use it to find ffmpeg libs

done

> Probably the conversion functionality should be implemented in common.cpp so that it can be reused by all examples, not just main

ok

WilliamTambellini commented 5 months ago

@ggerganov reready for review, tks

WilliamTambellini commented 5 months ago

@petterreinholdtsen review please

WilliamTambellini commented 5 months ago

@arthw review please

WilliamTambellini commented 5 months ago

@ggerganov retouched. Reready for final review.

ggerganov commented 5 months ago

Hm, I think you didn't push the correct revision - I don't see any changes since last time

WilliamTambellini commented 5 months ago

Oops, indeed, @ggerganov. Just pushed the latest retouches.

WilliamTambellini commented 5 months ago

Thanks @ggerganov. Any way to do a new minor release soon (e.g. 1.6.1)?

ggerganov commented 5 months ago

done

data-man commented 5 months ago

Unfortunately, this cannot be built with FFmpeg 7.0.

clort81 commented 4 months ago

1b51fdf170714dcdd8fb9cfd02dcee684aac6150 is the first bad commit
commit 1b51fdf170714dcdd8fb9cfd02dcee684aac6150
Author: William Tambellini <wtambellini@sdl.com>
Date:   Tue May 21 08:31:41 2024 -0700

    examples : add support for decoding input with ffmpeg (Linux) (#2133)
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp: In function ‘int decode_audio(audio_buffer*, s16**, int*)’:
/pr/Neural/Voice_Recognition_Whispr_GGML/good-whisper.cpp/examples/ffmpeg-transcode.cpp:207:5: error: ‘av_register_all’ was not declared in this scope
  207 |     av_register_all(); // from avformat. Still a must-have call for ffmpeg v3! (can be skipped for later versions)
      |     ^~~~~~~~~~~~~~~

ffmpeg 7:5.1.4-0+deb12u1

Devuan Linux

Displacer commented 3 months ago

can probably be fixed with:

#if LIBAVFORMAT_VERSION_MAJOR < 56
av_register_all(); // from avformat. Still a must-have call for ffmpeg v3! (can be skipped for later versions)
#endif

Change 56 to the correct libavformat major version. media-video/ffmpeg-4.4.4 seems to have version 56 (but I am not sure).
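
For reference, a minimal sketch of that guard with the threshold filled in. As far as I know, the libavformat major version is 57 for FFmpeg 3.x, 58 for 4.x and 59 for 5.x, and `av_register_all()` was deprecated (and became unnecessary) in libavformat 58 before being removed in 59, which would match the build failure with ffmpeg 5.1.4 reported above. Treat those numbers as an assumption to verify against FFmpeg's API change log. The helper name `needs_av_register_all` and the constant are hypothetical; the version is passed as a parameter only so the snippet compiles without FFmpeg headers, whereas the real file would test `LIBAVFORMAT_VERSION_MAJOR` directly.

```cpp
// Sketch only; not the code from the PR. All names below are hypothetical.

// Assumption: av_register_all() is only required before libavformat 58
// (FFmpeg 4.0), where it was deprecated, and it is gone in 59 (FFmpeg 5.0).
constexpr int kRegisterAllOptionalSinceMajor = 58;

// Returns true when the given libavformat major version still needs
// an explicit av_register_all() call before opening any input.
constexpr bool needs_av_register_all(int lavf_major) {
    return lavf_major < kRegisterAllOptionalSinceMajor;
}

// In ffmpeg-transcode.cpp the equivalent preprocessor guard would be:
//
//   #if LIBAVFORMAT_VERSION_MAJOR < 58
//       av_register_all();
//   #endif
```

Using 58 rather than 59 as the cutoff also avoids the deprecation warning on FFmpeg 4.x, where the call still compiles but should no longer be needed.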