ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Whisper won't build on MinGW without libmingw32_extended #746

Closed. bitwisexornot closed this issue 1 year ago.

bitwisexornot commented 1 year ago

I would guess that, due to recent changes (mmap?), there are some new function calls that are not covered by vanilla MinGW x64. I don't know whether or how VC builds are affected.

To resolve the build error I compiled and installed this repo: https://github.com/CoderRC/libmingw32_extended

Then I added -lmingw32_extended to the Makefile's link flags, and the issue was resolved.
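For anyone hitting the same error, this is roughly the change I mean (a hypothetical sketch, assuming the library was installed somewhere the linker already searches, e.g. under the MSYS2 prefix):

```makefile
# Hypothetical Makefile fragment: append the extra library to the link flags
# so the POSIX mlock/munlock symbols resolve on vanilla MinGW.
LDFLAGS += -lmingw32_extended
```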

This was not required when I compiled master sometime last month. Sorry, I don't have the build log, but I wanted to flag the new build issue and the solution I found.

From the repo description: POSIX functions that are included in this repository are pipe, wait, mmap, munmap, msync, mlock, munlock, posix_madvise, madvise, shm_open, shm_unlink, readv, writev, process_vm_readv, process_vm_writev, dlopen, etc.

ggerganov commented 1 year ago

Is this still present on latest master? What is the error?

bitwisexornot commented 1 year ago

As of commit 0a2d1210bcb98978214bbf4e100922a413afd39d

This was my error, involving mlock/munlock:

```
I whisper.cpp build info:
I UNAME_S:  MINGW64_NT-10.0-19044
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC
I LDFLAGS:
I CC:       cc.exe (Rev10, Built by MSYS2 project) 12.2.0
I CXX:      g++.exe (Rev10, Built by MSYS2 project) 12.2.0
```

```
cc  -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2 -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -c whisper.cpp -o whisper.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC examples/main/main.cpp examples/common.cpp ggml.o whisper.o -o main
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ggml.o:ggml.c:(.text+0x9950): undefined reference to `munlock'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ggml.o:ggml.c:(.text+0x9a20): undefined reference to `mlock'
collect2.exe: error: ld returned 1 exit status
make: *** [Makefile:217: main] Error 1
```
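So the failure is purely at link time: ggml.c references the POSIX mlock/munlock, which the stock MinGW CRT does not provide. Just to illustrate the shape of a possible fix (a hypothetical sketch, not the actual whisper.cpp code), such calls can be routed through a small wrapper that uses VirtualLock/VirtualUnlock on Windows and mlock/munlock elsewhere:

```c
// Hypothetical portability sketch (not the real ggml.c): pin/unpin memory
// pages via VirtualLock/VirtualUnlock on Windows, mlock/munlock on POSIX.
#include <stdbool.h>
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>

static bool lock_pages(void * addr, size_t len)   { return VirtualLock(addr, len)   != 0; }
static bool unlock_pages(void * addr, size_t len) { return VirtualUnlock(addr, len) != 0; }
#else
#include <sys/mman.h>

static bool lock_pages(void * addr, size_t len)   { return mlock(addr, len)   == 0; }
static bool unlock_pages(void * addr, size_t len) { return munlock(addr, len) == 0; }
#endif
```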

Current master builds without a problem, no extra library needed! Thanks for all your hard work.

BTW, I have a server without AVX2, so I'm hoping the older AVX instruction set is being utilized optimally (by a contributor; you have more pressing objectives). I don't know how to check that code myself. Whisper runs slowly on a 40-core E5-2790v2 setup compared with an 8-core Kaby Lake i7-7700 with AVX2. Should I post benchmarks in a new ticket? This affects the llama.cpp project too, so I'd like to give appropriate feedback/requests somewhere about verifying the AVX optimizations. Again, thanks; Whisper has become a great tool that I use regularly for transcribing lectures and memos.

ggerganov commented 1 year ago

I believe that beyond 8 CPU threads the computation becomes memory bound on most machines, so it does not matter whether you have 16 or 40 cores. On that note, there are some upcoming ideas for reducing the memory pressure and, additionally, for using 4-bit quantized Whisper models, but it is not yet clear what the benefits would be, if any.