Request: CUDA feature extraction for Windows

HunterAP23 commented 1 year ago

Referencing #1152, which added the ability to use Nvidia CUDA acceleration for calculating VMAF and related scores: it does not seem possible to compile a version of libvmaf with CUDA enabled for Windows using the existing pipeline.

The Windows build pipeline used here relies on MINGW64 with MSYS2, which does not have a way to install the necessary CUDA dependencies (MSYS2 uses the pacman package manager, for which Nvidia provides no official CUDA installation instructions, and the MSYS2 package repositories do not include CUDA).

Would it also be possible to include builds with CUDA enabled in the existing pipelines? I don't believe there should be any issues in terms of licensing, but someone more knowledgeable than me on that topic can confirm if this is correct.

gedoensmax commented 1 year ago

I can tell you that from the NV side we did not test under Windows and did not really target it. Have you considered using WSL with CUDA support? https://docs.nvidia.com/cuda/wsl-user-guide/index.html

HunterAP23 commented 1 year ago

I should've mentioned that I've also tried WSL2 with the correct WSL2 Ubuntu CUDA toolkit installation. These are the steps I've followed until I encounter an error:

git clone https://github.com/Netflix/vmaf
cd vmaf/libvmaf

# instructions from the libvmaf README
python3 -m pip install virtualenv
python3 -m virtualenv .venv
source .venv/bin/activate
pip install meson
sudo apt install nasm ninja-build doxygen xxd

# Install the CUDA toolkit, including NVCC
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt install cuda nvidia-cuda-toolkit nvcc

# Edited the libvmaf/meson_options.txt file to have "enable_cuda" set to "true"

# Go through with build instructions as per libvmaf README
meson build --buildtype release
ninja -vC build
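
As an alternative to editing meson_options.txt, the option can be passed at configure time. The -Denable_cuda=true flag is the one used later in this thread; the guard around the commands and the BUILD_DIR and MESON_ARGS variable names are assumptions added for illustration. A minimal sketch, assuming it is run from vmaf/libvmaf:

```shell
#!/bin/sh
# Assumption: the current directory is vmaf/libvmaf, where meson.build lives.
BUILD_DIR=build
MESON_ARGS="--buildtype release -Denable_cuda=true"

if command -v meson >/dev/null 2>&1 && [ -f meson.build ]; then
    # Word splitting of $MESON_ARGS is intentional here.
    meson setup "$BUILD_DIR" $MESON_ARGS
    ninja -vC "$BUILD_DIR"
else
    # Nothing is built in this case; only print what would have been run.
    echo "meson or meson.build not found; would run: meson setup $BUILD_DIR $MESON_ARGS"
fi
```

This keeps the checked-out meson_options.txt unmodified, so a later git pull does not conflict with a local edit.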

After that last step, I get an error saying:

FAILED: src/libcuda_common_vmaf_lib.a.p/cuda_integer_adm_adm_cm.cu.o
nvcc -Isrc/libcuda_common_vmaf_lib.a.p -Xcompiler=-Wall,-Winvalid-pch,-Wextra -O3 -Xcompiler=-fPIC -Isrc -I/usr/local/cuda/include -I/usr/local/cuda/include -I../src/cuda -I../src/feature -I../src/cuda -I../src/feature/common -I../src -Isrc -I../src/feature/common -I../src/feature -I../src -Isrc -I../include -Iinclude -I../src -Isrc -lineinfo -I../src -Isrc -Isrc/libcuda_common_vmaf_lib.a.p -o src/libcuda_common_vmaf_lib.a.p/cuda_integer_adm_adm_cm.cu.o -c ../src/cuda/integer_adm/adm_cm.cu
../src/cuda/integer_adm/adm_cm.cu(52): warning #68-D: integer conversion resulted in a change of sign

../src/cuda/integer_adm/adm_cm.cu(196): warning #186-D: pointless comparison of unsigned integer with zero

../src/cuda/integer_adm/adm_cm.cu(52): warning #68-D: integer conversion resulted in a change of sign

../src/cuda/integer_adm/adm_cm.cu(196): warning #186-D: pointless comparison of unsigned integer with zero

/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^
/usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 |         operator=(_Functor&& __f)

I tried to find any other error messages, but this is the only one I could find.

gedoensmax commented 1 year ago

For me compilation with WSL worked apart from a library issue that was solved by creating a symlink as explained here. I haven't had that before with WSL, so I'm not sure what is going on there. Nonetheless, I would recommend only doing sudo apt install cuda; specifying nvcc seems excessive. After installing CUDA you might want to add /usr/local/cuda/bin to your PATH.

HunterAP23 commented 1 year ago

So I added /usr/local/cuda/bin to my PATH, restarted WSL, and created the symlink with sudo ln -s /usr/lib/wsl/lib/libcuda.so.1 /usr/local/cuda/lib64/libcuda.so, but the same error appeared.

I then uninstalled nvidia-cuda-toolkit with sudo apt remove nvidia-cuda-toolkit and did a meson setup --wipe build just to clean up anything from prior build attempts.
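
The PATH part of this fix can be sketched as below. It assumes the toolkit from NVIDIA's repository is installed under /usr/local/cuda; prepend_path is a hypothetical helper, not part of any tool. One plausible reading, not confirmed in this thread, is that the distribution's nvidia-cuda-toolkit package provided a different nvcc that was picked up first, so checking which nvcc resolves on PATH is a cheap first diagnostic.

```shell
#!/bin/sh
# Hypothetical helper: put a directory at the front of PATH unless it is already there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumption: NVIDIA's packages installed the toolkit under /usr/local/cuda.
prepend_path /usr/local/cuda/bin
export PATH

if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc resolves to: $(command -v nvcc)"
    nvcc --version
else
    echo "nvcc not found on PATH"
fi

# Commands from the comment above, left commented out because they need sudo
# and an already configured build directory:
#   sudo apt remove nvidia-cuda-toolkit
#   meson setup --wipe build
```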

The build worked! But afterwards, when I ran ninja -vC build test, it failed on the gpumask test:

ninja: Entering directory `build'
[1/16] /usr/bin/meson --internal vcstagger ../include/vcs_version.h.in include/vcs_version.h 2.3.1 /mnt/d/Documents/Code_Projects/vmaf/libvmaf/include @VCS_TAG@ '(.*)' /usr/bin/git --git-dir /mnt/d/Documents/Code_Projects/vmaf/libvmaf/../.git describe --tags --long --match '?.*.*' --always
[1/2] /usr/bin/meson test --no-rebuild --print-errorlogs
 1/16 test_vmaf_cuda_gpumask              FAIL            0.10s   exit status 2
>>> MALLOC_PERTURB_=231 /mnt/d/Documents/Code_Projects/vmaf/libvmaf/tools/test/test_vmaf_cuda_gpumask.sh

stderr:
/bin/sh: 0: Illegal option -

... # skipping over passed tests

Summary of Failures:

 1/16 test_vmaf_cuda_gpumask      FAIL            0.10s   exit status 2

That just seems to be an issue with the test file vmaf\libvmaf\tools\test\test_vmaf_cuda_gpumask.sh:

#!/bin/sh -x
set -e

# no gpumask: use cuda
./tools/vmaf \
    --reference /dev/zero \
    --distorted /dev/zero \
    --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 \
    --frame_cnt 2 \
    --gpumask 0

# gpumask: use cpu
./tools/vmaf \
    --reference /dev/zero \
    --distorted /dev/zero \
    --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 \
    --frame_cnt 2 \
    --gpumask -1

# no gpumask: use cuda for vmaf features, cpu for psnr
./tools/vmaf \
    --reference /dev/zero \
    --distorted /dev/zero \
    --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 \
    --frame_cnt 2 \
    --gpumask 0 \
    --feature psnr \
    --output /dev/stdout

# gpumask: use cpu for vmaf features and psnr
./tools/vmaf \
    --reference /dev/zero \
    --distorted /dev/zero \
    --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 \
    --frame_cnt 2 \
    --gpumask -1 \
    --feature psnr

I'm not sure whether the error comes from #!/bin/sh -x, set -e, or some other line. The error message reports line 0, so I assume it's the #!/bin/sh -x shebang?

Regardless of the test script, the actual commands used for testing do work properly, and they also work with some test video files I have (getting >200fps on some y4m files). But now I'm not sure how to make this build for Windows, as the built vmaf binary is for Linux.
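
One possible cause, not confirmed in this thread, is Windows (CRLF) line endings in the script: the checkout lives under /mnt/d, and with a trailing carriage return on the shebang line /bin/sh sees an extra option character after -x. The sketch below shows how to check for and strip carriage returns. It works on a temporary stand-in file rather than the real test script, and has_crlf and strip_crlf are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for detecting and removing carriage returns.
CR=$(printf '\r')

has_crlf() {
    # Succeeds if the file contains at least one carriage return.
    grep -q "$CR" "$1"
}

strip_crlf() {
    # Rewrite the file in place without carriage returns, keeping its mode bits.
    tr -d '\r' < "$1" > "$1.tmp" && cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# Stand-in for test_vmaf_cuda_gpumask.sh with CRLF line endings.
tmp=$(mktemp)
printf '#!/bin/sh -x\r\nset -e\r\n' > "$tmp"

if has_crlf "$tmp"; then
    echo "CRLF line endings found, stripping them"
    strip_crlf "$tmp"
fi
```

If this turns out to be the cause, running strip_crlf on the real script, or re-cloning inside WSL with line-ending conversion disabled, would be the corresponding fix.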

1480c1 commented 1 year ago

steps I've tried so far

Invoke-RestMethod -Uri https://aka.ms/vs/17/release/vs_buildtools.exe -OutFile vs_buildtools.exe
.\vs_buildtools.exe
Install "MSVC v143 - VS 2022 C++ x64/x86 build tools (Latest)", "C++ Build Tools core features", and "Windows 1X SDK"
download & install the cuda sdk https://developer.nvidia.com/cuda-downloads
clone vmaf
& "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\Common7\Tools\Launch-VsDevShell.ps1" # probably $env:CUDA_PATH/bin is already in your path if you have restarted
$env:CC="ccache gcc"; $env:CXX="ccache g++"
meson -Denable_cuda=true build .\libvmaf\

Arrive at


Run-time dependency CUDA (modules: cudart) found: YES 12.0 (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.0)
libvmaf/src/meson.build:166: WARNING: add_languages is missing native:, assuming languages are wanted for both host and build.
Compiler for language cuda for the build machine not found.

libvmaf/src/meson.build:166:4: ERROR: Compiler nvcc can not compile programs.
...
Sanity testing Cuda compiler: nvcc
Is cross compiler: False.
Sanity check compiler command line: nvcc -w -cudart static C:/Users/cddeg/vmaf/bui/meson-private/sanitycheckcuda.cu -o C:/Users/cddeg/vmaf/bui/meson-private/sanitycheckcuda.exe
Sanity check compile stdout:
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.34.31933\include\vcruntime.h(197): error: invalid redeclaration of type name "size_t"
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.34.31933\include\vcruntime_new.h(48): error: first parameter of allocation function must be of type "size_t"
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.34.31933\include\vcruntime_new.h(53): error: first parameter of allocation function must be of type "size_t"
...


and give up after finding out that cuda sdk's nvcc doesn't understand gcc/g++ on Windows at all.

I already tried to see if I could get meson to use clang as the CUDA compiler (since FFmpeg was able to), but couldn't find out how.

kylophone commented 1 year ago

I've been thinking about this, it may make sense to eventually migrate to using nv-codec-headers. This gives us runtime dynamic linking, and should also simplify multi-platform building. This is what is used by FFmpeg.

HunterAP23 commented 1 year ago

I second the notion of migrating to the nv-codec-headers.

And speaking of FFmpeg, I'm reminded of this other project called media-autobuild_suite that allows Windows users to cross compile FFmpeg in a MINGW64 MSYS2 environment, and the interesting part is that they are able to do so with CUDA support. I'll look into their build scripts and see if it's possible to translate what they do to get CUDA working into VMAF directly.

kylophone commented 1 year ago

And speaking of FFmpeg, I'm reminded of this other project called media-autobuild_suite that allows Windows users to cross compile FFmpeg in a MINGW64 MSYS2 environment, and the interesting part is that they are able to do so with CUDA support. I'll look into their build scripts and see if it's possible to translate what they do to get CUDA working into VMAF directly.

That would be great, thank you.

HunterAP23 commented 1 year ago

So it seems that the media-autobuild_suite project gets CUDA and NVCC working like so:

The user has to download the Windows CUDA SDK. They outline what's necessary in this part of their README.
They use a program available within MINGW64 MSYS2 called cygpath to convert the CUDA_PATH environment variable (which should be set in Windows after installing the Windows CUDA SDK) into a Linux-style path. This part of their script uses this command to get the CUDA_PATH env var, which points to all the dependencies.
They then go through with the build using nvcc.exe.
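
The path conversion step can be sketched as follows. This is not media-autobuild_suite's actual script: win_to_unix_path is a hypothetical helper, the example CUDA_PATH value is taken from the meson output earlier in this thread, and the fallback branch is only a rough stand-in for machines where cygpath is not installed.

```shell
#!/bin/sh
# Hypothetical helper: convert a Windows path such as C:\foo\bar to /c/foo/bar.
win_to_unix_path() {
    if command -v cygpath >/dev/null 2>&1; then
        # Inside MSYS2 or Cygwin, let cygpath do the conversion.
        cygpath -u "$1"
    else
        # Rough fallback for simple drive-letter paths only.
        drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
        rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
        printf '/%s%s\n' "$drive" "$rest"
    fi
}

# Assumption: the Windows CUDA installer has set CUDA_PATH; the default below
# is only an example value.
CUDA_PATH="${CUDA_PATH:-C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.0}"
CUDA_UNIX_PATH=$(win_to_unix_path "$CUDA_PATH")
echo "CUDA toolkit (POSIX-style path): $CUDA_UNIX_PATH"
echo "nvcc.exe expected at: $CUDA_UNIX_PATH/bin/nvcc.exe"
```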

HunterAP23 commented 1 year ago

It's been quite a while since I provided an update. I changed the steps used in trying to compile VMAF for use in Windows by moving to using the ffmpeg-windows-build-helpers app to cross-compile FFmpeg with libvmaf support.

The issue comes down to getting CUDA to register correctly (at least on Ubuntu-WSL). These are the steps I've been following:

# Download CUDA 11.8 since CUDA 12 does not seem to work
# Instructions come from https://developer.nvidia.com/cuda-11-8-0-download-archive
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
sudo apt install -y nvidia-cuda-toolkit

# Clone the builder app
git clone https://github.com/rdp/ffmpeg-windows-build-helpers

# install dependencies for this app
sudo apt-get install subversion ragel curl texinfo g++ ed bison flex cvs yasm automake libtool autoconf gcc cmake git make pkg-config zlib1g-dev unzip pax nasm gperf autogen bzip2 autoconf-archive p7zip-full meson clang python3-distutils python-is-python3 -y
# apply a fix for Ubuntu 20.04 for WSL: https://github.com/rdp/ffmpeg-windows-build-helpers/issues/452
sudo dpkg -r --force-depends "libgc1c2"
git clone https://github.com/ivmai/bdwgc
cd bdwgc
./autogen.sh
./configure --prefix=/usr && make -j
sudo make install

# Make a copy of the cross_compile_ffmpeg.sh file for modification
cd ffmpeg-windows-build-helpers
cp cross_compile_ffmpeg.sh cross_compile_ffmpeg_libvmaf_cuda.sh

Next I do some hacky edits to the cross_compile_ffmpeg_libvmaf_cuda.sh file. Here is a diff between the old and modified versions:

1065c1065
<   do_git_checkout https://github.com/Netflix/vmaf.git vmaf_git v2.3.0
---
>   do_git_checkout https://github.com/Netflix/vmaf.git vmaf_git
1072c1072
<     local meson_options="--prefix=${mingw_w64_x86_64_prefix} --libdir=${mingw_w64_x86_64_prefix}/lib --buildtype=release --default-library=static . build"
---
>     local meson_options="--prefix=${mingw_w64_x86_64_prefix} --libdir=${mingw_w64_x86_64_prefix}/lib --buildtype=release -Denable_cuda=true --default-library=static . build"
2373c2373
<     config_options="$init_options --enable-libcaca --enable-gray --enable-libtesseract --enable-fontconfig --enable-gmp --enable-libass --enable-libbluray --enable-libbs2b --enable-libflite --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libvorbis --enable-libwebp --enable-libzimg --enable-libzvbi --enable-libmysofa --enable-libopenjpeg  --enable-libopenh264  --enable-libvmaf --enable-libsrt --enable-libxml2 --enable-opengl --enable-libdav1d --enable-cuda-llvm  --enable-gnutls"
---
>     config_options="$init_options --enable-libcaca --enable-gray --enable-libtesseract --enable-fontconfig --enable-gmp --enable-libass --enable-libbluray --enable-libbs2b --enable-libflite --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libvorbis --enable-libwebp --enable-libzimg --enable-libzvbi --enable-libmysofa --enable-libopenjpeg  --enable-libopenh264  --enable-nonfree --enable-ffnvcodec --enable-libvmaf --enable-libsrt --enable-libxml2 --enable-opengl --enable-libdav1d --enable-cuda-llvm  --enable-gnutls"

Essentially I'm telling the script to:

Pull the latest version of vmaf's master branch rather than a specific release
Use the -Denable_cuda=true flag for building libvmaf
Add --enable-nonfree --enable-ffnvcodec to the FFmpeg configure command

Then just run the modified cross compiler script: cross_compile_ffmpeg_libvmaf_cuda.sh and when prompted enter 3 for a Win64 build

The script typically fails with VMAF due to meson not finding the cudart_static package. I'm not sure what environment variable or otherwise may be missing that would cause this, but doing the meson setup command like so fixes the issue:

export MING_PREFIX="/home/hunterap/ffmpeg-windows-build-helpers/sandbox/cross_compilers/mingw-w64-x86_64/x86_64-w64-mingw32"
meson setup --reconfigure --prefix=${MING_PREFIX} --libdir=${MING_PREFIX}/lib --buildtype=release -Denable_cuda=true --default-library=static . build --cross-file=meson-cross.mingw.txt
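
To narrow down why meson cannot find cudart_static, it may help to check where the static runtime actually lives. A minimal sketch, assuming the toolkit is installed under /usr/local/cuda; CUDA_HOME is an assumed variable name and find_cudart_static a hypothetical helper. It only searches for the library files and does not change the meson setup.

```shell
#!/bin/sh
# Assumption: the CUDA toolkit root; override CUDA_HOME if it lives elsewhere.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Hypothetical helper: list static CUDA runtime libraries below a directory.
# Linux toolkits ship libcudart_static.a, Windows toolkits cudart_static.lib.
find_cudart_static() {
    [ -d "$1" ] || return 1
    # -H follows a symlink given on the command line (/usr/local/cuda often is one).
    find -H "$1" \( -name 'libcudart_static.a' -o -name 'cudart_static.lib' \) 2>/dev/null
}

found=$(find_cudart_static "$CUDA_HOME" || true)
if [ -n "$found" ]; then
    echo "cudart_static candidates:"
    echo "$found"
else
    echo "no cudart_static library found under $CUDA_HOME"
fi
```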

This works, but trying the compiler script again or just using the regular command of ninja -C build install (which the compiler script also does with some extra environment variables) still fails. This time it's due to errors in the attached log file: build.log

This was with CUDA 11.8, so it could be that this version is either too new or too old to work with VMAF. I'll try other CUDA versions and report back.

peterlus commented 4 months ago

Have you ever solved these issues and finished building a Windows version of vmaf with CUDA?

Netflix / vmaf

Request: CUDA feature extraction for Windows #1154