forge adaptation - Githubissues

With the new Forge, this line:

driving_video_pickle_input = gr.File(type="file", file_types=[".pkl"])

needs to change to:

driving_video_pickle_input = gr.File(file_types=[".pkl"])

thanks in advance!

Hi,

Yes, I guess it's because Forge updated its Gradio version to 4.x while SD WebUI is still on 3.x. I have to check this carefully because:

- in 3.x the pickle input gives a file object with a .name property, while in 4.x we get the file name directly,
- the extension must handle both major Gradio versions to stay compatible with both Forge and SD WebUI.
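A minimal sketch of how one code path could serve both Gradio major versions, based only on the behaviour described above. The helper names file_component_kwargs and uploaded_path are hypothetical and are not taken from the extension's code.

```python
from types import SimpleNamespace


def file_component_kwargs(gradio_version: str) -> dict:
    """Build keyword arguments for gr.File that suit the given Gradio version."""
    kwargs = {"file_types": [".pkl"]}
    major = int(gradio_version.split(".")[0])
    if major < 4:
        # Gradio 3.x: type="file" hands the callback a temp-file object.
        kwargs["type"] = "file"
    # Gradio 4.x: leave "type" out, so the callback receives a path string.
    return kwargs


def uploaded_path(value) -> str:
    """Return the uploaded file's path for either Gradio major version."""
    if value is None:
        raise ValueError("no file was uploaded")
    if isinstance(value, str):
        return value   # 4.x: the value is already the file name
    return value.name  # 3.x: the value is an object with a .name property


if __name__ == "__main__":
    # SimpleNamespace stands in for the 3.x temp-file object (assumption).
    print(file_component_kwargs("3.41.2"))
    print(uploaded_path(SimpleNamespace(name="driving.pkl")))
    print(uploaded_path("driving.pkl"))
```

In an extension, the version string would presumably come from gradio.__version__, and the returned dictionary would be unpacked into gr.File(**kwargs).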

I'll see next week how to do this.

Hi @CoqueTornado,

I've updated the extension code to enable compatibility with Forge by fixing issues with Pydantic and Gradio. You should be able to get these modifications by going to your "Extensions" tab and clicking on "Check for updates" and then "Apply and restart UI".

Is it working for you in Forge now?

I have installed this in my Forge installation and it is working. Thank you for bringing this tool to Forge!

SD forge is currently running on CUDA 12.1

Can you please list the steps? I have been trying to install it for the past 16 hours. I installed the latest NVIDIA driver with CUDA 12.6.65 and CUDA Toolkit 12.6.2, plus both the 2019 and 2022 Visual Studio Build Tools, and I still get this error:

"Building of OP file for XPose has failed. Check the log file in the extension's 'logs' folder for more information."

Should I install the same CUDA version as Forge? Kindly explain the procedure. Live Portrait does work for human heads, though.

Should I install the same CUDA version as forge?

Yes, absolutely, otherwise it will not work (only the major version matters, though: either 11 or 12). If you look at the sd-forge\extensions\sd-webui-live-portrait\logs\xpose.err.log file, you should see an error message saying this (if not, posting the content of this file here would help me).
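These build logs can be very long, so here is a small sketch for pulling out only the lines that report a failure. The marker strings and the function name find_error_lines are assumptions chosen for illustration, and the folder path is the one quoted above, so adjust it to your own installation.

```python
from pathlib import Path


def find_error_lines(log_text: str) -> list[str]:
    """Return the lines of a build log that mention an error or a failed step."""
    markers = ("fatal error", "error:", "FAILED:")  # assumed markers
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


if __name__ == "__main__":
    log_dir = Path(r"sd-forge\extensions\sd-webui-live-portrait\logs")
    for name in ("xpose.err.log", "xpose.log"):
        log_file = log_dir / name
        if log_file.is_file():
            text = log_file.read_text(encoding="utf-8", errors="replace")
            for line in find_error_lines(text):
                print(f"{name}: {line}")
```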

Your CUDA_PATH environment variable should be something like C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6 if your Forge CUDA version is v12.6. You can check that the CUDA version is correct by opening a cmd window (press Windows+R, then type cmd) and running the nvcc --version command. The output should report the release that matches your Forge CUDA version.
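The same check can be scripted. The sketch below reads the major version from CUDA_PATH and from the nvcc --version output so the two can be compared with Forge's CUDA major version. The function name cuda_major is hypothetical, and the regular expression assumes the usual "release 12.1" wording in the nvcc output and a "v12.6" style folder name.

```python
import os
import re
import shutil
import subprocess


def cuda_major(text: str):
    """Extract the major CUDA version from text such as 'release 12.1' or 'v12.6'."""
    match = re.search(r"(?:release |v)(\d+)\.\d+", text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print("CUDA_PATH major:", cuda_major(os.environ.get("CUDA_PATH", "")))
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc was not found on PATH")
    else:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        print("nvcc major:", cuda_major(result.stdout))
```

If the two numbers differ from each other or from Forge's major version, CUDA_PATH most likely still points at another toolkit.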

Then, close and restart SD Forge for it to take this new environment variable into account (don't forget this as otherwise SD Forge will still use the old environment variable value if you changed it) and then, in the Live Portrait Animals tab, click on the "Reinstall XPose and Restart WebUI" button.

Please, tell me if this solves your installation issue.

I used the NVIDIA cleanup tool to remove all the NVIDIA drivers and toolkits, then did a fresh install of the CUDA 12.1.0 toolkit with its display driver.

Visual Studio still has the 2019 and 2022 build tools.

This is the xpose.log file (xpose.err.log is empty):

[vcvarsall.bat] Environment initialized for: 'x86_x64'
running build
running build_py
creating build
creating build\lib.win-amd64-3.10
creating build\lib.win-amd64-3.10\functions
copying functions\ms_deform_attn_func.py -> build\lib.win-amd64-3.10\functions
copying functions\__init__.py -> build\lib.win-amd64-3.10\functions
creating build\lib.win-amd64-3.10\modules
copying modules\ms_deform_attn.py -> build\lib.win-amd64-3.10\modules
copying modules\ms_deform_attn_key_aware.py -> build\lib.win-amd64-3.10\modules
copying modules\__init__.py -> build\lib.win-amd64-3.10\modules
running build_ext
building 'MultiScaleDeformableAttention' extension
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda
C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Emitting ninja build file C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu\ms_deform_attn_cpu.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu\ms_deform_attn_cpu.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
[2/3] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
FAILED: C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/build/temp.win-amd64-3.10/Release/Users/Creative/AppData/Local/Temp/tmp66gfm476/src/vision.obj
cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch/csrc/python_headers.h(12): fatal error C1083: Cannot open include file: 'Python.h': No such file or directory
[3/3] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu -o C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(261): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_im2col_gpu_kernel(int, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=double]" at line 946 instantiation of "void ms_deformable_im2col_cuda(cudaStream_t, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=double]" at line 64 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(261): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_im2col_gpu_kernel(int, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=float]" at line 946 instantiation of "void ms_deformable_im2col_cuda(cudaStream_t, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=float]" at line 64 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(762): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1001 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(872): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_gm(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1024 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=1U]" at line 1050 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=2U]" at line 1072 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=4U]" at line 1094 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=8U]" at line 1116 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=16U]" at line 1138 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=32U]" at line 1160 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=64U]" at line 1182 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=128U]" at line 1204 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=256U]" at line 1226 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=512U]" at line 1248 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=1024U]" at line 1270 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(544): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1294 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(649): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1317 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(762): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1001 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(872): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_gm(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1024 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=1U]" at line 1050 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=2U]" at line 1072 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=4U]" at line 1094 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=8U]" at line 1116 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=16U]" at line 1138 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=32U]" at line 1160 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=64U]" at line 1182 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=128U]" at line 1204 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=256U]" at line 1226 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=512U]" at line 1248 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=1024U]" at line 1270 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(544): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1294 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(649): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1317 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu

ms_deform_attn_cuda.cu
tmpxft_0000219c_00000000-7_ms_deform_attn_cuda.cudafe1.cpp
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2107, in _run_ninja_build
    subprocess.run(
  File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\setup.py", line 64, in <module>
    setup(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\setuptools\__init__.py", line 104, in setup
    return distutils.core.setup(**attrs)
  File "distutils\core.py", line 148, in setup
  File "distutils\dist.py", line 966, in run_commands
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\setuptools\dist.py", line 967, in run_command
    super().run_command(command)
  File "distutils\dist.py", line 985, in run_command
  File "distutils\command\build.py", line 135, in run
  File "distutils\cmd.py", line 313, in run_command
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\setuptools\dist.py", line 967, in run_command
    super().run_command(command)
  File "distutils\dist.py", line 985, in run_command
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\setuptools\command\build_ext.py", line 91, in run
    _build_ext.run(self)
  File "distutils\command\build_ext.py", line 340, in run
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 870, in build_extensions
    build_ext.build_extensions(self)
  File "distutils\command\build_ext.py", line 449, in build_extensions
  File "distutils\command\build_ext.py", line 474, in _build_extensions_serial
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\setuptools\command\build_ext.py", line 252, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\Cython\Distutils\build_ext.py", line 135, in build_extension
    super(build_ext, self).build_extension(ext)
  File "distutils\command\build_ext.py", line 529, in build_extension
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 842, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 1783, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2123, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

I am running SD forge through Stability Matrix

Ah OK, I see: the problem is linked to StabilityMatrix then, which does not ship the Python files needed to install XPose (the AI model used for animal live portraits).

I managed to make it work by following this looooooooong procedure! If you're ready to continue, so am I :).

Installation procedure for StabilityMatrix

Close StabilityMatrix
Uninstall all versions of Visual Studio Build Tools 2022 and install Visual Studio Build Tools 2019 only (this step is really important otherwise it may not work).
Install Python 3.10.11 using the installer at the bottom of this page: https://www.python.org/downloads/release/python-31011/ (see "Windows installer (64-bit)" under "Files").
Once installed, open a new cmd window (press Windows+R, then type cmd) and verify that Python is correctly installed by running the python --version command, which should display "Python 3.10.11".
Make sure CUDA 12.x is correctly installed using the nvcc --version command, as discussed previously.
Create a C:\Temp folder.
Copy the content of the C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui-forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\ops folder into C:\Temp (you should have C:\Temp\setup.py, C:\Temp\test.py, some folders, etc.).
In your cmd window (reopen it if necessary), type the following commands (one command per line; press Enter after each):
```
cd C:\Temp
python -m venv venv
venv\Scripts\activate
```
After this last command, a (venv) should have appeared on the left of the command line. Continue with:
```
pip install torch==2.3.1 torchvision==0.18.1 --extra-index-url https://download.pytorch.org/whl/cu121
```
This will download PyTorch (2.4 GB), so it may take a while. After a successful install, run:
```
python setup.py build
```
This will build the necessary file that StabilityMatrix was not able to build. If this command is successful, you should have a new file under a folder such as C:\Temp\build\lib.win-amd64-cpython-310\MultiScaleDeformableAttention.cp310-win_amd64.pyd (the name will depend on your machine configuration but the extension of this file should be .pyd).

If you managed to create this file, then you're almost finished. You must copy this file back into the C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui-forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\ops\lib folder (create the lib folder if it does not exist).
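If you prefer not to copy the file by hand, the copy step can be sketched in Python. This is only an illustration: `copy_built_pyd` is a hypothetical helper name, and it assumes the build produced a single .pyd file somewhere under the build folder.

```python
import shutil
from pathlib import Path


def copy_built_pyd(build_dir: Path, ops_dir: Path) -> Path:
    """Copy the first .pyd found under build_dir into ops_dir / "lib".

    Hypothetical helper: assumes the build produced one .pyd file.
    """
    candidates = sorted(build_dir.rglob("*.pyd"))
    if not candidates:
        raise FileNotFoundError(f"no .pyd file found under {build_dir}")
    lib_dir = ops_dir / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)  # create the lib folder if missing
    return Path(shutil.copy2(candidates[0], lib_dir))
```

You would call it with `Path(r"C:\Temp\build")` as the first argument and the extension's `ops` folder as the second.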

You can finally relaunch StabilityMatrix and verify that the animal tab is not empty anymore. If all went well, you can safely close your cmd window and remove the C:\Temp folder.
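Since only the major CUDA version has to match, the check done with nvcc --version in the procedure above can also be sketched in Python. This is a rough illustration, not part of the extension: both function names are hypothetical, and it assumes the nvcc output contains a "release X.Y" string and that the PyTorch wheel tag looks like cu121 (the last digit being the minor version).

```python
import re
from typing import Tuple


def nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def torch_cuda_major(wheel_tag: str) -> int:
    """Map a wheel tag such as 'cu121' to its major version (assumed format)."""
    digits = wheel_tag[2:] if wheel_tag.startswith("cu") else wheel_tag
    return int(digits[:-1])  # drop the trailing minor digit


# Sample line in the format nvcc prints; your own output will differ.
sample = "Cuda compilation tools, release 12.1, V12.1.105"
same_major = nvcc_release(sample)[0] == torch_cuda_major("cu121")
```

If `same_major` is false, the installed toolkit and the PyTorch build target different CUDA major versions.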

I hope this procedure will work for you!

I followed every step closely, but the .pyd file still does not compile.

Installed: Python 3.10.11, CUDA 12.1.

Installed torch.

Only the 2019 Build Tools are installed.

When I run the final build step to produce the .pyd file, it ends with an error like this:

(venv) c:\temp>python setup.py build running build running build_py running build_ext c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py:384: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified warnings.warn(f'Error checking compiler version for {compiler}: {error}') building 'MultiScaleDeformableAttention' extension c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST']. warnings.warn( Emitting ninja build file c:\temp\build\temp.win-amd64-cpython-310\Release\build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Ic:\temp\src -Ic:\temp\venv\lib\site-packages\torch\include -Ic:\temp\venv\lib\site-packages\torch\include\torch\csrc\api\include -Ic:\temp\venv\lib\site-packages\torch\include\TH -Ic:\temp\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -Ic:\temp\venv\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\include 
-IC:\Users\Creative\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c c:\temp\src\cuda\ms_deform_attn_cuda.cu -o c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 FAILED: c:/temp/build/temp.win-amd64-cpython-310/Release/temp/src/cuda/ms_deform_attn_cuda.obj C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe 
--diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Ic:\temp\src -Ic:\temp\venv\lib\site-packages\torch\include -Ic:\temp\venv\lib\site-packages\torch\include\torch\csrc\api\include -Ic:\temp\venv\lib\site-packages\torch\include\TH -Ic:\temp\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -Ic:\temp\venv\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c c:\temp\src\cuda\ms_deform_attn_cuda.cu -o c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\crt/host_config.h(153): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! 
The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
ms_deform_attn_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2107, in _run_ninja_build
    subprocess.run(
  File "C:\Users\Creative\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\temp\setup.py", line 64, in <module>
    setup(
  File "c:\temp\venv\lib\site-packages\setuptools\__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
    return run_commands(dist)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
    dist.run_commands()
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "c:\temp\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\command\build.py", line 132, in run
    self.run_command(cmd_name)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "c:\temp\venv\lib\site-packages\setuptools\dist.py", line 1217, in run_command
    super().run_command(command)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
    cmd_obj.run()
  File "c:\temp\venv\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
    _build_ext.run(self)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
    self.build_extensions()
  File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 870, in build_extensions
    build_ext.build_extensions(self)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 466, in build_extensions
    self._build_extensions_serial()
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 492, in _build_extensions_serial
    self.build_extension(ext)
  File "c:\temp\venv\lib\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "c:\temp\venv\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 547, in build_extension
    objects = self.compiler.compile(
  File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 842, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 1783, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2123, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

(venv) c:\temp>

Thanks for your patience and for trying to help me. Anyway, I have spent the entire day on this and am still not able to build it.

I will give it a break, though I feel totally dejected when I can't run LivePortrait. I will wait till I find a solution.
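A note on reading the build log above: the include paths passed to nvcc point at Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120, so a Visual Studio 2022 toolset is still being picked up even though only the 2019 Build Tools were meant to be used. Here is a small Python sketch that pulls the toolset version out of such a command line. The function names are hypothetical, and the accepted `_MSC_VER` range used below (1910 up to, but excluding, 1940) is my assumption about what the CUDA 12.1 header checks; it is not stated anywhere in this thread.

```python
import re
from typing import Optional, Tuple


def msvc_toolset(command_line: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) of the MSVC toolset found in an include path."""
    match = re.search(r"\\MSVC\\(\d+)\.(\d+)\.\d+\\", command_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def msc_ver(toolset: Tuple[int, int]) -> int:
    """Map a 14.xx toolset to its _MSC_VER value, e.g. (14, 41) -> 1941."""
    major, minor = toolset
    return (major + 5) * 100 + minor


def accepted_by_cuda_12_1(toolset: Tuple[int, int]) -> bool:
    """Assumed check: 1910 <= _MSC_VER < 1940 (not confirmed in this thread)."""
    return 1910 <= msc_ver(toolset) < 1940


# One include path copied from the failing nvcc command in the log.
flag = r'"-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include"'
toolset = msvc_toolset(flag)
```

Under that assumption, toolset 14.41 falls outside the accepted range, which would be consistent with the "unsupported Microsoft Visual Studio version" error.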

Do you have a file C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat? If yes, here is what you can try instead of the python setup.py build command:

set DISTUTILS_USE_SDK=1
set MSSdk=1
set CUDA_HOME=%CUDA_PATH%
"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" amd64 && python setup.py build

Before running these commands, don't forget to activate the venv if you reuse the C:\Temp folder: venv\Scripts\activate.
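For those who would rather script it, the same settings can be sketched in Python. This only mirrors the four cmd lines above; `build_environment` and `build_command` are hypothetical names, and I have not run the build this way myself.

```python
from typing import Dict, Mapping


def build_environment(base_env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of base_env with the variables set in the cmd snippet."""
    env = dict(base_env)
    env["DISTUTILS_USE_SDK"] = "1"  # use the environment prepared by vcvarsall
    env["MSSdk"] = "1"
    if "CUDA_PATH" in env:
        env["CUDA_HOME"] = env["CUDA_PATH"]  # same as: set CUDA_HOME=%CUDA_PATH%
    return env


def build_command(vcvarsall: str) -> str:
    """Compose the one-line command: set up MSVC for amd64, then build."""
    return f'"{vcvarsall}" amd64 && python setup.py build'
```

On Windows, with the venv activated, the result could then be passed to `subprocess.run(build_command(path), shell=True, env=build_environment(os.environ), cwd=r"C:\Temp")`.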

Is it better this way?

If you don't have the vcvarsall.bat file in the same location as me, you can find its location by running one of the following commands:

"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere" -version [16.0,17.10) -prerelease -requires Microsoft.VisualStudio.Component.VC.Tools.x86.x64 -property installationPath -products *

or:

"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere" -version [16.0,17.10) -prerelease -requires Microsoft.VisualStudio.Workload.WDExpress -property installationPath -products *

One of these two commands should give you a location. For me, the first one gives me: C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools.

Then append \VC\Auxiliary\Build\vcvarsall.bat to find where your vcvarsall.bat file is (for me C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat).
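The path construction can be sketched in Python as follows. `vcvarsall_path` is a hypothetical helper, and it assumes you pass it the installationPath printed by vswhere.

```python
from pathlib import PureWindowsPath


def vcvarsall_path(installation_path: str) -> PureWindowsPath:
    """Append VC\\Auxiliary\\Build\\vcvarsall.bat to a Visual Studio install path."""
    return PureWindowsPath(installation_path) / "VC" / "Auxiliary" / "Build" / "vcvarsall.bat"


# Installation path reported by the first vswhere command on my machine.
path = vcvarsall_path(r"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools")
```

`PureWindowsPath` only manipulates the string, so this does not check that the file actually exists on disk.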

Thanks for the suggestions, I will try again tomorrow.

Do you have a file C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat? If yes, here is what you can try instead of the python setup.py build command:
set DISTUTILS_USE_SDK=1
set MSSdk=1
set CUDA_HOME=%CUDA_PATH%
"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" amd64 && python setup.py build
Before running these commands, don't forget to activate the venv if you reuse the C:\Temp folder: venv\Scripts\activate.

Is it better this way?

This solution worked: it generated the .pyd file, and the animal images tab in Live Portrait now works.

But when I start animating, it terminates with a CUDA out-of-memory error.

[09:12:57] Load appearance_feature_extractor from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\appearance_feature_extractor.safetensors done. live_portrait_wrapper.py:361
Load motion_extractor from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\motion_extractor.safetensors done. live_portrait_wrapper.py:364
Load warping_module from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\warping_module.safetensors done. live_portrait_wrapper.py:367
[09:12:58] Load spade_generator from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\spade_generator.safetensors done. live_portrait_wrapper.py:370
Load stitching_retargeting_module from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\retargeting_models\stitching_retargeting_module.safetensors done. live_portrait_wrapper.py:374
[09:13:03] FaceAnalysisDIY warmup time: 1.990s face_analysis_diy.py:79
[09:13:04] LandmarkRunner warmup time: 0.980s human_landmark_runner.py:95
Loaded cached embeddings from file.
Traceback (most recent call last):
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\route_utils.py", line 285, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1923, in process_api
    result = await self.call_function(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1508, in call_function
    prediction = await anyio.to_thread.run_sync( # type: ignore
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\utils.py", line 818, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 191, in gpu_wrapped_execute_video_animal
    pipeline = init_gradio_pipeline_animal()
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 162, in init_gradio_pipeline_animal
    gradio_pipeline_animal = GradioPipelineAnimal(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\gradio_pipeline.py", line 649, in __init__
    super().__init__(inference_cfg, crop_cfg)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\live_portrait_pipeline_animal.py", line 49, in __init__
    self.cropper: Cropper = Cropper(crop_cfg=crop_cfg, image_type='animal_face', flag_use_half_precision=inference_cfg.flag_use_half_precision)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\cropper.py", line 84, in __init__
    self.animal_landmark_runner.warmup()
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 137, in warmup
    self.run(img_rgb, 'face', 'face', box_threshold=0.0, IoU_threshold=0.0)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 120, in run
    boxes_filt, keypoints_filt = self.get_unipose_output(image, instance_text_prompt, keypoint_text_prompt, box_threshold, IoU_threshold)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 87, in get_unipose_output
    outputs = self.model(image[None], [target])
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\unipose.py", line 400, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(srcs, masks, input_query_bbox, poss,
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 473, in forward
    hs, references = self.decoder(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 826, in forward
    output = layer(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 1092, in forward
    tgt2 = self.self_attn(q, k, tgt, attn_mask=self_attn_mask)[0]
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\activation.py", line 1266, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\functional.py", line 5470, in multi_head_attention_forward
    attn_output_weights = softmax(attn_output_weights, dim=-1)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\functional.py", line 1885, in softmax
    ret = input.softmax(dim)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 364.00 MiB. GPU

Any solution for this would be helpful. Thanks again for your patience and the clear steps to build the .pyd file.

After downgrading the CUDA version, even human images in LivePortrait generate completely black videos, so I removed LivePortrait and am reverting to the previous NVIDIA driver.

Thanks for the help. I will wait till the issues are fixed.

I don't have access to my computer right now but what you can try is to unload the stable diffusion model from VRAM to free some space. I know there are some options to do this, I'll give them to you as soon as I can get to my computer.

Also, how much VRAM (memory of your graphics card) do you have?

I have 4 GB of VRAM and 12 GB of shared video memory (RAM), out of 24 GB of total RAM. It was working before for human faces, but now, after placing the XPose .pyd file in the lib folder, animal faces give a CUDA out of memory error and human images render completely black videos.

When I reverted to the display driver with CUDA 12.6 (I had installed Stability Matrix with this driver installed), LivePortrait worked again. I copied the generated .pyd file to the lib folder of Unipose/ops, and animating animal images now runs as well, but I get another error, and when I checked the output folder the video was completely black. Human images are working fine.

This is the error I got:

[12:55:32] FaceAnalysisDIY warmup time: 3.827s face_analysis_diy.py:79
[12:55:34] LandmarkRunner warmup time: 2.190s human_landmark_runner.py:95
Loaded cached embeddings from file.
[12:55:41] XPoseRunner warmup time: 4.613s animal_landmark_runner.py:140
Load source image from C:\Users\Creative\AppData\Local\Temp\gradio\tmp3nv_7dtf.png live_portrait_pipeline_animal.py:86
Load from template: C:\Users\Creative\AppData\Local\Temp\gradio\97f6b3a1cd3accff8b6ab0ac244ff2fe26cfbc63\wink.pkl, NOT the video, so the cropping video and audio are both NULL. live_portrait_pipeline_animal.py:97
[12:55:42] The FPS of template: 25 live_portrait_pipeline_animal.py:103
🚀Animating... ---------------------------------------- 100% 0:02:39
Concatenating result... ---------------------------------------- 100% 0:00:00
Writing ---------------------------------------- 100% 0:00:00
Writing ---------------------------------------- 100% 0:00:00
[12:58:29] Animated video: C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\tmp3nv_7dtf--wink.mp4 live_portrait_pipeline_animal.py:236
Animated video with concat: C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\tmp3nv_7dtf--wink_concat.mp4 live_portrait_pipeline_animal.py:237
Traceback (most recent call last):
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\route_utils.py", line 285, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1923, in process_api
    result = await self.call_function(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1508, in call_function
    prediction = await anyio.to_thread.run_sync( # type: ignore
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\utils.py", line 818, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 192, in gpu_wrapped_execute_video_animal
    return pipeline.execute_video(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\gradio_pipeline.py", line 712, in execute_video
    video_path, video_path_concat, video_gif_path = self.execute(self.args)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\live_portrait_pipeline_animal.py", line 240, in execute
    wfp_gif = video2gif(wfp)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\video.py", line 58, in video2gif
    exec_cmd(cmd)
  File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\video.py", line 22, in exec_cmd
    return subprocess.run(cmd, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
  File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command 'ffmpeg -i "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\tmp3nv_7dtf--wink.mp4" -vf "fps=30,scale=256:-1:flags=lanczos,palettegen" "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\palette.png" -y' returned non-zero exit status 1.

The error was due to an old ffmpeg version; I replaced it with a new one and the error went away.
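For anyone hitting the same `CalledProcessError`: the traceback only reports that ffmpeg returned exit status 1, while ffmpeg's own message stays hidden in the captured output. Below is a minimal sketch, assuming Python 3.7+, of how one might print that output and check which ffmpeg is first on PATH. The helper names `run_and_report` and `ffmpeg_version` are hypothetical and not part of the extension.

```python
import shutil
import subprocess
import sys


def run_and_report(cmd):
    """Run a command (given as a list) and return (exit_code, combined output text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so error messages are kept
        text=True,
        errors="replace",  # do not fail on bytes that cannot be decoded
    )
    return proc.returncode, proc.stdout


def ffmpeg_version():
    """Return the first line of `ffmpeg -version`, or None if ffmpeg is not usable."""
    exe = shutil.which("ffmpeg")  # the ffmpeg that a bare `ffmpeg ...` command would resolve to
    if exe is None:
        return None
    code, output = run_and_report([exe, "-version"])
    if code != 0 or not output:
        return None
    return output.splitlines()[0]


if __name__ == "__main__":
    print("ffmpeg on PATH:", shutil.which("ffmpeg"))
    print("version line:", ffmpeg_version())
    # Stand-in for a failing command (hypothetical): its message is shown, not swallowed.
    code, output = run_and_report(
        [sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]
    )
    print("exit code:", code, "| output:", output.strip())
```

If several ffmpeg builds are installed, the path printed first is the one the extension's command would pick up, which can help confirm that the new version is really the one in use.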

It's working without any error, but I get black screen videos for animal faces.

Thanks for all the help.

It finally worked. When I use close-up face images without the "do crop (source)" option, the video is generated without problems. When I use full-body images, it generates a completely black video, both with and without the "do crop (source)" option.

https://github.com/user-attachments/assets/d2c4d6e6-5858-4dff-8a95-319b992964cd

Black videos are usually a sign that you don't have enough VRAM. The original author of Live Portrait seems to say that 4GB (unshared) is the minimum requirement (see here, but that was before they added the animal mode, which was added on August 2, 2024).

Cropping the source or driving image/video requires VRAM, which may explain why it works when cropping is disabled.

What you can try to limit the amount of VRAM used:

Go to the Forge "settings" tab before using LivePortrait and, in the "Other->Actions" section (on the left), click on "Unload all models".
Go to the Forge "settings" tab and, under the "Uncategorized->Live Portrait" section (on the left), select another "Human face detector", MediaPipe for instance, which may use less VRAM. Even in animal mode, the driving video is processed with a "Human face detector", which is precisely why animal mode requires more VRAM than human mode.
Try to launch SD Forge with the --medvram or --lowvram parameter (if not already set), as in the screenshot below:

Try to install the original LivePortrait application and launch the app_animals.py file (outside of Stability Matrix)
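To judge whether any of these steps actually frees memory, it can help to look at VRAM usage before and after. The sketch below assumes an NVIDIA driver that ships the `nvidia-smi` tool and queries it from Python; `parse_memory_line` and `gpu_memory_mib` are hypothetical helper names, and the sample line in the last comment is illustrative, not a measurement from this thread.

```python
import shutil
import subprocess


def parse_memory_line(line):
    """Parse one 'used, free, total' CSV line (values in MiB) into a tuple of ints."""
    used, free, total = (int(part.strip()) for part in line.split(","))
    return used, free, total


def gpu_memory_mib():
    """Return one (used, free, total) tuple per GPU, or [] if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    proc = subprocess.run(
        [
            exe,
            "--query-gpu=memory.used,memory.free,memory.total",
            "--format=csv,noheader,nounits",
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        errors="replace",
    )
    if proc.returncode != 0:
        return []
    results = []
    for line in proc.stdout.splitlines():
        try:
            results.append(parse_memory_line(line))
        except ValueError:
            # Skip lines that are not three integers (for example "[N/A]" fields).
            continue
    return results


if __name__ == "__main__":
    for index, (used, free, total) in enumerate(gpu_memory_mib()):
        print(f"GPU {index}: {used} MiB used, {free} MiB free, {total} MiB total")
    # Illustrative input only: parse_memory_line("1024, 3000, 4096")
```

Running it once with the Stable Diffusion model loaded and once after "Unload all models" gives a rough idea of how much headroom LivePortrait has on a given card.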

Thanks for the suggestions, I will experiment with those settings.

dimitribarbot / sd-webui-live-portrait

forge adaptation #3
