Open CoqueTornado opened 1 month ago
Hi,
Yes, I guess it's because Forge updated its Gradio version to 4.x while SD WebUI is still on 3.x. I have to check this carefully because there is a .name property in 3.x, while in 4.x we have the file name directly. I'll see next week how to do this.
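The 3.x/4.x difference mentioned above (a .name property versus the file name directly) could be absorbed by a small helper. This is only a minimal sketch: the name `uploaded_file_path` is hypothetical and not taken from the extension's code, and the comment does not say which Gradio components are affected.

```python
from pathlib import Path
from typing import Any


def uploaded_file_path(value: Any) -> str:
    """Return a filesystem path for an uploaded-file value.

    Hypothetical helper: accepts either a plain path (the 4.x-style case
    described in the thread) or an object that carries the path in a
    ``.name`` attribute (the 3.x-style case).
    """
    # Check plain paths first: pathlib.Path also has a ``.name`` attribute,
    # but it holds only the final component, not the full path.
    if isinstance(value, (str, Path)):
        return str(value)
    name = getattr(value, "name", None)
    if isinstance(name, str):
        return name
    raise TypeError(f"unsupported upload value: {type(value).__name__}")
```

Calling the helper at every place that reads an uploaded file keeps the version check in one spot instead of scattering it through the callbacks.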
Hi @CoqueTornado,
I've updated the extension code to enable compatibility with Forge by fixing issues with Pydantic and Gradio. You should be able to get these modifications by going to your "Extensions" tab and clicking on "Check for updates" and then "Apply and restart UI".
Is it working for you in Forge now?
I have installed this in my Forge installation and it is working. Thank you for bringing this tool to Forge!
SD forge is currently running on CUDA 12.1
Can you please list the steps? I have been trying to install it for the past 16 hours. I installed the latest Nvidia driver with CUDA 12.6.65 and CUDA Toolkit 12.6.2, and installed VS Build Tools 2019 and 2022, but I still get this error:
"Building of OP file for XPose has failed. Check the log file in the extension's 'logs' folder for more information."
Should I install the same CUDA version as Forge? Kindly explain the procedure. Live Portrait does work for human heads, though.
Should I install the same CUDA version as forge?
Yes, absolutely, otherwise it will not work (only the major version matters though, either 11 or 12). If you look at the sd-forge\extensions\sd-webui-live-portrait\logs\xpose.err.log file, you should see an error message saying this (if not, posting the content of this file here would help me).
Your CUDA_PATH environment variable should be something like C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6 if your Forge CUDA version is v12.6 (see here if you don't know how to change environment variables). You can check that the version of CUDA is correct by opening a cmd window (press Windows+R, then type cmd) and running the nvcc --version command. You should see something like this:
Then, close and restart SD Forge for it to take this new environment variable into account (don't forget this as otherwise SD Forge will still use the old environment variable value if you changed it) and then, in the Live Portrait Animals tab, click on the "Reinstall XPose and Restart WebUI" button.
Please, tell me if this solves your installation issue.
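The major-version comparison described above can also be scripted. The sketch below is an illustration only, with hypothetical function names. It assumes that CUDA_PATH ends in a vMAJOR.MINOR folder, as in the example path, and that nvcc --version prints a "release MAJOR.MINOR" line.

```python
import re
from typing import Optional


def cuda_major_from_nvcc(output: str) -> Optional[int]:
    """Extract the CUDA major version from the text of `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    return int(match.group(1)) if match else None


def cuda_major_from_path(cuda_path: str) -> Optional[int]:
    """Extract the major version from a CUDA_PATH ending in vMAJOR.MINOR."""
    match = re.search(r"[\\/]v(\d+)\.(\d+)[\\/]?$", cuda_path.strip())
    return int(match.group(1)) if match else None


def same_major(a: Optional[int], b: Optional[int]) -> bool:
    """True only when both versions are known and their majors agree."""
    return a is not None and a == b


if __name__ == "__main__":
    import os
    import shutil
    import subprocess

    path_major = cuda_major_from_path(os.environ.get("CUDA_PATH", ""))
    nvcc_major = None
    nvcc = shutil.which("nvcc")  # None when nvcc is not on PATH
    if nvcc:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        nvcc_major = cuda_major_from_nvcc(result.stdout)
    print("CUDA_PATH major:", path_major, "| nvcc major:", nvcc_major)
    print("Majors match:", same_major(path_major, nvcc_major))
```

The script does not detect Forge's own CUDA version. Compare the printed major with the version Forge reports (12.1 earlier in this thread), and run the script from a fresh cmd window so it sees the updated environment variable.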
I used the NVIDIA cleanup tool to remove all NVIDIA drivers and toolkits, then did a fresh install of the CUDA 12.1.0 toolkit with the display driver.
Visual Studio still has the 2019 and 2022 build tools.
This is the xpose.log file (xpose.err.log is empty):
Visual Studio 2019 Developer Command Prompt v16.11.39
Copyright (c) 2021 Microsoft Corporation
[vcvarsall.bat] Environment initialized for: 'x86_x64'
running build
running build_py
creating build
creating build\lib.win-amd64-3.10
creating build\lib.win-amd64-3.10\functions
copying functions\ms_deform_attn_func.py -> build\lib.win-amd64-3.10\functions
copying functions\__init__.py -> build\lib.win-amd64-3.10\functions
creating build\lib.win-amd64-3.10\modules
copying modules\ms_deform_attn.py -> build\lib.win-amd64-3.10\modules
copying modules\ms_deform_attn_key_aware.py -> build\lib.win-amd64-3.10\modules
copying modules\__init__.py -> build\lib.win-amd64-3.10\modules
running build_ext
building 'MultiScaleDeformableAttention' extension
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu
creating C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda
C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST']. warnings.warn( Emitting ninja build file C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/3] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" 
"-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu\ms_deform_attn_cpu.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cpu\ms_deform_attn_cpu.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17 [2/3] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files 
(x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17 FAILED: C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/build/temp.win-amd64-3.10/Release/Users/Creative/AppData/Local/Temp/tmp66gfm476/src/vision.obj cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.cpp /FoC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\vision.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17 C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\torch/csrc/python_headers.h(12): fatal error C1083: Cannot open include file: 'Python.h': No such file or directory [3/3] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IC:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI 
Forge\venv\lib\site-packages\torch\include\torch\csrc\api\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\TH" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\include\THC" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\include" "-IC:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\Scripts\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu -o C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\build\temp.win-amd64-3.10\Release\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.obj -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(261): warning #177-D: variable "q_col" was declared but never 
referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_im2col_gpu_kernel(int, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=double]" at line 946 instantiation of "void ms_deformable_im2col_cuda(cudaStream_t, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=double]" at line 64 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(261): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_im2col_gpu_kernel(int, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=float]" at line 946 instantiation of "void ms_deformable_im2col_cuda(cudaStream_t, const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t ) [with scalar_t=float]" at line 64 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(762): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1001 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(872): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_gm(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1024 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=1U]" at line 1050 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=2U]" at line 1072 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=4U]" at line 1094 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=8U]" at line 1116 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=16U]" at line 1138 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=32U]" at line 1160 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=64U]" at line 1182 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=128U]" at line 1204 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=256U]" at line 1226 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=512U]" at line 1248 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double, blockSize=1024U]" at line 1270 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(544): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1294 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(649): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 1317 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=double]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(762): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1001 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(872): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_gm(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1024 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=1U]" at line 1050 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=2U]" at line 1072 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=4U]" at line 1094 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=8U]" at line 1116 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=16U]" at line 1138 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(331): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=32U]" at line 1160 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=64U]" at line 1182 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=128U]" at line 1204 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=256U]" at line 1226 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=512U]" at line 1248 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(436): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2<scalar_t,blockSize>(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float, blockSize=1024U]" at line 1270 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(544): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1294 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
C:/Users/Creative/AppData/Local/Temp/tmp66gfm476/src\cuda/ms_deform_im2col_cuda.cuh(649): warning #177-D: variable "q_col" was declared but never referenced const int q_col = _temp % num_query; ^ detected during: instantiation of "void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(int, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 1317 instantiation of "void ms_deformable_col2im_cuda(cudaStream_t, const scalar_t , const scalar_t , const int64_t , const int64_t , const scalar_t , const scalar_t , int, int, int, int, int, int, int, scalar_t , scalar_t , scalar_t ) [with scalar_t=float]" at line 134 of C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\src\cuda\ms_deform_attn_cuda.cu
ms_deform_attn_cuda.cu tmpxft_0000219c_00000000-7_ms_deform_attn_cuda.cudafe1.cpp ninja: build stopped: subcommand failed. Traceback (most recent call last): File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2107, in _run_ninja_build subprocess.run( File "subprocess.py", line 526, in run subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Creative\AppData\Local\Temp\tmp66gfm476\setup.py", line 64, in
I am running SD forge through Stability Matrix
Ah ok, I see, it's linked to StabilityMatrix then, which does not ship with the necessary Python files to install XPose (the AI model used for animal live portrait).
I managed to make it work by following this looooooooong procedure! If you're ready to continue, so am I :).
1. Run the python --version command, which should display "Python 3.10.11".
2. Run the nvcc --version command as discussed previously.
3. Create a C:\Temp folder.
4. Copy the content of the C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui-forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\ops folder into C:\Temp (you should have C:\Temp\setup.py, C:\Temp\test.py, some folders, etc.).
5. Open a cmd window and run:
cd C:\Temp
python -m venv venv
venv\Scripts\activate
After this last command, (venv) should have appeared on the left of the command line. Continue with:
pip install torch==2.3.1 torchvision==0.18.1 --extra-index-url https://download.pytorch.org/whl/cu121
This will download PyTorch (2.4 GB), so it may take a while. After a successful install, run:
python setup.py build
This will build the file that StabilityMatrix was not able to build. If this command is successful, you should have a new file such as C:\Temp\build\lib.win-amd64-cpython-310\MultiScaleDeformableAttention.cp310-win_amd64.pyd (the name will depend on your machine configuration, but the extension of this file should be .pyd).
If you managed to create this file, then you're almost finished. You must copy this file back into the C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui-forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\ops\lib folder (create the lib folder if it does not exist).
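The copy step can also be scripted. The following is only a minimal sketch: the helper name copy_built_pyd and its two parameters are hypothetical (not part of the extension), and it assumes the built file keeps the MultiScaleDeformableAttention prefix shown above.

```python
import shutil
from pathlib import Path


def copy_built_pyd(build_dir: Path, ops_dir: Path) -> Path:
    """Copy the first built .pyd found under build_dir into ops_dir/lib.

    build_dir is the folder created by `python setup.py build`
    (C:\\Temp\\build in the procedure above); ops_dir is the extension's
    ...\\XPose\\models\\UniPose\\ops folder. Both are passed in by the caller.
    """
    # The exact file name depends on the machine, so search by prefix and extension.
    candidates = sorted(build_dir.rglob("MultiScaleDeformableAttention*.pyd"))
    if not candidates:
        raise FileNotFoundError(f"no .pyd file found under {build_dir}")

    # Create the lib folder if it does not exist yet.
    lib_dir = ops_dir / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)

    # shutil.copy2 returns the path of the copied file inside lib_dir.
    return Path(shutil.copy2(candidates[0], lib_dir))
```

It would be called with the two folders from the procedure, for example copy_built_pyd(Path(r"C:\Temp\build"), Path(r"...\UniPose\ops")), with the second argument replaced by the full ops path of your installation.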
You can finally relaunch StabilityMatrix and verify that the animal tab is not empty anymore. If all went well, you can safely close your cmd window and remove the C:\Temp folder.
I hope this procedure will work for you!
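The two version checks at the start of the procedure can be done in one go. This is a rough sketch, not something shipped with the extension: the function names are made up for illustration, and it assumes that nvcc --version prints a line containing "release X.Y", which is what current CUDA toolkits do.

```python
import re
import subprocess
import sys


def parse_nvcc_release(nvcc_output: str):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def python_is_3_10(version_info=sys.version_info) -> bool:
    """True when the interpreter is a Python 3.10.x, as the procedure expects."""
    return tuple(version_info[:2]) == (3, 10)


def installed_nvcc_release():
    """Run `nvcc --version`; return None when nvcc is missing or fails."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)
```

Since only the major CUDA version matters here, comparing the first element of installed_nvcc_release() with the CUDA major version of your Forge install (12 for cu121) should be enough.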
I followed every step closely, but the .pyd file still does not compile.
Installed: Python 3.10.11, CUDA 12.1, and torch.
Only the 2019 build tools are installed.
When I run the final step to build the .pyd, it ends with an error like this:
(venv) c:\temp>python setup.py build running build running build_py running build_ext c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py:384: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified warnings.warn(f'Error checking compiler version for {compiler}: {error}') building 'MultiScaleDeformableAttention' extension c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST']. warnings.warn( Emitting ninja build file c:\temp\build\temp.win-amd64-cpython-310\Release\build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/1] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Ic:\temp\src -Ic:\temp\venv\lib\site-packages\torch\include -Ic:\temp\venv\lib\site-packages\torch\include\torch\csrc\api\include -Ic:\temp\venv\lib\site-packages\torch\include\TH -Ic:\temp\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -Ic:\temp\venv\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\include 
-IC:\Users\Creative\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c c:\temp\src\cuda\ms_deform_attn_cuda.cu -o c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 FAILED: c:/temp/build/temp.win-amd64-cpython-310/Release/temp/src/cuda/ms_deform_attn_cuda.obj C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe 
--diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Ic:\temp\src -Ic:\temp\venv\lib\site-packages\torch\include -Ic:\temp\venv\lib\site-packages\torch\include\torch\csrc\api\include -Ic:\temp\venv\lib\site-packages\torch\include\TH -Ic:\temp\venv\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -Ic:\temp\venv\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\include -IC:\Users\Creative\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" -c c:\temp\src\cuda\ms_deform_attn_cuda.cu -o c:\temp\build\temp.win-amd64-cpython-310\Release\temp\src\cuda\ms_deform_attn_cuda.obj -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\crt/host_config.h(153): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! 
The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. ms_deform_attn_cuda.cu ninja: build stopped: subcommand failed. Traceback (most recent call last): File "c:\temp\venv\lib\site-packages\torch\utils\cpp_extension.py", line 2107, in _run_ninja_build subprocess.run( File "C:\Users\Creative\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 526, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\temp\setup.py", line 64, in
(venv) c:\temp>
Thanks for your patience and for trying to help me. I have spent the entire day on this and am still not able to build it.
I will give it a break, though I feel totally dejected that I can't run LivePortrait. I will wait till I find a solution.
Do you have a file C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat? If yes, here is what you can try instead of the python setup.py build command:
set DISTUTILS_USE_SDK=1
set MSSdk=1
set CUDA_HOME=%CUDA_PATH%
"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" amd64 && python setup.py build
Before running these commands, don't forget to activate the venv if you reuse the C:\Temp folder: venv\Scripts\activate.
Is it better this way?
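For anyone who prefers to launch this from Python rather than typing the four commands, here is a minimal sketch of the same sequence. The names build_env, build_command and run_build are hypothetical helpers, and the script is assumed to be started from the activated venv so that "python" resolves to the venv interpreter.

```python
import os
import subprocess


def build_env(base_env):
    """Return a copy of base_env with the variables set by the commands above."""
    env = dict(base_env)
    env["DISTUTILS_USE_SDK"] = "1"
    env["MSSdk"] = "1"
    # Mirrors `set CUDA_HOME=%CUDA_PATH%`; skipped when CUDA_PATH is not defined.
    if "CUDA_PATH" in env:
        env["CUDA_HOME"] = env["CUDA_PATH"]
    return env


def build_command(vcvarsall: str) -> str:
    """Chain vcvarsall.bat (amd64) and the build, exactly as in the cmd line above."""
    return f'"{vcvarsall}" amd64 && python setup.py build'


def run_build(vcvarsall: str, cwd: str = r"C:\Temp") -> int:
    """Run the chained command through cmd.exe and return its exit code."""
    completed = subprocess.run(
        build_command(vcvarsall),
        shell=True,  # on Windows the string is handed to cmd.exe
        cwd=cwd,
        env=build_env(os.environ),
    )
    return completed.returncode
```

The two commands have to stay chained with && in a single shell: vcvarsall.bat only changes the environment of the cmd process it runs in, so a separate subprocess call for the build would not see those changes.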
If your vcvarsall.bat file is not at the same location as mine, you can find it by running one of the following commands:
"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere" -version [16.0,17.10) -prerelease -requires Microsoft.VisualStudio.Component.VC.Tools.x86.x64 -property installationPath -products *
or:
"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere" -version [16.0,17.10) -prerelease -requires Microsoft.VisualStudio.Workload.WDExpress -property installationPath -products *
One of these two commands should give you a location. For me, the first one gives: C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools.
Then append \VC\Auxiliary\Build\vcvarsall.bat to find where your vcvarsall.bat file is (for me C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat).
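The lookup and the "append" step can be combined in a short script. This is only a sketch: the function names are invented for illustration, the vswhere arguments are copied from the first command above, and the vswhere location is assumed to be the default one used in this thread.

```python
import subprocess
from pathlib import PureWindowsPath

# Default vswhere location used in the commands above (assumed, adjust if needed).
VSWHERE = r"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere"


def vcvarsall_path(installation_path: str) -> str:
    """Append \\VC\\Auxiliary\\Build\\vcvarsall.bat to a Visual Studio install path."""
    base = PureWindowsPath(installation_path.strip())
    return str(base / "VC" / "Auxiliary" / "Build" / "vcvarsall.bat")


def find_installation_path(
    component: str = "Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
):
    """Ask vswhere for an installation path; return None if nothing is found."""
    args = [
        VSWHERE, "-version", "[16.0,17.10)", "-prerelease",
        "-requires", component,
        "-property", "installationPath", "-products", "*",
    ]
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return lines[0] if lines else None
```

If the first component returns nothing, calling find_installation_path("Microsoft.VisualStudio.Workload.WDExpress") corresponds to the second command.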
Thanks for the suggestions, I will try again tomorrow.
This solution worked: it generated the .pyd file, and the animal images tab in Live Portrait works.
But when it starts to animate, it terminates with a CUDA out of memory error.
[09:12:57] Load appearance_feature_extractor from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\appearance_feature_extractor.safetensors done. live_portrait_wrapper.py:361
Load motion_extractor from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\motion_extractor.safetensors done. live_portrait_wrapper.py:364
Load warping_module from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\warping_module.safetensors done. live_portrait_wrapper.py:367
[09:12:58] Load spade_generator from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\base_models\spade_generator.safetensors done. live_portrait_wrapper.py:370
Load stitching_retargeting_module from C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\models\liveportrait_animals\retargeting_models\stitching_retargeting_module.safetensors done. live_portrait_wrapper.py:374
[09:13:03] FaceAnalysisDIY warmup time: 1.990s face_analysis_diy.py:79
[09:13:04] LandmarkRunner warmup time: 0.980s human_landmark_runner.py:95
Loaded cached embeddings from file.
Traceback (most recent call last):
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\route_utils.py", line 285, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1923, in process_api
result = await self.call_function(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1508, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\utils.py", line 818, in wrapper
response = f(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 191, in gpu_wrapped_execute_video_animal
pipeline = init_gradio_pipeline_animal()
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 162, in init_gradio_pipeline_animal
gradio_pipeline_animal = GradioPipelineAnimal(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\gradio_pipeline.py", line 649, in __init__
super().__init__(inference_cfg, crop_cfg)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\live_portrait_pipeline_animal.py", line 49, in __init__
self.cropper: Cropper = Cropper(crop_cfg=crop_cfg, image_type='animal_face', flag_use_half_precision=inference_cfg.flag_use_half_precision)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\cropper.py", line 84, in __init__
self.animal_landmark_runner.warmup()
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 137, in warmup
self.run(img_rgb, 'face', 'face', box_threshold=0.0, IoU_threshold=0.0)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 120, in run
boxes_filt, keypoints_filt = self.get_unipose_output(image, instance_text_prompt, keypoint_text_prompt, box_threshold, IoU_threshold)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\animal_landmark_runner.py", line 87, in get_unipose_output
outputs = self.model(image[None], [target])
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\unipose.py", line 400, in forward
hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(srcs, masks, input_query_bbox, poss,
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 473, in forward
hs, references = self.decoder(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 826, in forward
output = layer(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\dependencies\XPose\models\UniPose\deformable_transformer.py", line 1092, in forward
tgt2 = self.self_attn(q, k, tgt, attn_mask=self_attn_mask)[0]
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\modules\activation.py", line 1266, in forward
attn_output, attn_output_weights = F.multi_head_attention_forward(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\functional.py", line 5470, in multi_head_attention_forward
attn_output_weights = softmax(attn_output_weights, dim=-1)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\nn\functional.py", line 1885, in softmax
ret = input.softmax(dim)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 364.00 MiB. GPU
Any solution for this would be helpful. Thanks again for your patience and the clear steps to build the .pyd file.
After downgrading the CUDA version, even human images in LivePortrait generate completely black videos, so I removed LivePortrait and am reverting to the previous NVIDIA driver.
Thanks for the help. I will wait till the issues are fixed.
I don't have access to my computer right now but what you can try is to unload the stable diffusion model from VRAM to free some space. I know there are some options to do this, I'll give them to you as soon as I can get to my computer.
Also, how much VRAM (memory of your graphics card) do you have?
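If you are not sure, PyTorch can report it. The snippet below is just a sketch to run from the Forge venv: to_mib and vram_report are hypothetical helper names, while torch.cuda.is_available and torch.cuda.mem_get_info are existing PyTorch calls.

```python
try:
    import torch
except ImportError:  # torch is only needed for the live query below
    torch = None


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB, the unit used in the out of memory message."""
    return n_bytes / (1024 * 1024)


def vram_report():
    """Return free and total VRAM in MiB, or None without torch or a CUDA GPU."""
    if torch is None or not torch.cuda.is_available():
        return None
    # mem_get_info returns (free_bytes, total_bytes) for the current device.
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return {"free_mib": to_mib(free_bytes), "total_mib": to_mib(total_bytes)}
```

Comparing free_mib with the size in the error message (364.00 MiB in the log above) gives a rough idea of how much has to be freed before the animal pipeline can load.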
I have 4 GB of VRAM and 12 GB of shared video memory (RAM), out of 24 GB of total RAM. It was working before for human faces, but after placing the XPose .pyd file in the lib folder, animal faces give a CUDA out of memory error and human images render completely black videos.
When I reverted to the display driver with CUDA 12.6 (I installed Stability Matrix with this driver installed), LivePortrait works again. I copied the generated .pyd file to the lib folder of UniPose/ops and animating animal images runs as well, but I get another error now: when I checked the output folder, the video was completely black. Human images are working fine.
This is the error I got:
[12:55:32] FaceAnalysisDIY warmup time: 3.827s face_analysis_diy.py:79
[12:55:34] LandmarkRunner warmup time: 2.190s human_landmark_runner.py:95
Loaded cached embeddings from file.
[12:55:41] XPoseRunner warmup time: 4.613s animal_landmark_runner.py:140
Load source image from live_portrait_pipeline_animal.py:86
C:\Users\Creative\AppData\Local\
Temp\gradio\tmp3nv_7dtf.png
Load from template: live_portrait_pipeline_animal.py:97
C:\Users\Creative\AppData\Local\
Temp\gradio\97f6b3a1cd3accff8b6a
b0ac244ff2fe26cfbc63\wink.pkl,
NOT the video, so the cropping
video and audio are both NULL.
[12:55:42] The FPS of template: 25 live_portrait_pipeline_animal.py:103
🚀Animating... ---------------------------------------- 100% 0:02:39
Concatenating result... ---------------------------------------- 100% 0:00:00
Writing ---------------------------------------- 100% 0:00:00
Writing ---------------------------------------- 100% 0:00:00
[12:58:29] Animated video: live_portrait_pipeline_animal.py:236
C:\Users\Creative\AppData\Roami
ng\StabilityMatrix\Packages\Sta
ble Diffusion WebUI
Forge\outputs\live-portrait\202
4-10-27\tmp3nv_7dtf--wink.mp4
Animated video with concat: live_portrait_pipeline_animal.py:237
C:\Users\Creative\AppData\Roami
ng\StabilityMatrix\Packages\Sta
ble Diffusion WebUI
Forge\outputs\live-portrait\202
4-10-27\tmp3nv_7dtf--wink_conca
t.mp4
Traceback (most recent call last):
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\route_utils.py", line 285, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1923, in process_api
result = await self.call_function(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\blocks.py", line 1508, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\gradio\utils.py", line 818, in wrapper
response = f(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\scripts\main.py", line 192, in gpu_wrapped_execute_video_animal
return pipeline.execute_video(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\gradio_pipeline.py", line 712, in execute_video
video_path, video_path_concat, video_gif_path = self.execute(self.args)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\live_portrait_pipeline_animal.py", line 240, in execute
wfp_gif = video2gif(wfp)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\video.py", line 58, in video2gif
exec_cmd(cmd)
File "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\extensions\sd-webui-live-portrait\liveportrait\utils\video.py", line 22, in exec_cmd
return subprocess.run(cmd, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command 'ffmpeg -i "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\tmp3nv_7dtf--wink.mp4" -vf "fps=30,scale=256:-1:flags=lanczos,palettegen" "C:\Users\Creative\AppData\Roaming\StabilityMatrix\Packages\Stable Diffusion WebUI Forge\outputs\live-portrait\2024-10-27\palette.png" -y' returned non-zero exit status 1.
The error was due to an old ffmpeg version. I replaced it with a new one and the error was gone.
It now runs without any error, but I still get black-screen videos for animal faces.
Thanks for all the help.
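For anyone hitting the same `CalledProcessError` from `video2gif`: the traceback shows the extension calling `ffmpeg` through the shell, so whichever `ffmpeg` comes first on PATH is the one that runs. Here is a rough sketch to check which binary that is and what version it reports. The helper names are made up for this example, and it only assumes the usual `ffmpeg version ...` first line printed by `ffmpeg -version`.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(banner):
    """Return the version token from the first line of `ffmpeg -version`, or None."""
    match = re.match(r"ffmpeg version (\S+)", banner)
    return match.group(1) if match else None


def ffmpeg_on_path():
    """Return (path, version) for the ffmpeg a shell command would pick up."""
    path = shutil.which("ffmpeg")
    if path is None:
        return None, None
    banner = subprocess.run(
        [path, "-version"], capture_output=True, text=True
    ).stdout
    return path, parse_ffmpeg_version(banner)


# Usage:
#     print(ffmpeg_on_path())
```

If the path printed is not the ffmpeg you just installed, an older copy earlier on PATH is probably still being used.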
It finally worked. When I use close-up face images without the do crop (source) option, it generates the video without problems. When I use full-body images, it generates a completely black video, with or without the do crop (source) option.
https://github.com/user-attachments/assets/d2c4d6e6-5858-4dff-8a95-319b992964cd
Black videos are usually a sign that you don't have enough VRAM. The original author of Live Portrait seems to say that 4GB (unshared) is the minimum requirement (see here, but that was before they added the animal mode, which was added on August 2, 2024).
Cropping the source or driving image/video requires VRAM, which may explain why it works when cropping is disabled.
What you can try to limit the amount of VRAM used:
app_animals.py file (outside of Stability Matrix)
Thanks for the suggestions, I will experiment with those settings.
With the new Forge, the line
driving_video_pickle_input = gr.File(type="file", file_types=[".pkl"])
needs to change to driving_video_pickle_input = gr.File(file_types=[".pkl"])
Thanks in advance!
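If the same code has to run on both SD WebUI (Gradio 3.x) and Forge (Gradio 4.x), one possible approach is to pick the `gr.File` arguments from the installed Gradio version and to normalise the uploaded value afterwards. This is only a sketch, not the extension's actual fix: `file_component_kwargs` and `uploaded_path` are hypothetical helpers. It relies on the 3.x behaviour mentioned earlier in this thread (an object with a `.name` property) and the 4.x behaviour (the file name directly).

```python
def file_component_kwargs(gradio_version):
    """Build keyword arguments for gr.File that suit the installed major version."""
    major = int(gradio_version.split(".")[0])
    kwargs = {"file_types": [".pkl"]}
    if major < 4:
        # Gradio 3.x: type="file" hands the callback a temp-file object.
        kwargs["type"] = "file"
    # Gradio 4.x: leave `type` unset so the default is used.
    return kwargs


def uploaded_path(value):
    """Return a path from a 4.x string or a 3.x object exposing .name."""
    if value is None:
        return None
    return value if isinstance(value, str) else value.name


# Usage inside the UI code (assumes `import gradio as gr`):
#     driving_video_pickle_input = gr.File(**file_component_kwargs(gr.__version__))
#     ...
#     pkl_path = uploaded_path(file_value_from_callback)
```

The version check keeps a single code path for both UIs, and `uploaded_path` hides the `.name` difference from the rest of the pipeline.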