ONNXRuntimeError when trying to convert images.

steelsteed commented 3 months ago

I was able to build under Windows 11 using Python 3.11.9 in a virtual environment.

To get the installation to work, I had to do the following:

"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" amd64
python -m venv .\unique3d
.\unique3d\Scripts\activate.bat
pip install wheel
set DISTUTILS_USE_SDK=1
install_windows_win_py311_cu121.bat

This got the packages to set up and install correctly. I then used git with LFS to check out the checkpoints and, after running git lfs pull on that, moved the ckpt folder into the same folder as Unique3D as per the instructions.

I am then able to run the UI with python app/gradio_local.py --port 7860 from the activated virtual environment.

However, when I try to convert an image, it gets about halfway and gives me an error: 2024-06-17 13:35:55.9492693 [E:onnxruntime:Default, provider_bridge_ort.cc:1730 onnxruntime::TryGetProviderInfo_TensorRT] C:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "D:\git\Unique3D\unique3d\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_tensorrt.dll"

Note that during the install I answered Y to have it uninstall onnxruntime and install the version it wanted. However, after failing like this, it appears to fall back to CPU mode and then takes forever: it is giving me an ETA of more than 12 hours for the Baby Groot demo image.

` (unique3d) D:\git\Unique3D>python app/gradio_local.py --port 7860 Warning: Unable to load the following plugins:

    filter_embree.dll: filter_embree.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\filter_embree.dll: The specified module could not be found. filter_func.dll: filter_func.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\filter_func.dll: The specified module could not be found. filter_mesh_alpha_wrap.dll: filter_mesh_alpha_wrap.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\filter_mesh_alpha_wrap.dll: The specified module could not be found. filter_mesh_booleans.dll: filter_mesh_booleans.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\filter_mesh_booleans.dll: The specified module could not be found. filter_sketchfab.dll: filter_sketchfab.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\filter_sketchfab.dll: The specified module could not be found. io_3ds.dll: io_3ds.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\io_3ds.dll: The specified module could not be found. io_e57.dll: io_e57.dll does not seem to be a Qt Plugin.

Cannot load library D:\git\Unique3D\unique3d\Lib\site-packages\pymeshlab\lib\plugins\io_e57.dll: The specified module could not be found.

Warning! extra parameter in cli is not verified, may cause erros. Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 12.92it/s] You have disabled the safety checker for <class 'custum_3d_diffusion.custum_pipeline.unifield_pipeline_img2mvimg.StableDiffusionImage2MVCustomPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 . Warning! extra parameter in cli is not verified, may cause erros. D:\git\Unique3D\unique3d\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True. warnings.warn( Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 5018.31it/s] You have disabled the safety checker for <class 'custum_3d_diffusion.custum_pipeline.unifield_pipeline_img2img.StableDiffusionImageCustomPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. 
Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 . D:\git\Unique3D\unique3d\Lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST']. warnings.warn( Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00, 2.48it/s] Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. 
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 5994.72it/s] Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). D:\git\Unique3D\unique3d\Lib\site-packages\diffusers\models\attention_processor.py:1279: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.) hidden_states = F.scaled_dot_product_attention( 0%| | 0/30 [00:00<?, ?it/s]Warning! condition_latents is not None, but self_attn_ref is not enabled! This warning will only be raised once. 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [41:12<00:00, 82.41s/it] D:\git\Unique3D\unique3d\Lib\site-packages\torch\nn\modules\conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ..\aten\src\ATen\native\cudnn\Conv_v8.cpp:919.) return F.conv2d(input, weight, bias, self.stride, 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [04:31<00:00, 27.17s/it] 2024-06-17 13:35:55.9492693 [E:onnxruntime:Default, provider_bridge_ort.cc:1730 onnxruntime::TryGetProviderInfo_TensorRT] C:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "D:\git\Unique3D\unique3d\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_tensorrt.dll"

EP Error EP Error C:\a_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:456 onnxruntime::python::RegisterTensorRTPluginsAsCustomOps Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported. when using [('TensorrtExecutionProvider', {'device_id': 0, 'trt_max_workspace_size': 8589934592, 'trt_fp16_enable': True, 'trt_engine_cache_enable': True}), ('CUDAExecutionProvider', {'device_id': 0, 'arena_extend_strategy': 'kSameAsRequested', 'gpu_mem_limit': 8589934592, 'cudnn_conv_algo_search': 'HEURISTIC'})] Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

7%|██████████▌ | 2/30 [35:05<8:38:59, 1112.11s/it] `

The debug output above shows the error in question. Please advise on how I can resolve the issue so that it can find the right files. I also came across this bug report for a similar error, https://github.com/danielgatis/rembg/issues/312, which seems to show that it fails because the linked CUDA DLLs are not found in the path where it expects them. I am not sure I know enough to resolve this myself, however.

If you have any advice on things I can try, please let me know.

Thanks,

Steelsteed.

jtydhr88 commented 3 months ago

Hi, regarding the EP Error, you need to configure TensorRT on your end; see https://github.com/AiuniAI/Unique3D/issues/15
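
Since LoadLibrary error 126 is the Windows "module could not be found" code, which commonly means a dependency of the DLL being loaded is missing, and since the onnxruntime message asks for the TensorRT libraries to be on PATH, a quick check of which DLLs are actually reachable may help. The following is a minimal sketch, not part of Unique3D; the DLL names in EXAMPLE_DLLS are assumptions and vary with the installed TensorRT, cuDNN, and CUDA versions.

```python
import os
from pathlib import Path


def path_dirs(env=os.environ):
    """Return the non-empty directory entries of PATH from the given environment."""
    return [d for d in env.get("PATH", "").split(os.pathsep) if d]


def locate_dlls(dll_names, search_dirs):
    """Map each DLL name to the first directory hit, or None if not found."""
    found = {}
    for name in dll_names:
        found[name] = None
        for directory in search_dirs:
            candidate = Path(directory) / name
            if candidate.is_file():
                found[name] = str(candidate)
                break
    return found


# Assumed example names only; the real file names depend on the
# TensorRT / cuDNN / CUDA versions installed on the machine.
EXAMPLE_DLLS = ["nvinfer.dll", "cudnn64_8.dll", "cublas64_12.dll"]

for name, location in locate_dlls(EXAMPLE_DLLS, path_dirs()).items():
    print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

Any name reported as not found is a candidate for the missing dependency, although a DLL can also fail to load for other reasons, such as a version mismatch.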

steelsteed commented 3 months ago

Thank you, I will try this out :)

tomyu168 commented 2 months ago

Same error:

(unique3d-python310) F:\Unique3D>python app/gradio_local.py --port 7860 Warning! extra parameter in cli is not verified, may cause erros. Loading pipeline components...: 100%|████████████████████████████████████████████████████| 5/5 [00:00<00:00, 5.32it/s] You have disabled the safety checker for <class 'custum_3d_diffusion.custum_pipeline.unifield_pipeline_img2mvimg.StableDiffusionImage2MVCustomPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 . Warning! extra parameter in cli is not verified, may cause erros. D:\Anaconda3\envs\unique3d-python310\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True. warnings.warn( Loading pipeline components...: 100%|███████████████████████████████████████████████████| 5/5 [00:00<00:00, 557.04it/s] You have disabled the safety checker for <class 'custum_3d_diffusion.custum_pipeline.unifield_pipeline_img2img.StableDiffusionImageCustomPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. 
For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 . D:\Anaconda3\envs\unique3d-python310\lib\site-packages\torch\utils\cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST']. warnings.warn( Loading pipeline components...: 100%|████████████████████████████████████████████████████| 6/6 [00:01<00:00, 3.15it/s] Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference. Loading pipeline components...: 100%|██████████████████████████████████████████████████| 6/6 [00:00<00:00, 2005.24it/s] Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). D:\Anaconda3\envs\unique3d-python310\lib\site-packages\diffusers\models\attention_processor.py:1279: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.) hidden_states = F.scaled_dot_product_attention( 0%| | 0/30 [00:00<?, ?it/s]Warning! condition_latents is not None, but self_attn_ref is not enabled! This warning will only be raised once. 100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:06<00:00, 4.90it/s] 100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00, 1.37s/it] 2024-07-04 20:27:21.5003224 [E:onnxruntime:Default, provider_bridge_ort.cc:1731 onnxruntime::TryGetProviderInfo_TensorRT] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_providers_tensorrt.dll"

EP Error EP Error D:\a_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:456 onnxruntime::python::RegisterTensorRTPluginsAsCustomOps Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported. when using [('TensorrtExecutionProvider', {'device_id': 0, 'trt_max_workspace_size': 8589934592, 'trt_fp16_enable': True, 'trt_engine_cache_enable': True}), ('CUDAExecutionProvider', {'device_id': 0, 'arena_extend_strategy': 'kSameAsRequested', 'gpu_mem_limit': 8589934592, 'cudnn_conv_algo_search': 'HEURISTIC'})] Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.

2024-07-04 20:27:21.6046424 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

Traceback (most recent call last): File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in init self._create_inference_session(providers, provider_options, disabled_optimizers) File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 469, in _create_inference_session self._register_ep_custom_ops(session_options, providers, provider_options, available_providers) File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 516, in _register_ep_custom_ops C.register_tensorrt_plugins_as_custom_ops(session_options, provider_options[i]) RuntimeError: D:\a_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:456 onnxruntime::python::RegisterTensorRTPluginsAsCustomOps Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\gradio\queueing.py", line 541, in process_events response = await route_utils.call_process_api( File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\gradio\route_utils.py", line 276, in call_process_api output = await app.get_blocks().process_api( File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\gradio\blocks.py", line 1928, in process_api result = await self.call_function( File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\gradio\blocks.py", line 1514, in call_function prediction = await anyio.to_thread.run_sync( File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\anyio\to_thread.py", line 56, in run_sync return await get_async_backend().run_sync_in_worker_thread( File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\anyio_backends_asyncio.py", line 2177, in run_sync_in_worker_thread return await future File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\anyio_backends_asyncio.py", line 859, in run result = context.run(func, args) File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\gradio\utils.py", line 833, in wrapper response = f(args, **kwargs) File "F:\Unique3D.\app\gradio_3dgen.py", line 21, in generate3dv2 new_meshes = geo_reconstruct(rgb_pils, None, front_pil, do_refine=do_refine, predict_normal=True, expansion_weight=expansion_weight, init_type=init_type) File "F:\Unique3D.\scripts\multiview_inference.py", line 70, in geo_reconstruct img_list = [front_pil] + run_sr_fast(refined_rgbs[1:]) File "F:\Unique3D.\scripts\refine_lr_to_sr.py", line 39, in run_sr_fast upsampler = RealESRGANer( File "F:\Unique3D.\scripts\upsampler.py", line 48, in init self.model = load_onnx_caller(onnx_path, single_output=True) File "F:\Unique3D.\scripts\load_onnx.py", line 27, in load_onnx_caller ort_session = load_onnx(file_path) File "F:\Unique3D.\scripts\load_onnx.py", line 22, in load_onnx ort_session = 
onnxruntime.InferenceSession(file_path, sess_opt=sess_opt, providers=providers) File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 432, in init raise fallback_error from e File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 427, in init self._create_inference_session(self._fallback_providers, None) File "D:\Anaconda3\envs\unique3d-python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session sess.initialize_session(providers, provider_options, disabled_optimizers) RuntimeError: D:\a_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:891 onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasnt able to be loaded. Please install the correct version of CUDA andcuDNN as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
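
The final error in this traceback says that CUDA_PATH is set but CUDA could not be loaded, and asks for the libraries to be on PATH. One thing that may be worth ruling out is whether the bin folder under CUDA_PATH is on PATH at all. The sketch below assumes the usual Windows toolkit layout, where the CUDA DLLs live in CUDA_PATH\bin; cuDNN may be installed elsewhere and is not covered by this check.

```python
import os


def _norm(path):
    """Normalize a path so that comparisons ignore case and slash style where the OS does."""
    return os.path.normcase(os.path.normpath(path))


def cuda_bin_on_path(env):
    """Return True/False if CUDA_PATH's bin folder is on PATH, or None if CUDA_PATH is unset."""
    cuda_path = env.get("CUDA_PATH")
    if not cuda_path:
        return None
    wanted = _norm(os.path.join(cuda_path, "bin"))
    entries = [_norm(p) for p in env.get("PATH", "").split(os.pathsep) if p]
    return wanted in entries


print("CUDA_PATH:", os.environ.get("CUDA_PATH"))
print("CUDA_PATH bin folder on PATH:", cuda_bin_on_path(os.environ))
```

A result of False would point to a PATH problem; a result of True suggests looking at version mismatches between the installed CUDA or cuDNN and what the onnxruntime-gpu build expects.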

tomyu168 commented 2 months ago

Hi, regarding the EP Error, you need to configure TensorRT on your end; see #15

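The installation guide has a problem to begin with: xformers does not support torch 2.3.1. Running pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu121 makes pip download xformers over and over in a loop because no compatible version exists, which is maddening.

If you first run pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 and then install xformers, it uninstalls torch 2.3.1+cu121 and automatically installs torch 2.3.0+cpu, after which xformers complains that it cannot load CUDA. At that point you have to uninstall torch manually and then run pip install torch==2.3.0+cu121 --index-url https://download.pytorch.org/whl/cu121.

Also, the Tsinghua mirror does not carry ort_nightly_gpu, so installing the requirements gets stuck on it. You have to use this instead: pip install ort_nightly_gpu --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/

I have added all the environment variables for the various versions. It did not help.

After import onnxruntime, onnxruntime.get_available_providers() shows ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'], and onnxruntime.get_device() shows GPU.

I tried both Python 3.11 and 3.10 and neither works. Maybe another day I will uninstall CUDA and try again with cu118? I found that uninstalling onnxruntime-gpu and installing onnxruntime gets it running, but it still throws a pile of errors, and I only managed to generate a mesh once.

Could the author please take a look and see whether there is a solution?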
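
As far as I understand, onnxruntime.get_available_providers() reports what the installed build supports, not which provider libraries load successfully, so seeing all three providers listed does not contradict the LoadLibrary failures above. A more telling check is to compare the requested providers with what a created session reports via session.get_providers(). The helper below is a small sketch of that comparison; it takes plain lists so it can be tried without a model, and the active list in the example is a hypothetical value.

```python
def provider_name(entry):
    """Providers may be given as a bare name or as a (name, options) tuple."""
    return entry if isinstance(entry, str) else entry[0]


def missing_providers(requested, active):
    """Return the requested provider names that the session did not load."""
    return [provider_name(p) for p in requested if provider_name(p) not in active]


# Requested providers, shaped like the list in the EP Error message.
requested = [
    ("TensorrtExecutionProvider", {"device_id": 0}),
    ("CUDAExecutionProvider", {"device_id": 0}),
]

# Hypothetical value; in practice use: active = session.get_providers()
active = ["CUDAExecutionProvider", "CPUExecutionProvider"]

print(missing_providers(requested, active))  # ['TensorrtExecutionProvider']
```

If CUDAExecutionProvider shows up as missing as well, the session is running on the CPU, which would match the very long ETAs reported earlier in this thread.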

wukailu commented 2 months ago

Just run pip install xformers torch==<your current torch version>. It will automatically install the matching xformers and will not reinstall your torch.
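
To avoid typing the version by hand, the pinned command can be built from whatever torch is installed in the active environment. This is only a convenience sketch around the command above; it prints the command for review instead of running it, and the helper names are made up for illustration.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def xformers_install_command(torch_version):
    """Build a pip command that installs xformers while pinning torch to torch_version."""
    return [sys.executable, "-m", "pip", "install",
            "xformers", f"torch=={torch_version}"]


torch_version = installed_version("torch")
if torch_version is None:
    print("torch is not installed in this environment")
else:
    # Printed rather than executed, so the command can be checked first.
    print(" ".join(xformers_install_command(torch_version)))
```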

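ort_nightly_gpu is equivalent to onnxruntime-gpu; it just makes it easier to pin a version. TensorRT is hard to install and needs the correct environment variables configured. Consider removing TensorrtExecutionProvider from the code and keeping only CUDAExecutionProvider. Then you only need to uninstall onnxruntime and install the correct version of onnxruntime-gpu, which is easier to install and only slightly slower.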
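
As a rough illustration of dropping TensorrtExecutionProvider, the sketch below filters a providers list shaped like the one printed in the EP Error message. The actual contents of scripts/load_onnx.py are not shown in this thread, so the list here is reconstructed from the log and the helper name is hypothetical.

```python
def without_provider(providers, unwanted="TensorrtExecutionProvider"):
    """Return providers with every entry named `unwanted` removed.

    Entries may be bare names or (name, options) tuples.
    """
    kept = []
    for entry in providers:
        name = entry if isinstance(entry, str) else entry[0]
        if name != unwanted:
            kept.append(entry)
    return kept


# Provider list as printed in the EP Error message in this thread.
providers = [
    ("TensorrtExecutionProvider", {
        "device_id": 0,
        "trt_max_workspace_size": 8589934592,
        "trt_fp16_enable": True,
        "trt_engine_cache_enable": True,
    }),
    ("CUDAExecutionProvider", {
        "device_id": 0,
        "arena_extend_strategy": "kSameAsRequested",
        "gpu_mem_limit": 8589934592,
        "cudnn_conv_algo_search": "HEURISTIC",
    }),
]

cuda_only = without_provider(providers)
print([entry[0] for entry in cuda_only])  # ['CUDAExecutionProvider']
```

The filtered list can then be passed as the providers argument of onnxruntime.InferenceSession in place of the original one, so that onnxruntime no longer tries to load the TensorRT provider DLL.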

AiuniAI / Unique3D

ONNXRuntimeError when trying to convert images. #31