@andi4191 can you take a look at this as well when you look at the other windows issue?
@noman-anjum-retro: Don't you need to load torchtrt_runtime.dll using python on windows?
@andi4191 I'm not sure how to load it. When I tried to load it via this code:
```python
import torch
import ctypes

hllDll = ctypes.WinDLL("D:/Codes/RetroActivity/experiments/src/exploration/action_recognition/torchtrt_runtime.dll")
print("DLL Loaded")
model = torch.jit.load(r"D:\NewTRTModel.ts")
```
it threw an error:
hllDll = ctypes.WinDLL ("D:/Codes/RetroActivity/experiments/src/exploration/action_recognition/torchtrt_runtime.dll") File "C:\Users\NomanAnjum\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__ self._handle = _dlopen(self._name, mode) FileNotFoundError: Could not find module 'D:\Codes\RetroActivity\experiments\src\exploration\action_recognition\torchtrt_runtime.dll' (or one of its dependencies). Try using the full path with constructor syntax.
Since the path is correct, it's definitely some other issue.
Update: I checked all the dependencies of torchtrt_runtime.dll and found some more libraries that it needs. I got some of them from TensorRT and some from libtorch. Now the error while loading the DLL has changed to:
File "D:\Codes\RetroActivity\experiments\src\exploration\action_recognition\video_play.py", line 198, in get_torch_tensorrt_converted_model hllDll = ctypes.WinDLL(r"D:\Codes\RetroActivity\experiments\src\exploration\action_recognition\torchtrt_runtime.dll") File "C:\Users\NomanAnjum\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__ self._handle = _dlopen(self._name, mode) OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed
@noman-anjum-retro did you get it working? I'm facing the same issue here.
No, still waiting on help from @narendasan or @andi4191.
@noman-anjum-retro: Can you try this? I added an option in torchtrtc to support custom torch ops or torch_tensorrt converters. I used the windows.h header file and LoadLibrary to load the symbol tables from a DLL file.
@andi4191 What's the solution you are suggesting? Should I recompile my TRT module in C++ with this change and try to load it in Python?
Yes, the fix was merged into master, so go ahead and try it and see if it fixes your issue.
Hello @andi4191, no, it didn't help; it produced the same series of errors mentioned above.
Hello @andi4191 @narendasan, I tried loading the compiled TRT module in C++ and Python. In C++ it worked fine in Debug mode, but in C++ Release mode and in Python it threw the following error:
```
RuntimeError:
Unknown type name 'torch.torch.classes.tensorrt.Engine':
  File "code/torch/movinets/models.py", line 4
    parameters = []
    buffers = []
    _torch___movinets_models_MoViNet_trt_engine : torch.torch.classes.tensorrt.Engine
    <--- HERE
  def forward(self_1: torch.movinets.models.MoViNet_trt,
      input_0: Tensor) -> Tensor:
```
However, when I loaded torchtrt_runtime.dll in the C++ Release build with the following code, it worked:
```cpp
HMODULE hLib = LoadLibrary(TEXT("torchtrt_runtime"));
if (hLib == NULL) {
    std::cerr << "Library torchtrt_runtime.dll not found" << std::endl;
    exit(1);
}
```
This makes it clear that running on Windows requires torchtrt_runtime.dll to be loaded. However, when I try to load it in Python with the following code:
```python
torch.ops.load_library("/src/exploration/action_recognition/torchtrt_runtime.dll")
```
or
```python
import ctypes
hllDll = ctypes.WinDLL("/src/exploration/action_recognition/torchtrt_runtime.dll")
```
the library does not get loaded and throws the following error:
hllDll = ctypes.WinDLL ("/src/exploration/action_recognition/torchtrt_runtime.dll") File "C:\Users\NomanAnjum\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__ self._handle = _dlopen(self._name, mode) FileNotFoundError: Could not find module '\src\exploration\action_recognition\torchtrt_runtime.dll' (or one of its dependencies). Try using the full path with constructor syntax.
I then used a dependency checker on torchtrt_runtime.dll and added all dependent DLLs to the same folder, and the error changed to:
File "\src\exploration\action_recognition\video_play.py", line 198, in get_torch_tensorrt_converted_model hllDll = ctypes.WinDLL(r"\src\exploration\action_recognition\torchtrt_runtime.dll") File "C:\Users\NomanAnjum\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__ self._handle = _dlopen(self._name, mode) OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed
I have a strong feeling that if I can load torchtrt_runtime.dll, it will run.
Please help me with this.
It seems to be complaining about FileNotFound.
IIRC, in Windows, the paths are mentioned as:
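A hedged illustration of the usual ways a Windows path is written in Python for ctypes; the concrete path below is purely a placeholder:

```python
import ctypes

# All three forms point at the same (placeholder) file:
dll = ctypes.WinDLL(r"D:\path\to\torchtrt_runtime.dll")    # raw string with backslashes
dll = ctypes.WinDLL("D:\\path\\to\\torchtrt_runtime.dll")  # escaped backslashes
dll = ctypes.WinDLL("D:/path/to/torchtrt_runtime.dll")     # forward slashes also work on Windows
```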
No, it didn't work. Can you please try to load a TRT module on your side? Maybe you'll catch something that I am missing. It's weird that it loads in C++ but not in Python.
I tried loading torchtrt_runtime.dll without importing torch in Python, and it loaded gracefully. Then I tried importing torch, but it failed with the same error [WinError 1114] A dynamic link library (DLL) initialization routine failed. It seems that Python is unable to load both torchtrt_runtime.dll and torch at the same time: whichever is loaded first succeeds, and the second one fails.
When torchtrt_runtime is loaded prior to torch:
```
Traceback (most recent call last):
  File "loadDLL.py", line 5, in <module>
    import torch
  File "C:\Users\NomanAnjum\anaconda3\envs\py37\lib\site-packages\torch\__init__.py", line 129, in <module>
    raise err
OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed. Error loading "C:\Users\NomanAnjum\anaconda3\envs\py37\lib\site-packages\torch\lib\torch_cpu.dll" or one of its dependencies.
```
When torch is imported prior to loading torchtrt_runtime.dll:
```
Traceback (most recent call last):
  File "loadDLL.py", line 7, in <module>
    hllDll = ctypes.CDLL(r"torchtrt_runtime.dll")
  File "C:\Users\NomanAnjum\anaconda3\envs\py37\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed
```
Hi @noman-anjum-retro,
I think the problem is with the mode you are using while loading the symbol tables.
I tried the following and it works:
```python
import ctypes
import torch

handle = ctypes.WinDLL("<Path to torchtrt artifacts>\\torchtrt_runtime.dll", winmode=1)
print(handle)
...
```
For quick reference:
https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa
https://github.com/python/cpython/blob/3.9/Lib/ctypes/__init__.py#L358
Additionally, can you also check whether the torch installed on your machine is compatible with torch-tensorrt?
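A quick way to check the installed torch build against the versions the torchtrt artifacts were built for (only standard torch attributes are used here):

```python
import torch

print(torch.__version__)          # torch build, e.g. 1.11.0+cu113
print(torch.version.cuda)         # CUDA toolkit version torch was built against
print(torch.cuda.is_available())  # confirms the GPU runtime is usable at all
```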
Thanks for the help. I tried it and the .dll got loaded along with torch; however, torch.jit.load() then threw the following error:
```
RuntimeError:
Unknown type name 'torch.torch.classes.tensorrt.Engine':
  File "code/torch/movinets/models.py", line 4
    parameters = []
    buffers = []
    _torch___movinets_models_MoViNet_trt_engine : torch.torch.classes.tensorrt.Engine
    <--- HERE
  def forward(self_1: torch.movinets.models.MoViNet_trt,
      input_0: Tensor) -> Tensor:
```
Then I imported tensorrt in Python, and this error vanished, but now Python quits while loading, without any error or exception:
```python
try:
    torch.jit.load("NewTRTModel843_80.ts")
    print("Success")
except Exception as e:
    print(e)
```
The print statement is not executed, and no exception or error is raised. It happens even without loading the .dll.
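One way to get more information out of a silent interpreter exit is to enable faulthandler before the load, so a crash inside native code (for example inside the runtime DLL) still dumps a traceback; this is only a debugging aid, not a fix:

```python
import faulthandler
faulthandler.enable()  # dump a Python traceback if the process dies in native code

import torch
torch.jit.load("NewTRTModel843_80.ts")
```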
@narendasan @andi4191 any update on this?
Hi all, I'm also trying to make Torch-TensorRT work with Python on Windows. I was able to compile TensorRT from source using CMake and to reproduce the following as posted by @andi4191 (I had to copy over the .dll dependencies as suggested by @noman-anjum-retro):
```python
import ctypes
import torch

handle = ctypes.WinDLL("<Path to torchtrt artifacts>\\torchtrt_runtime.dll", winmode=1)
print(handle)
```
However I'm not sure what I'm supposed to do with the handle returned by WinDLL. My next objective is to call any function from the `torch_tensorrt` module, like `torch_tensorrt.compile`, but `import torch_tensorrt` reports `ModuleNotFoundError: No module named 'torch_tensorrt'` (I guess it's because the module was compiled from source and not installed via `pip`), and `handle.compile` reports `AttributeError: function 'compile' not found`.
Is there any obvious step that I'm missing?
EDIT: I'm not an expert on this, but I guess there should be a custom version of `setup.py` to wrap the .dll into a package that Python can recognize when I run `python setup.py install`. The existing `setup.py` in the source tree is obviously written to work only with Linux; is there a special version of it for the Windows build that I'm not finding? Or am I supposed to write my own?
@narendasan @andi4191 any update on this?
`torch.torch.classes.tensorrt.Engine` stands out as kinda weird; I'd expect this to just be `torch.classes.tensorrt.Engine`. Is it possible to upload the compiled ts module so we can take a look?
> Hi all, I'm also trying to make Torch-TensorRT work with Python on Windows. [...] Is there any obvious step that I'm missing?
So the steps laid out above are for runtime execution of compiled modules in Python, not compiling modules in Python. It assumes you compiled the module using the C++ API or `torchtrtc`, since we haven't had time to get the bindings working for Python on Windows.
I think the best way to do this is not really some post-compilation monkey patching, but overhauling the setup.py to use CMake in addition to Bazel.
You can see the build process for setup.py + bazel here: https://github.com/pytorch/TensorRT/blob/b1db33a06fe6e49004405431678946e9e8248ba8/py/setup.py#L118
Basically the steps are: build the core library, then copy it into the Python package tree, and use that to build the Python bindings.
I would assume that the steps for CMake (and therefore Windows) would be similar.
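A rough sketch of what that could look like on Windows, assuming a CMake-driven core build; every path, target name, and version below is an assumption, not the actual Torch-TensorRT build:

```python
# Hypothetical setup.py fragment: build the core library with CMake, copy the
# resulting DLL into the Python package tree, then let setuptools package it.
import os
import shutil
import subprocess
from setuptools import setup

def build_core_with_cmake(source_dir=".", build_dir="cmake_build"):
    os.makedirs(build_dir, exist_ok=True)
    subprocess.check_call(["cmake", "-S", source_dir, "-B", build_dir])
    subprocess.check_call(["cmake", "--build", build_dir, "--config", "Release"])
    # Copy the built runtime into the package tree (placeholder paths).
    shutil.copy(
        os.path.join(build_dir, "bin", "Release", "torchtrt_runtime.dll"),
        os.path.join("torch_tensorrt", "lib", "torchtrt_runtime.dll"),
    )

build_core_with_cmake()

setup(
    name="torch_tensorrt",
    version="0.0.0",
    packages=["torch_tensorrt"],
    package_data={"torch_tensorrt": ["lib/*.dll"]},
)
```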
Thank you for the hint about using `torchtrtc` to compile the model offline, which got me a step further. Now I seem to be stuck at the same point as @noman-anjum-retro:
```
trt_model = torch.jit.load("efficientnet_b0_traced_trt.ts")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\_serialization.py", line 162, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError:
Unknown type name '__torch__.torch.classes.tensorrt.Engine':
  File "code/__torch__/timm/models/efficientnet.py", line 4
    __parameters__ = []
    __buffers__ = []
    __torch___timm_models_efficientnet_EfficientNet_trt_engine_ : __torch__.torch.classes.tensorrt.Engine
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  def forward(self_1: __torch__.timm.models.efficientnet.EfficientNet_trt,
      input_0: Tensor) -> Tensor:
```
The compiled module is attached: I'm trying to repro this example from NVIDIA. In my case, loading the .dll doesn't seem to have any effect: the error message is exactly the same whether I skip WinDLL entirely or whether I do the following
```python
>>> import ctypes
>>> handle = ctypes.WinDLL("C:\\...\\NVIDIA\\TensorRT\\out\\install\\x64-Release\\bin\\torchtrt_runtime.dll", winmode=1)
>>> print(handle)
<WinDLL 'C:\...\NVIDIA\TensorRT\out\install\x64-Release\bin\torchtrt_runtime.dll', handle 7ffb2fae0000 at 0x1936fca0af0>
```
before trying to load the compiled model in Python.
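One thing that might be worth trying at this point (untested on Windows, so purely a sketch): load the runtime through torch's own loader rather than ctypes, so the TorchScript custom class is registered with the deserializer before jit.load runs; the path below is a placeholder:

```python
import torch

# torch.classes.load_library registers any custom classes the library exports
# (such as the tensorrt Engine class) with the TorchScript runtime.
torch.classes.load_library(r"C:\path\to\torchtrt_runtime.dll")  # placeholder path
trt_model = torch.jit.load("efficientnet_b0_traced_trt.ts")
```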
@narendasan please find the compiled module here. This was compiled with TensorRT 8.4.3.
Opening the compiled modules (they're just zip archives), I can see that the `___torch_mangle_[0-9]+` information is being swallowed up by the `torchtrtc` compiler when running under Windows. Could this be what makes the compiled file unparseable by the loader?
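Since the .ts files are just zip archives, the serialized code can be inspected directly to confirm whether the mangled qualifiers survived; a small sketch (the file name is taken from the example above):

```python
import zipfile

with zipfile.ZipFile("efficientnet_b0_traced_trt.ts") as zf:
    for name in zf.namelist():
        if name.endswith(".py"):              # serialized TorchScript source
            print("====", name)
            print(zf.read(name).decode())
```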
Any success with it, @LukeRoss00?
No, honestly I've given up trying to run this on Windows.
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
❓ Question
I have compiled a torch_trt module using libtorch on the C++ Windows platform. The module works perfectly in C++ for inference; however, I want to use it in a Python program on Windows. How do I load this module in Python?
When I tried to load it with torch.load() or torch.jit.load(), it threw the following error:
```
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py:711, in load(f, map_location, pickle_module, **pickle_load_args)
    707 warnings.warn("'torch.load' received a zip file that looks like a TorchScript archive"
    708               " dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to"
    709               " silence this warning)", UserWarning)
    710 opened_file.seek(orig_position)
--> 711 return torch.jit.load(opened_file)
    712 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
    713 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\jit\_serialization.py:164, in load(f, map_location, _extra_files)
    162     cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
    163 else:
--> 164     cpp_module = torch._C.import_ir_module_from_buffer(
    165         cu, f.read(), map_location, _extra_files
    166     )
    168 # TODO: Pretty sure this approach loses ConstSequential status and such
    169 return wrap_cpp_module(cpp_module)

RuntimeError: Unknown type name '__torch__.torch.classes.tensorrt.Engine':
  File "code/__torch__/movinets/models.py", line 4
    __parameters__ = []
    __buffers__ = []
    __torch___movinets_models_MoViNet_trt_engine_ : __torch__.torch.classes.tensorrt.Engine
```
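For reference, a minimal sketch of the load order that got furthest in this thread (the winmode=1 flag and the extra import tensorrt come from the comments above; paths and file names are placeholders):

```python
import ctypes
import torch
import tensorrt  # importing the TensorRT Python package made the 'Unknown type name' error disappear above

# Load the Torch-TensorRT runtime so the tensorrt Engine custom class is available.
handle = ctypes.WinDLL(r"D:\path\to\torchtrt_runtime.dll", winmode=1)  # placeholder path
print(handle)

# Then deserialize the module that was compiled with the C++ API / torchtrtc.
trt_model = torch.jit.load(r"D:\path\to\NewTRTModel.ts")  # placeholder path
```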