pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

❓ [Question] Building torch_tensorrt.lib on Windows #1014

Closed jonahclarsen closed 2 years ago

jonahclarsen commented 2 years ago

❓ Question

I am wondering how to build the torch_tensorrt.lib on Windows.

What you have already tried

I have followed #960 and #856 (with the same WORKSPACE as the latter) and managed to successfully build torch_tensorrt.dll. However, I need the .lib file in order to compile my Libtorch program. I tried linking to some of the .lib files that were created already (like bazel-out\x64_windows-opt\bin\cpp\torch_tensorrt.lo.lib), but that didn't work. I expect it's a fairly simple bazel command, but I have no idea where to put it.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

Additional context

My libtorch program runs fine even if I include the torch-tensorrt headers, but linking fails with the following errors as soon as I use torch_tensorrt::torchscript::CompileSpec and call torch_tensorrt::torchscript::compile:

Error LNK1120 2 unresolved externals Omkar 1.10.0+cu113 B:\Programming_Current Projects\HelloLibTorch\x64\Release\HelloTorch.exe 1

Error LNK2019 unresolved external symbol "public: __cdecl torch_tensorrt::torchscript::CompileSpec::CompileSpec(class std::vector<class std::vector<__int64,class std::allocator<__int64> >,class std::allocator<class std::vector<__int64,class std::allocator<__int64> > > >)" (??0CompileSpec@torchscript@torch_tensorrt@@QEAA@V?$vector@V?$vector@_JV?$allocator@_J@std@@@std@@V?$allocator@V?$vector@_JV?$allocator@_J@std@@@std@@@2@@std@@@Z) referenced in function main Omkar 1.10.0+cu113 B:\Programming_Current Projects\HelloLibTorch\main.obj 1

Error LNK2019 unresolved external symbol "struct torch::jit::Module __cdecl torch_tensorrt::torchscript::compile(struct torch::jit::Module const &,struct torch_tensorrt::torchscript::CompileSpec)" (?compile@torchscript@torch_tensorrt@@YA?AUModule@jit@torch@@AEBU345@UCompileSpec@12@@Z) referenced in function main Omkar 1.10.0+cu113 B:\Programming_Current Projects\HelloLibTorch\main.obj 1

jonahclarsen commented 2 years ago

I managed to create the .lib file from the .dll following these instructions: https://web.archive.org/web/20140219172454/https://adrianhenke.wordpress.com/2008/12/05/create-lib-file-from-dll/ (adding /MACHINE:X64 to the lib command), but the .dll didn't contain the torchscript namespace and its functions/classes.
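For readers following the same route: the linked approach comes down to listing the DLL's exported names in a module-definition (.def) file and handing that file to lib. As a minimal sketch of the .def generation step only, the helper below builds that text from a list of names. The function name make_def_text and the sample symbol names are hypothetical; in practice the names would come from the DLL's export table (for example as printed by dumpbin /exports).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Torch-TensorRT): builds the text of a
// module-definition (.def) file for a DLL.
// library_name: base name of the DLL, e.g. "torch_tensorrt".
// exports: exported symbol names, possibly decorated/mangled.
std::string make_def_text(const std::string& library_name,
                          const std::vector<std::string>& exports) {
  assert(!library_name.empty() && "a .def file needs a LIBRARY name");
  std::string text = "LIBRARY " + library_name + "\nEXPORTS\n";
  for (const std::string& name : exports) {
    text += "    " + name + "\n";  // one exported symbol per line
  }
  return text;
}
```

The resulting file would then be passed to lib, along the lines of lib /def:torch_tensorrt.def /out:torch_tensorrt.lib /machine:x64. An import library produced this way can only refer to symbols the DLL actually exports, which is consistent with the torchscript namespace being missing from it.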

I found that the torchscript namespace was in cpp_objs\torch_tensorrt\compile_spec.obj, but when I tried turning the .objs in that folder into a .lib (with VS tools' "lib": https://stackoverflow.com/questions/31763558/how-to-build-static-and-dynamic-libraries-from-obj-files-for-visual-c) and linking it in my project, it just created a bunch of new 'unresolved external symbol' linker errors.

Any suggestions on how to build .lib and .dll files for all of torch_tensorrt, including compile_spec, ptq, etc.?

@narendasan perhaps you would have some insight into this? Thanks!

narendasan commented 2 years ago

Are the symbols from the top level namespace available? Like torch_tensorrt::Input? Unfortunately I don't have a ton of experience with windows development.

Perhaps bazel can directly generate the .lib for you. The final library targets in the repo are located at //cpp/lib/BUILD. There is already a dll target, perhaps you could also make a .lib target as well. I think all you would need to do is copy the dll target but change the file type to .lib. Also I was looking at the cc_binary documentation and it seems there is a field called win_def_file

The Windows DEF file to be passed to linker.

This attribute should only be used when Windows is the target platform. It can be used to export symbols during linking a shared library.

Not sure if this would help here.

jonahclarsen commented 2 years ago

@narendasan Even torch_tensorrt::Input throws an 'unresolved external symbol' linker error.

I tried adding a .lib target to cpp/lib/BUILD just under the .dll target, but it doesn't do anything.

There is a 4KB torch_tensorrt.def file in cpp/lib, but when I added it as shown below (I tried with both the .dll and .lib targets):

cc_binary(
    name = "torch_tensorrt.dll",
    srcs = [],
    linkshared = True,
    linkstatic = True,
    deps = [
        "//cpp:torch_tensorrt",
    ],
    win_def_file = "torch_tensorrt.def",
)

cc_binary(
    name = "torch_tensorrt.lib",
    srcs = [],
    linkshared = True,
    linkstatic = True,
    deps = [
        "//cpp:torch_tensorrt",
    ],
    win_def_file = "torch_tensorrt.def",
)

I got the following error:

INFO: Analyzed target //:libtorchtrt (1 packages loaded, 4 targets configured).
INFO: Found 1 target...
ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0/cpp/lib/BUILD:34:10: Linking cpp/lib/torch_tensorrt.dll failed: missing input file '//cpp/lib:torch_tensorrt.def'
Target //:libtorchtrt failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0/cpp/lib/BUILD:34:10 Linking cpp/lib/torch_tensorrt.dll failed: 1 input file(s) do not exist
INFO: Elapsed time: 0.294s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
FAILED: Build did NOT complete successfully

It also wouldn't let me give the full path to the .def file: it threw an error if the filename included a ':', '/', or '\'.

Thanks for your help so far.

narendasan commented 2 years ago

So the lib target will not get included in the package builder target //:libtorchtrt (defined in //BUILD). It's probably best to start by trying to build //cpp/lib:torch_tensorrt.lib before dealing with packaging.

I was reading a bit more on def files and it seems like it's not really necessary, but we might need to change the TORCHTRT_API macro (https://github.com/NVIDIA/Torch-TensorRT/blob/master/cpp/include/torch_tensorrt/macros.h) to add __declspec(dllexport) to the source: https://docs.microsoft.com/en-us/cpp/build/exporting-from-a-dll-using-declspec-dllexport?view=msvc-170

narendasan commented 2 years ago

Re the missing file error, that is because bazel expects a def file in //cpp/lib

jonahclarsen commented 2 years ago

@narendasan there is a def file in //cpp/lib

narendasan commented 2 years ago

Hmm and this is the source //cpp/lib and not like //bazel-torch-tensorrt/cpp/lib? Maybe try clearing the cache?

bazel clean --expunge



jonahclarsen commented 2 years ago

I tried running bazel build //cpp/lib:torch_tensorrt.lib --compilation_mode opt and got the following (with no .lib file generated):

bazel build //cpp/lib:torch_tensorrt.lib --compilation_mode opt
INFO: Analyzed target //cpp/lib:torch_tensorrt.lib (1 packages loaded, 1 target configured).
INFO: Found 1 target...
INFO: From Linking cpp/lib/torch_tensorrt.lib.dll:
LINK : warning LNK4044: unrecognized option '/lpthread'; ignored
LINK : warning LNK4044: unrecognized option '/lpthread'; ignored
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/cpp/lib/torch_tensorrt.lib.if.lib and object bazel-out/x64_windows-opt/bin/cpp/lib/torch_tensorrt.lib.if.exp
Target //cpp/lib:torch_tensorrt.lib up-to-date: bazel-bin/cpp/lib/torch_tensorrt.lib.dll
INFO: Elapsed time: 1.062s, Critical Path: 0.78s
INFO: 9 processes: 8 internal, 1 local.
INFO: Build completed successfully, 9 total actions

I realized it was creating a .lib.dll file, so I changed linkstatic to False for the .lib target. This resulted in this error:

bazel build //cpp/lib:torch_tensorrt.lib --compilation_mode opt
INFO: Analyzed target //cpp/lib:torch_tensorrt.lib (1 packages loaded, 1 target configured).
INFO: Found 1 target...
ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0/core/util/BUILD:72:11: Linking core/util/trt_util_09159dd4ae.dll failed: (Exit 1120): link.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/link.exe @bazel-out/x64_windows-opt/bin/core/util/trt_util_09159dd4ae.dll-2.params
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/core/util/trt_util.if.lib and object bazel-out/x64_windows-opt/bin/core/util/trt_util.if.exp
trt_util.obj : error LNK2019: unresolved external symbol "public: void __cdecl torch_tensorrt::core::util::logging::TorchTRTLogger::log(enum torch_tensorrt::core::util::logging::LogLevel,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >)" (?log@TorchTRTLogger@logging@util@core@torch_tensorrt@@QEAAXW4LogLevel@2345@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z) referenced in function "class nvinfer1::Dims32 __cdecl torch_tensorrt::core::util::toDimsPad(class c10::ArrayRef<__int64>,unsigned __int64)" (?toDimsPad@util@core@torch_tensorrt@@YA?AVDims32@nvinfer1@@V?$ArrayRef@_J@c10@@_K@Z)
trt_util.obj : error LNK2019: unresolved external symbol "class torch_tensorrt::core::util::logging::TorchTRTLogger & __cdecl torch_tensorrt::core::util::logging::get_logger(void)" (?get_logger@logging@util@core@torch_tensorrt@@YAAEAVTorchTRTLogger@1234@XZ) referenced in function "class nvinfer1::Dims32 __cdecl torch_tensorrt::core::util::toDimsPad(class c10::ArrayRef<__int64>,unsigned __int64)" (?toDimsPad@util@core@torch_tensorrt@@YA?AVDims32@nvinfer1@@V?$ArrayRef@_J@c10@@_K@Z)
trt_util.obj : error LNK2019: unresolved external symbol "public: __cdecl torch_tensorrt::Error::Error(char const *,unsigned int,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,void const *)" (??0Error@torch_tensorrt@@QEAA@PEBDIAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PEBX@Z) referenced in function "enum nvinfer1::DataType __cdecl torch_tensorrt::core::util::ScalarTypeToTRTDataType(enum c10::ScalarType)" (?ScalarTypeToTRTDataType@util@core@torch_tensorrt@@YA?AW4DataType@nvinfer1@@W4ScalarType@c10@@@Z)
bazel-out\x64_windows-opt\bin\core\util\trt_util_09159dd4ae.dll : fatal error LNK1120: 3 unresolved externals
Target //cpp/lib:torch_tensorrt.lib failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0/cpp/lib/BUILD:44:10 Linking cpp/lib/torch_tensorrt.lib.dll failed: (Exit 1120): link.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/link.exe @bazel-out/x64_windows-opt/bin/core/util/trt_util_09159dd4ae.dll-2.params
INFO: Elapsed time: 0.346s, Critical Path: 0.08s
INFO: 8 processes: 8 internal.
FAILED: Build did NOT complete successfully

jonahclarsen commented 2 years ago

Yes, this is //cpp/lib. I just tried clearing the cache; the build ran almost to the end and then gave the same error, this time ending in:

INFO: Elapsed time: 59.844s, Critical Path: 17.10s INFO: 762 processes: 677 internal, 85 local. FAILED: Build did NOT complete successfully

jonahclarsen commented 2 years ago

I also modified line 11 of macros.h to #define TORCHTRT_API __attribute__((__visibility__("default"))) __declspec(dllexport) and was able to build the .dll successfully, as before. The .dll is the same size, though, so I'm not sure it made any difference.

narendasan commented 2 years ago

Should it be all one line, or should it be its own MSVC case for TORCHTRT_API?



jonahclarsen commented 2 years ago

I'm not sure what you mean, if I define TORCHTRT_API again then won't it just replace the existing attribute definition?

jonahclarsen commented 2 years ago

I tried it on a new line as so:

#define TORCHTRT_API __attribute__((__visibility__("default")))
#define TORCHTRT_API __declspec(dllexport)

And was again able to build the .dll successfully (it was the same size as before), but still not the .lib file.
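On the earlier question of whether a second definition replaces the first: it does. The preprocessor does not combine two #define lines for the same name; whichever definition is in effect last is used from that point on. A small sketch with a hypothetical macro SKETCH_API and placeholder tokens (not the real attributes) makes this visible by stringifying the expansion:

```cpp
// Hypothetical macro and placeholder tokens, used only to show preprocessor
// behaviour; these are not the real TORCHTRT_API definitions.
#define SKETCH_API gcc_style_export
#undef SKETCH_API                     // without this, compilers diagnose the redefinition
#define SKETCH_API msvc_style_export  // from here on, only this definition is in effect

// Two-step stringification so SKETCH_API is expanded before being quoted.
#define SKETCH_STR_IMPL(x) #x
#define SKETCH_STR(x) SKETCH_STR_IMPL(x)

// The text SKETCH_API expands to at this point in the file.
constexpr const char* kSketchApiExpansion = SKETCH_STR(SKETCH_API);

static_assert(kSketchApiExpansion[0] == 'm',
              "only the later definition survives");
```

So with the two lines above, a GCC-style attribute on the first line would simply be discarded in favour of whatever the second line says.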

narendasan commented 2 years ago

I think something more along the lines of this https://ms-iot.github.io/ROSOnWindows/Porting/SymbolVisibility.html

Since I believe the __attribute__ flags are mostly for GCC.
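A common shape for this kind of macro is one branch per compiler, roughly as sketched below. This is only an illustration under assumptions: SKETCH_API and SKETCH_BUILDING_DLL are hypothetical names, and this is not the content of the real macros.h.

```cpp
// Sketch only: SKETCH_API and SKETCH_BUILDING_DLL are hypothetical names.
// SKETCH_BUILDING_DLL is assumed to be set by the build when compiling the
// DLL itself; it is defined here so the sketch always takes the export branch.
#define SKETCH_BUILDING_DLL 1

#if defined(_MSC_VER)
  #if defined(SKETCH_BUILDING_DLL)
    #define SKETCH_API __declspec(dllexport)  // building the DLL: export
  #else
    #define SKETCH_API __declspec(dllimport)  // consuming the DLL: import
  #endif
#else
  #define SKETCH_API __attribute__((visibility("default")))  // GCC/Clang
#endif

namespace sketch {
// The annotation only affects declarations it is attached to.
SKETCH_API int doubled(int x);
SKETCH_API int doubled(int x) { return 2 * x; }
}  // namespace sketch
```

The point of the split is that the GCC-style attribute and __declspec are never both applied: each compiler sees only the spelling it understands. Functions that do not carry the macro are unaffected by changing its definition.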

jonahclarsen commented 2 years ago

Okay, that was a helpful lead. I added __declspec(dllexport) (using a DLLExport macro to make it a little easier) to the declaration and definition of the three functions that were displaying the linking errors (for each of their overloads), and those errors went away, but I got a similar error with reference to toDims() in trt_util, so I did the same thing there for the two overloads. This produced a similar error with another 4 unresolved externals, including two functions from the same file as the original three errors (TorchTRTLogger).

I am going to keep going down this pattern but I am starting to think this will just be a game of catch-up where I have to go add __declspec(dllexport) before functions for hours. Is there a better way to go about this?

Perhaps I'm misunderstanding it, but it seems the reason #define TORCHTRT_API __declspec(dllexport) isn't helping is that the TORCHTRT_API macro isn't already before these function definitions/declarations.

Update - after adding it to those two functions, we get 21 unresolved externals. CMake's WINDOWS_EXPORT_ALL_SYMBOLS from the article you linked seems like it would be useful if we were using CMake.

jonahclarsen commented 2 years ago

I found a similar command in bazel, and tried executing bazel build //cpp/lib:torch_tensorrt.lib --compilation_mode opt --features=windows_export_all_symbols. I got the following error:

bazel build //cpp/lib:torch_tensorrt.lib --compilation_mode opt
INFO: Build option --features has changed, discarding analysis cache.
INFO: Analyzed target //cpp/lib:torch_tensorrt.lib (0 packages loaded, 2559 targets configured).
INFO: Found 1 target...
INFO: From Linking core/util/logging/logging_b704e7774d.dll:
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/core/util/logging/logging.if.lib and object bazel-out/x64_windows-opt/bin/core/util/logging/logging.if.exp
INFO: From Linking core/util/exception_09159dd4ae.dll:
Creating library bazel-out/x64_windows-opt/bin/core/util/exception.if.lib and object bazel-out/x64_windows-opt/bin/core/util/exception.if.exp
INFO: From Linking core/util/trt_util_09159dd4ae.dll:
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/core/util/trt_util.if.lib and object bazel-out/x64_windows-opt/bin/core/util/trt_util.if.exp
ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0-attempt2/core/conversion/conversionctx/BUILD:10:11: Linking core/conversion/conversionctx/conversionctx_6f33a490f5.dll failed: (Exit 1120): link.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/link.exe ... (remaining 1 argument(s) skipped)
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/core/conversion/conversionctx/conversionctx.if.lib and object bazel-out/x64_windows-opt/bin/core/conversion/conversionctx/conversionctx.if.exp
ConversionCtx.obj : error LNK2019: unresolved external symbol createInferBuilder_INTERNAL referenced in function "public: __cdecl torch_tensorrt::core::conversion::ConversionCtx::ConversionCtx(struct torch_tensorrt::core::conversion::BuilderSettings)" (??0ConversionCtx@conversion@core@torch_tensorrt@@QEAA@UBuilderSettings@123@@Z)
ConversionCtx.obj : error LNK2019: unresolved external symbol "public: __cdecl torch_tensorrt::core::util::logging::TorchTRTLogger::TorchTRTLogger(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,enum nvinfer1::ILogger::Severity,bool)" (??0TorchTRTLogger@logging@util@core@torch_tensorrt@@QEAA@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@W4Severity@ILogger@nvinfer1@@_N@Z) referenced in function "public: __cdecl torch_tensorrt::core::conversion::ConversionCtx::ConversionCtx(struct torch_tensorrt::core::conversion::BuilderSettings)" (??0ConversionCtx@conversion@core@torch_tensorrt@@QEAA@UBuilderSettings@123@@Z)
bazel-out\x64_windows-opt\bin\core\conversion\conversionctx\conversionctx_6f33a490f5.dll : fatal error LNK1120: 2 unresolved externals
Target //cpp/lib:torch_tensorrt.lib failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.911s, Critical Path: 0.52s
INFO: 33 processes: 30 internal, 3 local.
FAILED: Build did NOT complete successfully

This seems like it might be related to this discussion: https://github.com/bazelbuild/bazel/issues/11622

jonahclarsen commented 2 years ago

I managed to fix the TorchTRTLogger errors too by again adding DllExport. It seems that features=windows_export_all_symbols (like CMake's WINDOWS_EXPORT_ALL_SYMBOLS) doesn't actually export all symbols.

However, I then got 21 unresolved externals, which was all the functions in core/lowering/passes except UnpackBatchNorm. Added DllExport to those.

After that, I got this error related to linking nvinfer, which I noticed came up in #226 as well:

ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0-attempt2/core/conversion/conversionctx/BUILD:10:11: Linking core/conversion/conversionctx/conversionctx_6f33a490f5.dll failed: (Exit 1120): link.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/link.exe ... (remaining 1 argument(s) skipped)
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-opt/bin/core/conversion/conversionctx/conversionctx.if.lib and object bazel-out/x64_windows-opt/bin/core/conversion/conversionctx/conversionctx.if.exp
ConversionCtx.obj : error LNK2019: unresolved external symbol createInferBuilder_INTERNAL referenced in function "public: __cdecl torch_tensorrt::core::conversion::ConversionCtx::ConversionCtx(struct torch_tensorrt::core::conversion::BuilderSettings)" (??0ConversionCtx@conversion@core@torch_tensorrt@@QEAA@UBuilderSettings@123@@Z)
bazel-out\x64_windows-opt\bin\core\conversion\conversionctx\conversionctx_6f33a490f5.dll : fatal error LNK1120: 1 unresolved externals
Target //cpp/lib:torch_tensorrt.lib failed to build

Resolved that issue by adding "nvinfer_static_lib", after line 87 in third_party/tensorrt/local/BUILD.

Then I went through a series of linker errors resolved by adding DllExport on all overloads for each declaration and definition:

Then:

And again:

However, when I added DllExport to the LogLevel overload so it was on both overloads, I got this ambiguous call error:

ERROR: C:/users/jonah/downloads/torch-tensorrt-1.0.0-attempt2/core/util/logging/BUILD:10:11: Compiling core/util/logging/TorchTRTLogger.cpp failed: (Exit 2): cl.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX64/x64/cl.exe /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0601 /D_CRT_SECURE_NO_DEPRECATE ... (remaining 47 argument(s) skipped)
cl : Command line warning D9002 : ignoring unknown option '-fdiagnostics-color=always'
cl : Command line warning D9002 : ignoring unknown option '-std=c++14'
.\core/util/logging/TorchTRTLogger.h(23): error C2668: 'torch_tensorrt::core::util::logging::TorchTRTLogger::TorchTRTLogger': ambiguous call to overloaded function
.\core/util/logging/TorchTRTLogger.h(26): note: could be 'torch_tensorrt::core::util::logging::TorchTRTLogger::TorchTRTLogger(std::string,torch_tensorrt::core::util::logging::LogLevel,bool)'
.\core/util/logging/TorchTRTLogger.h(25): note: or 'torch_tensorrt::core::util::logging::TorchTRTLogger::TorchTRTLogger(std::string,nvinfer1::ILogger::Severity,bool)'
.\core/util/logging/TorchTRTLogger.h(42): note: while trying to match the argument list '()'
.\core/util/logging/TorchTRTLogger.h(42): note: This diagnostic occurred in the compiler generated function 'void torch_tensorrt::core::util::logging::TorchTRTLogger::__dflt_ctor_closure(void)'
.\core/util/logging/TorchTRTLogger.h(42): note: see reference to function 'void torch_tensorrt::core::util::logging::TorchTRTLogger::__dflt_ctor_closure(void)'
Target //cpp/lib:torch_tensorrt.lib failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.887s, Critical Path: 0.63s
INFO: 17 processes: 17 internal.
FAILED: Build did NOT complete successfully

I tried removing DllExport from the Severity overload and having it just on the LogLevel one, but unfortunately I still get the ambiguous call error whether it's on the LogLevel one or both (and still get the 'unresolved external symbol error' when I apply DllExport to only the Severity overload).

I honestly don't remember what happened after this, but the ambiguous call error went away and was replaced by another unresolved external error, this time for core::util::toDimsPad, which I applied the same DllExport solution to. This brought back the TorchTRTLogger unresolved external error as well as one for core::plugins::impl::TorchTRTPluginRegistry::TorchTRTPluginRegistry, which I again resolved with DllExport (on line 18 of core/plugins/register_plugins.cpp).

I then got another unresolved external error for core::conversion::converters::tensor_to_const (in core/conversion/converters/converter_util.cpp/h), and resolved it in the same way. Then I got 'unresolved external symbol initLibNvInferPlugins referenced in function "public: __cdecl torch_tensorrt::core::plugins::impl::TorchTRTPluginRegistry::TorchTRTPluginRegistry(void)"' and again the same error as before for the LogLevel overload of TorchTRTLogger.

The PluginRegistry error seems to be a problem with linking nvinferplugins. I tried to resolve this by adding "@tensorrt//:nvinferplugin", after line 49 in cpp/lib/BUILD and then resolving another error by removing DllExport from tensor_to_const in converter_util.h/cpp, but the nvinfer-plugin error persisted. I tried linking to nvinfer-plugin by adding "nvinferplugin", after line 89 in third_party/tensorrt/local/BUILD, which resulted in recursive references, which I resolved by commenting out line 309 "nvinfer",.

This didn't seem to fix the problem, as I was left again with unresolved external symbol errors for initLibNvInferPlugins referenced in core::plugins::impl::TorchTRTPluginRegistry::TorchTRTPluginRegistry and for core::util::logging::TorchTRTLogger::TorchTRTLogger (LogLevel overload).

I decided to deal with the overloads of the TorchTRTLogger constructor creating the ambiguous call error. This is occurring because a compiler-generated function calls the constructor with default parameters, and since both constructors can be called with all arguments defaulted, the compiler can't tell which one to use. I managed to find a solution that I actually think is ideal since LogLevel is basically just a wrapper for Severity, involving one constructor and overloads for other functions in the class: https://gist.github.com/jonahclarsen/c3cc6b08271d28042963c2d2bca65931 (it still needs cleaning up).
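To illustrate the general idea (this is not the content of the gist, just a minimal sketch under assumptions): if only one constructor exists and the wrapper enum is accepted through overloads of ordinary member functions, a call with no arguments has exactly one candidate. The enums below are simplified stand-ins for nvinfer1::ILogger::Severity and LogLevel, and Logger, to_severity and set_level are hypothetical names.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Simplified stand-ins for nvinfer1::ILogger::Severity and the library's
// LogLevel; the enumerators are assumptions made for this sketch.
enum class Severity { kERROR, kWARNING, kINFO, kVERBOSE };
enum class LogLevel { kERROR, kWARNING, kINFO, kDEBUG };

// Maps the wrapper enum onto the underlying severity.
constexpr Severity to_severity(LogLevel lvl) {
  return lvl == LogLevel::kERROR     ? Severity::kERROR
         : lvl == LogLevel::kWARNING ? Severity::kWARNING
         : lvl == LogLevel::kINFO    ? Severity::kINFO
                                     : Severity::kVERBOSE;
}

class Logger {
 public:
  // The only constructor, so calling it with no arguments is unambiguous.
  Logger(std::string prefix = "[sketch] - ",
         Severity level = Severity::kWARNING, bool color = true)
      : prefix_(std::move(prefix)), level_(level), color_(color) {}

  // LogLevel is accepted through overloads of a normal member function.
  void set_level(Severity level) { level_ = level; }
  void set_level(LogLevel level) { level_ = to_severity(level); }

  Severity level() const { return level_; }
  bool color() const { return color_; }
  const std::string& prefix() const { return prefix_; }

 private:
  std::string prefix_;
  Severity level_;
  bool color_;
};

// Would fail to compile if two fully defaulted constructors were present.
static_assert(std::is_default_constructible<Logger>::value,
              "exactly one constructor matches an empty argument list");

}  // namespace sketch
```

With two constructors that both default every parameter, the static_assert above would fail, which is the same ambiguity cl.exe reported for the generated default-constructor closure.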

Now I am left again with the unresolved external symbol initLibNvInferPlugins error. I tried again to resolve that by changing line 94 in third_party/tensorrt/local/BUILD from ":windows": ["@cuda//:cublas"], to ":windows": ["@cuda//:cublas", "nvinferplugin"],, as well as a lot of other combinations in plugins/tensorrt/local/BUILD including those from #690, but I can't seem to figure out how to solve it.

Any help on this would be greatly appreciated.

narendasan commented 2 years ago

Which target is missing initLibNvInferPlugins? Is it //core/plugins? It should be already declared as a target in the build file in that directory. Not sure if you need to do anything special for extern functions in windows. (https://github.com/NVIDIA/TensorRT/blob/main/include/NvInferPlugin.h)

narendasan commented 2 years ago

Also, do you have a branch you can share, even if it's dirty, that we can look at?

jonahclarsen commented 2 years ago

I am not sure how to create a branch. However, I am abandoning building with Bazel (at least for now) because I was able to finally get the binaries created with CMake thanks to @gcuendet at #1007 (although I still haven't been able to use them in my Libtorch project without getting errors).

However, if you or anyone else wants to pursue building with Bazel further, here are the files/folders I modified on top of TRT-1.0.0: Torch-TensorRT-1.0.0-attempt2.zip