nietras opened 1 year ago

ClangSharp / libclang

Walkthrough

Create a simple console application in, for example, a Tester directory:

dotnet new console

Add a package reference to the ClangSharp package so the csproj looks like:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net7.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="ClangSharp" Version="16.0.0" />
</ItemGroup>
</Project>
Run dotnet restore -verbosity:detailed > restore.txt on the project. Verbosity is set high to be able to check what happens; nothing of note shows up there. Look in the .nuget package cache to see what is downloaded:
"C:\Users\<USERNAME>\.nuget\packages\clangsharp"
"C:\Users\<USERNAME>\.nuget\packages\clangsharp.interop"
"C:\Users\<USERNAME>\.nuget\packages\libclang"
"C:\Users\<USERNAME>\.nuget\packages\libclangsharp"
What's interesting here is that no RID-specific packages appear to be downloaded (yet).
The ClangSharp package has a nuspec file with:
<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
<metadata minClientVersion="4.3">
<id>ClangSharp</id>
<version>16.0.0</version>
<authors>.NET Foundation and Contributors</authors>
<requireLicenseAcceptance>true</requireLicenseAcceptance>
<license type="expression">MIT</license>
<licenseUrl>https://licenses.nuget.org/MIT</licenseUrl>
<projectUrl>https://github.com/dotnet/clangsharp/</projectUrl>
<description>ClangSharp are strongly-typed safe Clang bindings written in C# for .NET and Mono, tested on Linux and Windows.</description>
<copyright>Copyright © .NET Foundation and Contributors</copyright>
<repository type="git" url="https://github.com/dotnet/clangsharp/" commit="1c5588c84a5d22d2ddab41dbf7854667bf722332" />
<dependencies>
<group targetFramework="net6.0">
<dependency id="ClangSharp.Interop" version="16.0.0" exclude="Build,Analyzers" />
</group>
<group targetFramework="net7.0">
<dependency id="ClangSharp.Interop" version="16.0.0" exclude="Build,Analyzers" />
</group>
<group targetFramework=".NETStandard2.0">
<dependency id="ClangSharp.Interop" version="16.0.0" exclude="Build,Analyzers" />
</group>
</dependencies>
</metadata>
</package>
Skipping over the interop package and looking at libclang, its nuspec has:
<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://schemas.microsoft.com/packaging/2013/01/nuspec.xsd">
<metadata minClientVersion="2.12">
<id>libclang</id>
<version>16.0.6</version>
<authors>.NET Foundation and Contributors</authors>
<owners>.NET Foundation and Contributors</owners>
<requireLicenseAcceptance>true</requireLicenseAcceptance>
<license type="expression">Apache-2.0 WITH LLVM-exception</license>
<licenseUrl>https://licenses.nuget.org/Apache-2.0%20WITH%20LLVM-exception</licenseUrl>
<projectUrl>https://github.com/dotnet/clangsharp</projectUrl>
<description>Multi-platform native library for libclang.</description>
<copyright>Copyright © LLVM Project</copyright>
<repository type="git" url="https://github.com/llvm/llvm-project" branch="llvmorg-16.0.6" />
<dependencies>
<group targetFramework=".NETStandard2.0" />
</dependencies>
</metadata>
</package>
That's interesting given it has no dependencies and contains no libraries:
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\.nupkg.metadata"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\.signature.p7s"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\libclang.16.0.6.nupkg"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\libclang.16.0.6.nupkg.sha512"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\libclang.nuspec"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\LICENSE.TXT"
"C:\Users\<USERNAME>\.nuget\packages\libclang\16.0.6\runtime.json"
But what's in the runtime.json file?
{
"runtimes": {
"linux-arm64": {
"libclang": {
"libclang.runtime.linux-arm64": "16.0.6"
}
},
"linux-x64": {
"libclang": {
"libclang.runtime.linux-x64": "16.0.6"
}
},
"osx-arm64": {
"libclang": {
"libclang.runtime.osx-arm64": "16.0.6"
}
},
"osx-x64": {
"libclang": {
"libclang.runtime.osx-x64": "16.0.6"
}
},
"win-arm64": {
"libclang": {
"libclang.runtime.win-arm64": "16.0.6"
}
},
"win-x64": {
"libclang": {
"libclang.runtime.win-x64": "16.0.6"
}
},
"win-x86": {
"libclang": {
"libclang.runtime.win-x86": "16.0.6"
}
}
}
}
Ah, that appears to map RIDs to runtime-specific packages. But none were downloaded, so what happens when we build the project? Run dotnet build -verbosity:detailed > build.txt on the project. Examining the build output and the .nuget cache, none of those runtime-specific packages appear to be downloaded (yet). Let's try running the project with some dummy code in Program.cs:
using ClangSharp.Interop;
using var index = CXIndex.Create();
It runs, but still no runtime-specific packages are downloaded nor any native libraries present in the build output. Let's try a more involved example copied from a unit test in ClangSharp:
// https://github.com/dotnet/ClangSharp/blob/main/tests/ClangSharp.UnitTests/CXTranslationUnitTest.cs
using ClangSharp.Interop;
using static ClangSharp.Interop.CXTranslationUnit_Flags;
var name = "basic";
var dir = Path.GetRandomFileName();
_ = Directory.CreateDirectory(dir);
try
{
// Create a file with the right name
var file = new FileInfo(Path.Combine(dir, name + ".c"));
File.WriteAllText(file.FullName, "int main() { return 0; }");
using var index = CXIndex.Create();
using var translationUnit = CXTranslationUnit.Parse(
index, file.FullName, Array.Empty<string>(),
Array.Empty<CXUnsavedFile>(), CXTranslationUnit_None);
var clangFile = translationUnit.GetFile(file.FullName);
}
finally
{
Directory.Delete(dir, true);
}
This runs fine. But still no runtime-specific packages are downloaded nor any native libraries present in the build output. Let's try running the code in Visual Studio with native debugging enabled, that is, add launch settings with "nativeDebugging": true. This is just a quick way to look at which native libraries are loaded and from where; there are many ways of doing that, Visual Studio is simply quick and easy.
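A minimal Properties/launchSettings.json for this could look something like the sketch below (the profile name Tester is just an assumption matching the project name):

{
  "profiles": {
    "Tester": {
      "commandName": "Project",
      "nativeDebugging": true
    }
  }
}

In the Debug window one can then see: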
(Win32): Loaded '\bin\Debug\net7.0\ClangSharp.Interop.dll'.
(CoreCLR: clrhost): Loaded '\bin\Debug\net7.0\ClangSharp.Interop.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
(Win32): Loaded 'C:\Program Files\LLVM\bin\libclang.dll'. Module was built without symbols.
Ah, it turns out I have LLVM with clang installed 🤷. So this must be found via the PATH environment variable, which it turns out contains C:\Program Files\LLVM\bin. Let's try removing that entry and restarting all consoles and applications in use.
Running the example program again will then fail with an exception:
System.DllNotFoundException: 'Unable to load DLL 'libclang' or one of its dependencies:
The specified module could not be found. (0x8007007E)'
Hmm, so the libclang native library is not available and the package is not downloaded automatically? How does runtime.json work then?
Let's try running the application with a runtime identifier defined:
dotnet run -r win-x64 > run.txt
This takes a while, and the only output is:
C:\Program Files\dotnet\sdk\7.0.400-preview.23274.1\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(1142,5):
warning NETSDK1179: One of '--self-contained' or '--no-self-contained' options are required when '--runtime' is used.
[Tester.csproj]
Tester\4lzfbeoi.214\basic.c
but the program runs fine. Looking in .nuget we can see the runtime-specific packages have actually been downloaded now:
"C:\Users\<USERNAME>\.nuget\packages\libclangsharp.runtime.win-x64"
"C:\Users\<USERNAME>\.nuget\packages\libclang.runtime.win-x64"
So does this mean we cannot actually run the application without specifying a runtime identifier? That seems problematic if we want to use this as a framework-dependent AnyCPU application... in fact, if we run the application from Visual Studio again it will fail with the same exception as before.
Use tree /F to see the files in the bin output, which shows all the native libraries related to libclang for win-x64 (and others).
├───bin
│   └───Debug
│       └───net7.0
│           │   ClangSharp.dll
│           │   ClangSharp.Interop.dll
│           │   Tester.deps.json
│           │   Tester.dll
│           │   Tester.exe
│           │   Tester.pdb
│           │   Tester.runtimeconfig.json
│           │
│           ├───egfakait.om3
│           │       basic.c
│           │
│           └───win-x64
│                   ClangSharp.dll
│                   ClangSharp.Interop.dll
│                   clretwrc.dll
│                   clrgc.dll
│                   clrjit.dll
│                   coreclr.dll
│                   createdump.exe
│                   hostfxr.dll
│                   hostpolicy.dll
│                   libclang.dll
│                   libClangSharp.dll
│                   Microsoft.CSharp.dll
│                   Microsoft.DiaSymReader.Native.amd64.dll
│                   Microsoft.VisualBasic.Core.dll
│                   Microsoft.VisualBasic.dll
│                   Microsoft.Win32.Primitives.dll
│                   Microsoft.Win32.Registry.dll
│                   mscordaccore.dll
│                   mscordaccore_amd64_amd64_7.0.523.17405.dll
│                   mscordbi.dll
│                   mscorlib.dll
│                   mscorrc.dll
│                   msquic.dll
│                   Tester.deps.json
│                   Tester.dll
│                   Tester.exe
│                   Tester.pdb
│                   Tester.runtimeconfig.json
│                   netstandard.dll
│                   // Almost all System.* dlls follow here
│                   System.*.dll
Note how this has an exe under the specific runtime folder with all the dlls next to it.
As far as I can tell this means the runtime.json way of mapping runtime-identifier-specific packages only works if you define a hard-coded, specific runtime identifier in the program you want to run. Which is incredibly annoying if you want to build and deploy runtime-agnostic applications, e.g. if we wanted to deploy a win-x86 + win-x64 single exe. How is that supposed to work then? Am I getting this wrong?
Let's try a hack: adding the RID-specific package to the project, that is, add
<PackageReference Include="libclang.runtime.win-x64" Version="16.0.6" />
to the project. Run it from VS and now it runs fine. Right, so in some ways this works fine if we add the RID-specific packages explicitly.
Still, how does this work with regards to testing, e.g. if you use MSTest for both x86 and x64 testing? Let's add a unit test project, reference the Tester console project, and copy the code from the Program.cs example above (itself taken from a ClangSharp unit test) into this project. Now run the unit test with Processor Architecture for AnyCPU Projects set to Auto. If we change this to x86 it will fail with the same exception as before:
System.DllNotFoundException: Unable to load DLL 'libclang' or one of its dependencies:
The specified module could not be found. (0x8007007E)
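For reference, the copied test is essentially the Program.cs code from above wrapped in an MSTest method; a minimal sketch (class and method names here are my own) could look like:

using System;
using System.IO;
using ClangSharp.Interop;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using static ClangSharp.Interop.CXTranslationUnit_Flags;

[TestClass]
public class LibClangTests
{
    [TestMethod]
    public void CanParseBasicTranslationUnit()
    {
        var dir = Directory.CreateDirectory(Path.GetRandomFileName());
        try
        {
            // Write a trivial C file and parse it; the point is simply that
            // libclang can be resolved and loaded for the current architecture.
            var file = Path.Combine(dir.FullName, "basic.c");
            File.WriteAllText(file, "int main() { return 0; }");

            using var index = CXIndex.Create();
            using var translationUnit = CXTranslationUnit.Parse(
                index, file, Array.Empty<string>(),
                Array.Empty<CXUnsavedFile>(), CXTranslationUnit_None);
            _ = translationUnit.GetFile(file);
        }
        finally
        {
            dir.Delete(recursive: true);
        }
    }
}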
Interestingly, in the output we will get:
*****IMPORTANT*****
Failed to resolve libclang.
If you are running as a dotnet tool, you may need to manually copy the appropriate DLLs
from NuGet due to limitations in the dotnet tool support.
Please see https://github.com/dotnet/clangsharp for more details.
*****IMPORTANT*****
Note that the RID is win10-x86 in this case, if logged e.g. with log(RuntimeInformation.RuntimeIdentifier). If we select x64 it is win10-x64 and the test succeeds, but only because we added the RID-specific libclang.runtime.win-x64 package to the project.
In https://github.com/dotnet/ClangSharp/issues/118#issuecomment-598305888 this issue is expanded upon in a comment by Tanner Gooding:

The simple fix for now is to add
<RuntimeIdentifier Condition="'$(RuntimeIdentifier)' == '' AND '$(PackAsTool)' != 'true'">$(NETCoreSdkRuntimeIdentifier)</RuntimeIdentifier>
to your project (under a PropertyGroup). Unfortunately, because of the way NuGet restore works, we can't just add this to a build/*.targets in the ClangSharp nuget package. The issue is essentially that libclang and libClangSharp just contain a runtime.json file which points to the real packages. This was done to avoid users needing to download hundreds of megabytes just to consume ClangSharp (when they most often only need one of the native binaries). You can see some more details on the sizes here: #46 (comment), noting that that is the size of the compressed NuGet.
I had thought this was working for dev scenarios where the RID wasn't specified, but it apparently isn't. I'll log an issue on NuGet to see if this is something that can be improved.
I wonder whether this actually works for the case of switching processor architecture in VS or similar? Let's try adding it to the unit test project and removing the RID-specific package from the console project. Hence we have the console project:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net7.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="ClangSharp" Version="16.0.0" />
</ItemGroup>
</Project>
and unit test project:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net7.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<IsPackable>false</IsPackable>
<RuntimeIdentifier Condition="'$(RuntimeIdentifier)' == '' AND '$(PackAsTool)' != 'true'">$(NETCoreSdkRuntimeIdentifier)</RuntimeIdentifier>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.3.2" />
<PackageReference Include="MSTest.TestAdapter" Version="2.2.10" />
<PackageReference Include="MSTest.TestFramework" Version="2.2.10" />
<PackageReference Include="coverlet.collector" Version="3.1.2" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\Tester\Tester.csproj" />
</ItemGroup>
</Project>
The first time you then try to build this, you will get a well-known error:
Assets file 'TesterUnitTests\obj\project.assets.json' doesn't have a target for 'net7.0/win-x64'.
Ensure that restore has run and that you have included 'net7.0' in the TargetFrameworks for your project.
You may also need to include 'win-x64' in your project's RuntimeIdentifiers.
So restore and build again. Let's try running the x86 unit tests in VS. This succeeds, but the RID is actually now win10-x64, so we can now no longer run or debug x86 tests from Visual Studio?
Let's first try to define test running via a script, test-x86-x64.ps1:
#!/usr/bin/env pwsh
Write-Host "Testing Debug X86"
dotnet test --nologo -c Debug -- RunConfiguration.TargetPlatform=x86
Write-Host "Testing Release X86"
dotnet test --nologo -c Release -- RunConfiguration.TargetPlatform=x86
Write-Host "Testing Debug X64"
dotnet test --nologo -c Debug -- RunConfiguration.TargetPlatform=x64
Write-Host "Testing Release X64"
dotnet test --nologo -c Release -- RunConfiguration.TargetPlatform=x64
For x86 this will then fail with:
Test run detected DLL(s) which would use different framework and platform versions. Following DLL(s) do not match current settings, which are .NETCoreApp,Version=v7.0 framework and X86 platform.
TesterUnitTests.dll would use Framework .NETCoreApp,Version=v7.0 and Platform X64.
Again, this isn't great. We need to be able to run both x64 and x86 without having to jump through hoops.
Perhaps if we add both win-x64 and win-x86 to a RuntimeIdentifiers property instead? So change
<RuntimeIdentifier Condition="'$(RuntimeIdentifier)' == '' AND '$(PackAsTool)' != 'true'">$(NETCoreSdkRuntimeIdentifier)</RuntimeIdentifier>
to
<RuntimeIdentifiers>win-x64;win-x86</RuntimeIdentifiers>
and then run test-x86-x64.ps1. Now everything fails with the same exception:
System.DllNotFoundException: Unable to load DLL 'libclang' or one of its dependencies:
The specified module could not be found. (0x8007007E)
According to https://learn.microsoft.com/en-us/dotnet/core/project-sdk/msbuild-props#runtimeidentifiers I should have defined the RIDs correctly. An example from there is:
<PropertyGroup>
<RuntimeIdentifiers>win10-x64;osx.10.11-x64;ubuntu.16.04-x64</RuntimeIdentifiers>
</PropertyGroup>
Okay, perhaps running tests then needs to be done differently and not with the RunConfiguration.TargetPlatform property? Let's try running the tests with --runtime instead in a new script, test-x86-x64-rid.ps1:
#!/usr/bin/env pwsh
Write-Host "Testing Debug win-x86"
dotnet test --nologo -c Debug --runtime win-x86
Write-Host "Testing Release win-x86"
dotnet test --nologo -c Release --runtime win-x86
Write-Host "Testing Debug win-x64"
dotnet test --nologo -c Debug --runtime win-x64
Write-Host "Testing Release win-x64"
dotnet test --nologo -c Release --runtime win-x64
Then the tests succeed, albeit with the annoying warnings below.
C:\Program Files\dotnet\sdk\7.0.400-preview.23274.1\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(1142,5):
warning NETSDK1179: One of '--self-contained' or '--no-self-contained' options are required when '--runtime' is used. [TesterUnitTests\TesterUnitTests.csproj]
C:\Program Files\dotnet\sdk\7.0.400-preview.23274.1\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Sdk.targets(1142,5):
warning NETSDK1179: One of '--self-contained' or '--no-self-contained' options are required when '--runtime' is used. [Tester.csproj]
Why do I need to specify whether to be self-contained or not when I am just running tests? I am not publishing.
And are the tests really running x86 as expected? To test this I add two simple tests:
[TestMethod]
public void X86() => Assert.AreEqual("win10-x86", RuntimeInformation.RuntimeIdentifier);
[TestMethod]
public void X64() => Assert.AreEqual("win10-x64", RuntimeInformation.RuntimeIdentifier);
and run the tests again. On win-x86 the X64 test fails as expected:
Assert.AreEqual failed. Expected:<win10-x64>. Actual:<win10-x86>.
and vice versa on win-x64:
Assert.AreEqual failed. Expected:<win10-x86>. Actual:<win10-x64>.
so at least that works as expected.
Let's try running these tests from Visual Studio again, first by setting the processor architecture to x86. All tests except X86 fail, so this does switch the runtime identifier to win10-x86, but it does not fix the libclang problem:
System.DllNotFoundException: Unable to load DLL 'libclang' or one of its dependencies:
The specified module could not be found. (0x8007007E)
So even though RIDs are now specified, this doesn't work when running tests from VS? Switching to x64 in VS, only the X64 test passes and the libclang dll still cannot be found, so now this doesn't work either. The difference apparently being that there are now multiple RIDs, not just one.
The only way I think this can be resolved, then, is to explicitly add those RID-specific runtime packages after all, so the console project looks like:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net7.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="libtorch-cuda-11.7-win-x64" Version="2.0.1.1" />-->
<PackageReference Include="ClangSharp" Version="16.0.0" />
<PackageReference Include="libclang.runtime.win-x64" Version="16.0.6" />
<PackageReference Include="libclang.runtime.win-x86" Version="16.0.6" />
</ItemGroup>
</Project>
Re-running the unit tests, libclang can now be loaded and that test succeeds. Trying the command line too, it's the same.
So after all this, it seems like the runtime.json way of packaging native libraries has its set of challenges: you basically end up having to explicitly add the RID-specific packages anyway if you target multiple RIDs. In the process you then end up implicitly forcing the AnyCPU build to no longer be framework-dependent but self-contained? This is all very confusing and hard to understand, and not least to convey to other developers.
This entire space has a large number of issues and there isn't any good or "official" way to do things. Even runtime.json is itself a largely undocumented feature.
ClangSharp is doing it the way it is primarily because of NuGet package size limits, but also because no one wants to download a single 256MB or larger package when they only need a 32MB subset of it.
Multiple issues, many of which you linked to in the OP, exist that track the general problem space.
Thank you for filing the issue. I hope that the volume of these issues will make us do something about the native dependencies packaging scenario. I agree that the current experience is very poor.
the runtime.json trick does not appear to work when running unit tests from inside Visual Studio
This looks like a bug to me. https://github.com/microsoft/vstest/ would be a better place to discuss this specific issue.
I read somewhere (can't find or remember where) that for .NET 8 it is considered to force a specific RID on build?
https://github.com/dotnet/sdk/issues/23540 is the main tracking issue for this change in the default behavior. The change was mentioned in .NET blog posts where you have probably seen it. It addresses the confusing coupling of RID-specific and self-contained that you have touched on.
@mhutch We have discussed the poor experience of using NuGet to distribute native dependencies some time ago. Do you have any updates that you can share?
@jkotas @tannergooding quick question: when using packages defined with runtimes/<RID>/native, the native libraries are copied to this folder in the output (on build, not publish). To then actually be able to run and have these libraries (or their dependencies) be loaded, it appears I have to manually add this directory to the PATH environment variable (before these are loaded), does that seem right?
It does not appear to be needed for libclang, so this might be specific to the library I am using. So the question might be: does .NET in any way set up or ensure the runtimes/<RID>/native directory is added to the path or dll directories (e.g. Set/AddDllDirectory)? Note that AddDllDirectory did not actually work for this library; PATH was the only thing I could get working. And it's when it tries to load its dependencies and so on.
When I use NativeLibrary.Load with DllImportSearchPath.SafeDirectories, then AddDllDirectory works, but I am not in charge of loading the native library here. (CORRECTION: AddDllDirectory is not needed when manually loading via NativeLibrary.Load.) In any case, it seems one has to do something here, and my issue is then that I have to "guess" which "RIDs" to add to PATH. I can't just use RuntimeInformation.RuntimeIdentifier directly (it is win10-x64, but the library is in win-x64), and there is some clear guidance saying one should not try to parse or break up the runtime identifier oneself. But as I said, I am not in charge of loading these library dependencies; they are loaded from the initial native library. How is one then to ensure the correct RIDs are added to the path or similar?
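To illustrate the part that does work, here is a minimal sketch under those assumptions (the library name NativeLibA and the runtimes directory are placeholders for the actual dependency):

using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

static class ManualNativeLoad
{
    [DllImport("kernel32", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    static extern IntPtr AddDllDirectory(string newDirectory);

    public static void Load()
    {
        // Placeholder: wherever the RID-specific native libraries end up in the output.
        var nativeDir = Path.Combine(AppContext.BaseDirectory, "runtimes", "win-x64", "native");
        AddDllDirectory(nativeDir);

        // SafeDirectories includes directories registered via AddDllDirectory,
        // so this explicit load finds the library...
        NativeLibrary.Load("NativeLibA", Assembly.GetExecutingAssembly(),
            DllImportSearchPath.SafeDirectories);
        // ...but libraries that NativeLibA itself loads later do not go through this path.
    }
}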
If there are multiple native libraries that depend on each other, it is up to them to make that work. Linking these libraries with the correct /DEPENDENTLOADFLAG is the best option. DllImportSearchPath.UseDllDirectoryForDependencies and AddDllDirectory in the calling code work too. I would stay away from modifying PATH to make this work.
How is one to then ensure the correct RIDs are added to path or similar?
You can use the list of paths from AppDomain.CurrentDomain.GetData("NATIVE_DLL_SEARCH_DIRECTORIES") to get the list of directories where native libraries are located.
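For example, a small sketch of reading that property; it is a single string of probing paths delimited by Path.PathSeparator:

using System;
using System.IO;

var raw = AppDomain.CurrentDomain.GetData("NATIVE_DLL_SEARCH_DIRECTORIES") as string;
var directories = raw?.Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries)
                  ?? Array.Empty<string>();

foreach (var directory in directories)
{
    // These are the directories .NET itself probes for native libraries,
    // e.g. candidates to pass to AddDllDirectory instead of modifying PATH.
    Console.WriteLine(directory);
}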
@jkotas thanks for the replies.
Linking these libraries with the correct /DEPENDENTLOADFLAG is the best option.
I am not the builder of these libraries, merely the packager. Hence, I have no control over linker options or library behavior.
DllImportSearchPath.UseDllDirectoryForDependencies and AddDllDirectory
I am not directly in charge of loading these libraries and would very much like to avoid being so. The situation is somewhat like the diagram below, where native libs are in runtimes/RID/native and the RID could be different. The problem is that NtvLibA loads dependent libraries, perhaps manually; I have no control over this.
flowchart LR
A[MgdExe] -->|Uses| B[MgdLib]
B -->|P/Invoke| C(NtvLibA)
C -->|LoadLibrary| D[NtvLibB]
Since AddDllDirectory does not appear to cascade, this does not work for this case (without manually loading the native dlls myself). SetDllDirectory does, but it only allows one directory and has the well-known issues of overriding previous calls.
Hence, as far as I can tell, I am left with just one option: changing PATH. There are some mentions of changing the app manifest, but I could not find any resource on how this could work for these native libraries; if that is an option I'm happy to hear about it. But that would then also be a Windows-only solution.
Actually, AddDllDirectory combined with SetDefaultDllDirectories(LOAD_LIBRARY_SEARCH_DEFAULT_DIRS) appears to work (I had some issues with a manifest getting in the way). Perhaps the best option so far.
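Concretely, something along these lines, called once at startup before any native library is loaded (a sketch; the directory list is whatever your layout uses, and AddDllDirectory wants absolute paths):

using System;
using System.IO;
using System.Runtime.InteropServices;

static class DllDirectories
{
    const uint LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000;

    [DllImport("kernel32", ExactSpelling = true, SetLastError = true)]
    static extern bool SetDefaultDllDirectories(uint directoryFlags);

    [DllImport("kernel32", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    static extern IntPtr AddDllDirectory(string newDirectory);

    public static void Setup(params string[] nativeDirectories)
    {
        // Default dirs = application directory + System32 + user dirs added below.
        SetDefaultDllDirectories(LOAD_LIBRARY_SEARCH_DEFAULT_DIRS);
        foreach (var directory in nativeDirectories)
        {
            AddDllDirectory(Path.GetFullPath(directory));
        }
    }
}

Usage would be e.g. DllDirectories.Setup("x64") or the runtimes\win-x64\native directory, depending on the layout.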
For a published win-x64 WPF app, for example, we still have an issue around the WPF native dlls that have been moved to a sub-directory: they have to be manually loaded at program start or the app will crash with DllNotFoundException. This is despite having set up AddDllDirectory and SetDefaultDllDirectories, and I am wondering why that is? (Perhaps a bit off topic, but it relates to native library packaging and dependencies.)
// Try manually loading the .NET WPF native library dependencies
TryManuallyLoad("vcruntime140_cor3");
TryManuallyLoad("wpfgfx_cor3");
TryManuallyLoad("PresentationNative_cor3");
TryManuallyLoad("D3DCompiler_47_cor3");
Sorry to spam and bother you guys again, but I have yet another issue that I am scratching my head over. In the application (WPF - .NET 6) where this is to be used, we also have a dependency on an old .NET Fx assembly that is situated in the GAC. We find and load this via fusion i.e. we load
var fusionFullPath = Environment.Is64BitProcess
? @"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\fusion.dll"
: @"C:\Windows\Microsoft.NET\Framework\v4.0.30319\fusion.dll";
NativeLibrary.Load(fusionFullPath);
then use something like:
/// <summary>
/// Gets an assembly path from the GAC given a partial name.
/// </summary>
/// <param name="name">An assembly partial name. May not be null.</param>
/// <returns>
/// The assembly path if found; otherwise null;
/// </returns>
public static string GetAssemblyPath(string name)
{
if (name == null)
{ throw new ArgumentNullException(nameof(name)); }
var hr = CreateAssemblyCache(out var assemblyCache, 0);
if (hr >= 0)
{
var assemblyInfo = new AssemblyInfo();
assemblyInfo.cchBuf = 1024; // should be fine...
assemblyInfo.currentAssemblyPath = new string('\0', assemblyInfo.cchBuf);
hr = assemblyCache.QueryAssemblyInfo(0, name, ref assemblyInfo);
if (hr >= 0)
{
return assemblyInfo.currentAssemblyPath;
}
}
return null;
}
[ComImport, InterfaceType(ComInterfaceType.InterfaceIsIUnknown), Guid("e707dcde-d1cd-11d2-bab9-00c04f8eceae")]
interface IAssemblyCache
{
void Reserved0();
[PreserveSig]
int QueryAssemblyInfo(int flags, [MarshalAs(UnmanagedType.LPWStr)] string assemblyName, ref AssemblyInfo assemblyInfo);
}
[StructLayout(LayoutKind.Sequential)]
struct AssemblyInfo
{
public int cbAssemblyInfo;
public int assemblyFlags;
public long assemblySizeInKB;
[MarshalAs(UnmanagedType.LPWStr)]
public string currentAssemblyPath;
public int cchBuf; // size of path buf.
}
// On .NET 5+ we get the following:
// System.DllNotFoundException: Unable to load DLL 'fusion.dll' or one of its dependencies: The specified module could not be found. (0x8007007E)
// https://github.com/dotnet/core/issues/3048
// To fix this we use NativeLibrary.Load in the static constructor above.
[DllImport("fusion.dll")]
static extern int CreateAssemblyCache(out IAssemblyCache ppAsmCache, int reserved);
to find the path of that .NET assembly. We then hook AppDomain.CurrentDomain.AssemblyResolve to handle loading it. This works fine and has worked without issue.
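The resolve handler is essentially the following (a sketch; GetAssemblyPath is the GAC lookup shown above):

using System;
using System.Reflection;

AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
{
    // args.Name is the full display name; the GAC lookup wants the simple name.
    var simpleName = new AssemblyName(args.Name).Name;
    var path = simpleName != null ? GetAssemblyPath(simpleName) : null;
    return path != null ? Assembly.LoadFrom(path) : null;
};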
But after using SetDefaultDllDirectories(LOAD_LIBRARY_SEARCH_DEFAULT_DIRS) and AddDllDirectory this no longer works and the GAC assembly can no longer be loaded; it just fails. Going back to changing the PATH environment variable, it works. Why is that?
This is all very involved, but such is the real-world of industrial computer vision/AI where we have a lot of dependencies that are often out of our control. External code might be old, might be mixed-mode assemblies and so on.
@jkotas it seems to me that NATIVE_DLL_SEARCH_DIRECTORIES is not used when an app is published as self-contained, is that correct?
I still have not found a solution to the above. The problem is that calling SetDefaultDllDirectories completely disrupts the normal dynamic-link library search order, which then also means that directories added to PATH won't be searched. Since the mixed-mode assembly is a third-party library that puts its native dependencies in multiple directories found via the PATH environment variable, this then means it cannot be loaded.
PATH cannot be used because, for some reason, some Windows installations have one of the native libraries in C:\Windows\system32! And that is searched before PATH (this is onnxruntime.dll).
Hence, I am stuck trying to find a good solution. SetDllDirectory would be good, since this is added before C:\Windows\system32 and still uses the normal search order and hence propagates down to PATH if not found before. But it only allows one directory, is global, and so on. This is what we have used for years, but now with nuget packages that ship native libraries per runtimes/<RID>/native this means more than one directory when framework-dependent.
Using NATIVE_DLL_SEARCH_DIRECTORIES only works for native libs loaded by .NET and only works if not published/self-contained. This still leaves loading transitive native libraries.
The more I read, the more confused I get here. deps.json could be an option, but the docs are not understandable to me.
it seems to me that NATIVE_DLL_SEARCH_DIRECTORIES is not used when an app is published as self-contained is that correct?
That is not correct. NATIVE_DLL_SEARCH_DIRECTORIES
is used for self-contained apps.
Self-contained apps are always RID specific. Portable self-contained apps do not exist.
RID specific apps (including self-contained RID specific apps) should have everything in the same directory. They should not be hitting any of the problems with dependencies spread over multiple directories.
That is not correct. NATIVE_DLL_SEARCH_DIRECTORIES is used for self-contained apps.
Should I then not be able to modify this first thing in startup (Main), like the following (where archDir is an absolute path to a directory containing the dlls):
AppDomain.CurrentDomain.SetData("NATIVE_DLL_SEARCH_DIRECTORIES", archDir);
and then, per Unmanaged (native) library probing, it should load from that before anything else? This does not appear to work (for self-contained); it will load from C:\Windows\System32.
1) Check if the supplied library name represents an absolute or relative path.
2) If the name represents an absolute path, use the name directly for all subsequent operations. Otherwise, use the name and create platform-defined combinations to consider. Combinations consist of platform-specific prefixes (for example, lib) and/or suffixes (for example, .dll, .dylib, and .so). This is not an exhaustive list, and it doesn't represent the exact effort made on each platform. It's just an example of what is considered.
3) The name and, if the path is relative, each combination, is then used in the following steps. The first successful load attempt immediately returns the handle to the loaded library.
   - Append it to each path supplied in the `NATIVE_DLL_SEARCH_DIRECTORIES` property and attempt to load.
   - If DefaultDllImportSearchPathsAttribute is either not defined on the calling assembly or p/invoke, or is defined and includes `DllImportSearchPath.AssemblyDirectory`, append the name or combination to the calling assembly's directory and attempt to load.
   - Use it directly to load the library.
Self-contained apps are always RID specific. Portable self-contained apps do not exist. RID specific apps (including self-contained RID specific apps) should have everything in the same directory. They should not be hitting any of the problems with dependencies spread over multiple directories.
I understand this, and perhaps I did not explain it very well. We need to support BOTH framework-dependent deployment (incl. local debugging in VS) via dotnet build AND self-contained deployment via dotnet publish.
And as I tried to write, for self-contained deployment it is a requirement (from us/customers etc.) that the native libraries are located/moved in a sub-directory (e.g. x64) relative to the exe itself. This is not a new requirement; we have been doing this for years. We have 3+ GB of native library dependencies. Additionally, we have been deploying in a way that allows us to put multiple exes in one location, incl. both 32-bit and 64-bit executables in the same directory, with native libraries in sub-directories.
In any case, we naturally must support developers being able to run the application from Visual Studio or whatever with F5 for debugging. With nuget packages following runtimes/<RID>/native this means dlls can now be spread out over many directories (for framework-dependent dotnet build). This means SetDllDirectory won't work for that scenario (due to transitive dependencies). It has been our go-to solution before, but it requires all dlls in one location and that nothing else calls it with a different path, which is pretty brittle.
Perhaps to make this more clear, I have tried to show the layout for the different scenarios below.
Framework-dependent (dotnet build):
APP.exe // AnyCPU, no Platform, no RuntimeIdentifier
runtimes/win-x64/native/*.dll // From new nuget packages
runtimes/win-x86/native/*.dll
x64/*.dll // From existing/"legacy" nuget packages using `.target`
x86/*.dll
Self-contained (dotnet publish -r RID --self-contained):
APP-win-x64.exe // RID specific
APP-win-x86.exe // RID specific
TOOL-win-x86.exe // RID specific
x64/*.dll // Consolidated dlls for publishing
x86/*.dll
This is of course a demonstrative example. Both of these "layouts" can be used both locally and on production machines, for many different reasons. The framework-dependent scenario cannot be supported with just SetDllDirectory since this fails for transitive dependencies.
Note this is all Windows currently, but given that I am also trying to ship some of our open source dependencies as nuget packages, I am trying to play nice with the community and publish these in a way that can be used by all, incl. trying to support the different kinds of deployments/usages so it just works.
Hence, I am trying to weigh and understand the options here. Note we do not have full control over all our dependencies (one being ONNX Runtime, for example) and hence not over DllImport definitions, nor over how some native libraries load other native libraries. I fully understand this is somewhat outside the purview of .NET as such, but the way the initial native library is loaded and with which options can help here.
The NATIVE_DLL_SEARCH_DIRECTORIES property is considered read-only. Updating it using AppDomain.SetData won't be respected.
https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/loading-unmanaged#pinvoke-load-library-algorithm is a more detailed description of the native library loading algorithm. You should be able to call NativeLibrary.SetDllImportResolver for the assemblies that you want to control loading the native dependencies for. It gives you full control over how the native dependency is loaded in the callback: you can call LoadLibraryEx with any flags, you can call AddDllDirectory/RemoveDllDirectory, you can return a handle that you have pre-loaded earlier, etc.
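A sketch of what that could look like for a single assembly (the type, library name, and directory below are placeholders):

using System;
using System.IO;
using System.Runtime.InteropServices;

// SomeTypeInThatAssembly is a placeholder for any type in the assembly
// whose DllImports should be resolved by this callback.
NativeLibrary.SetDllImportResolver(typeof(SomeTypeInThatAssembly).Assembly,
    (libraryName, assembly, searchPath) =>
    {
        if (libraryName == "NtvLibA")
        {
            var path = Path.Combine(AppContext.BaseDirectory,
                Environment.Is64BitProcess ? "x64" : "x86", "NtvLibA.dll");
            if (NativeLibrary.TryLoad(path, out var handle))
            {
                return handle;
            }
        }
        // IntPtr.Zero falls back to the default resolution behavior.
        return IntPtr.Zero;
    });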
NATIVE_DLL_SEARCH_DIRECTORIES property is considered read-only. Updating it using AppDomain.SetData won't be respected.
Well, that explains it of course. Shouldn't it throw on set then, or be documented? https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing mentions that .deps.json can be used then, but are there any examples of that?
call NativeLibrary.SetDllImportResolver for the assemblies that you want to control loading the native dependencies for. It will give you full control over how the native dependency is loaded in the callback: You can call LoadLibraryEx with any flags, you can call AddDllDirectory/RemoveDllDirectory, you can return handle that you have pre-loaded earlier, ... .
Yes, I have been and am looking into this. It just seems incredibly complicated compared to setting:
SetDllDirectory(Environment.Is64BitProcess ? @"x64" : @"x86");
which is basically what we did before, and which works in concert with the "normal search order". I am concerned about ending up with the extremes that TorchSharp had to go to to get a good out-of-the-box experience (something I applaud), see:
Having to define this per assembly effectively couples a lot of things together and is harder to reason about and configure for different kinds of use cases. And it doesn't solve transitive library dependencies. I don't want to load things that aren't necessarily needed, and so on.
In many ways, what appears to be missing to me is proper OS (Windows) support for adding directories early in the search order without completely replacing the search order, like SetDefaultDllDirectories appears to do. But that's a pipe dream, or a separate concern.
Just to recap my own thoughts here, there are the following options:

- PATH environment variable, changed first thing in Main - won't work for incorrect native libraries in C:\Windows\System32 like onnxruntime.dll, since PATH is last in the search order.
- SetDllDirectory - works wonderfully since it is early in the search order (e.g. before System32), but only if all local dll dependencies are in one directory, which isn't (currently) the case for dotnet build/VS/framework-dependent usage. Perhaps one could consolidate on runtimes/win-x64/native for dotnet build and x64 for dotnet publish (with x86 for 32-bit), but that requires ALL nuget packages with native libraries to follow the same RID, AND that SetDllDirectory is only called with the same directory by all code everywhere.
- SetDefaultDllDirectories (+ AddDllDirectory) - overtakes normal library loading and dll search order, which means transitive native dependencies present only in PATH are not loaded, e.g. for third-party SDKs (e.g. mixed-mode assemblies). In principle one could add each PATH directory with AddDllDirectory, but AddDllDirectory does not guarantee anything with regards to order; the PATH environment variable does as far as I know.
- AddDllDirectory - without SetDefaultDllDirectories this only works for "first order" dependencies where one can control the search path behavior.
- NATIVE_DLL_SEARCH_DIRECTORIES - read-only and hence not possible for the self-contained/published scenario with dlls in a sub-directory. Can be used to find probing directories when in framework-dependent mode, though, and add these with AddDllDirectory or similar.
- NativeLibrary.SetDllImportResolver - again only works for native library dependencies that .NET is in control of loading. Requires setting up for all possible assemblies with p/invokes or similar. Possibly forces loading assemblies (since one has to get the Assembly to call SetDllImportResolver) that might not be needed, or having to set this up at the exact usages. Also for unit test scenarios or similar (this goes for all of these, of course).
- LoadLibraryEx - manually load libraries, even ones that may or may not actually be used if that is hard to determine up front (note some native libraries have a "plugin" model). The TorchSharp approach. With 3+ GB of dlls this seems like a waste and incredibly coupled, brittle, hard to maintain, etc.

Or any combination of the above. This is harder than it should be.
Shouldn't it throw on set then or be documented?
It is documented in AppDomain.SetData: The cache automatically contains predefined system entries that are inserted when the application domain is created. You cannot insert or modify system entries with this method. A method call that attempts to modify a system entry has no effect; the method does not throw an exception.
cc: @tannergooding @richlander @jkotas
This is yet another issue regarding how to best author native library nuget packages and define, build, test, publish and deploy applications that consume these. I have tried hard to wrap my head around this by reading many issues and studying existing packages. I have a particular need that is similar to TorchSharp, with massive native libraries that not only need to be split into fragments, but where, if possible, it would also be best to only "download" the runtime identifier (RID) specific packages needed for local development. (But on Windows that local development often means BOTH x86 and x64 in our case.)

Below I wrote a walk-through I did of using ClangSharp (in excessive detail, for reference) and the many questions that it raised for me compared to how I am used to working with this (based on our own way of authoring native library packages that are explicitly copied to sub-directories (x64, x86) alongside the exe, with those directories then added at runtime, based on the process arch/OS/system, to the dll directories, i.e. via AddDllDirectory). Having something "custom" is a maintenance issue of course, but also an on-boarding issue. Using documented best practices would be best, but as far as I can tell there are none?

In any case, at the end of the walk-through I encounter the problem that, when specifying multiple RIDs (i.e. a RuntimeIdentifiers property with win-x64;win-x86), the runtime.json trick does not appear to work when running unit tests from inside Visual Studio. I have to explicitly add the RID-specific nuget packages anyway, so I then wonder how exactly one is supposed to author nuget packages to support running multiple RIDs (in this case solely interested in win-x86 and win-x64 for now) with full support for it as usual in VS and other tools? We need to be able to debug and run from VS. And how do you switch which RID you run with when F5-running in VS?
Should I simply accept that the runtime.json way is too flawed and explicitly reference all the needed nuget packages? Would this then avoid the need to specify RIDs? That approach also has issues with "forcing" self-contained (we don't want that); in fact we'd like to simply be able to deploy/copy-paste the build output as something like an APP.exe next to x64 and x86 native library sub-directories, where the app is not RID specific (framework-dependent of course). And this should work on both win-x86/win-x64. This is what we have now and what works; our developers are used to this. But it's based on native library nuget packages that explicitly copy their native library contents to those folders, and of course referencing all those RID-specific ones. I had hoped perhaps one could avoid the RID-specific referencing, but that does not seem to work "smoothly", which I'd guess then means the whole runtime.json approach is not the way to go.

Secondly, I think I read somewhere (can't find or remember where) that for .NET 8 it is considered to force a specific RID on build? I can see, given my experience below, why one might consider doing that, but it would then raise other issues, such as losing what used to be a core tenet (IMHO) of .NET, which is that build output (not publish) is RID agnostic. Would that be lost then?
All in all, to solve these issues I have had to author my own little tool for packaging the native libraries and consider all the issues around consumption, testing, etc. And after going through all this I am still left feeling rather lost. I still don't know exactly what the best solution is here. And the packages I am creating are intended to be published for the public, e.g. so I can publish the revived CNTK packages I've made on nuget.org, for example.
On top of this we still want to support publishing RID-specific applications, but then we don't want the native libraries embedded in the single file. There is an option for that, which is great, but then we want those dlls in a sub-folder, not directly next to the exe, which means we have to hack around that in MSBuild and then face issues with mixed-mode assemblies etc. Yes, we also have those, which makes things very interesting.

ML/AI isn't going away. For each new CUDA or whatever release, the native libraries double in size (minimum!). Easy authoring and consumption of those would be great, but I am sure that also won't be solved in the immediate future; I need to know what to do now.
The walk-through will come as the next comment.
Links

- runtime.json: https://natemcmaster.com/blog/2016/05/19/nuget3-rid-graph/
- dotnet-native template git repository: https://github.com/Mizux/dotnet-native