dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

NativeLibrary.GetExport/TryGetExport on Linux don't offer the same semantics as dlsym() #94079

Open exoosh opened 11 months ago

exoosh commented 11 months ago

Description

NativeLibrary.GetExport throws a System.ArgumentNullException when IntPtr.Zero is passed as the handle.

This makes it impossible to get RTLD_DEFAULT behavior.

Reproduction Steps

Compile and run this:

using System.Runtime.InteropServices;

NativeLibrary.GetExport(IntPtr.Zero, "SomeFunction");

Expected behavior

Return function pointer or IntPtr.Zero.

Actual behavior

Throws exception System.ArgumentNullException

$ dotnet run /home/username/getexport-issue/bin/Debug/net6.0/getexport.dll
Unhandled exception. System.ArgumentNullException: Value cannot be null. (Parameter 'handle')
   at System.Runtime.InteropServices.NativeLibrary.GetExport(IntPtr handle, String name)
   at Program.<Main>$(String[] args) in /home/username/getexport-issue/Program.cs:line 3

Regression?

No idea

Known Workarounds

Call a DllImport-ed dlsym() directly with IntPtr.Zero for the handle.
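A minimal sketch of that workaround follows. The class and method names (GlobalSymbols, TryGetGlobalExport) are hypothetical, and the library name is an assumption: on glibc 2.34 and later (including the 2.35 on Ubuntu 22.04) dlsym is exported by libc.so.6, while older glibc releases export it from libdl.so.2, so the name may need adjusting for other systems.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the BCL.
public static class GlobalSymbols
{
    // RTLD_DEFAULT is defined as ((void *) 0) in glibc's dlfcn.h.
    private static readonly IntPtr RTLD_DEFAULT = IntPtr.Zero;

    // Assumption: dlsym is exported by libc.so.6 (glibc 2.34+).
    // On older glibc, use "libdl.so.2" instead.
    [DllImport("libc.so.6", EntryPoint = "dlsym")]
    private static extern IntPtr dlsym(IntPtr handle, string symbol);

    // Looks the symbol up in the global scope, bypassing NativeLibrary
    // and its null-handle check.
    public static bool TryGetGlobalExport(string name, out IntPtr address)
    {
        address = dlsym(RTLD_DEFAULT, name);
        return address != IntPtr.Zero;
    }
}
```

Usage would look like `GlobalSymbols.TryGetGlobalExport("SomeFunction", out IntPtr fn)`. Note that this sketch treats a NULL result as "not found"; a careful caller would check dlerror() to distinguish that from a symbol whose value really is NULL.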

As was pointed out in a comment here, another workaround exists by using:

var RTLD_DEFAULT = NativeLibrary.GetMainProgramHandle();

in place of IntPtr.Zero. Probably close enough.
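As a rough sketch, this second workaround can be wrapped as shown below. MainProgramLookup and TryFind are hypothetical names; NativeLibrary.GetMainProgramHandle() requires .NET 7 or later. On Linux the returned handle presumably behaves like the one from dlopen(NULL), which searches the main program, the objects loaded at startup, and libraries loaded with RTLD_GLOBAL. That is similar to RTLD_DEFAULT but not guaranteed to be identical.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the BCL.
public static class MainProgramLookup
{
    // Uses the main program handle in place of IntPtr.Zero, which
    // NativeLibrary rejects with ArgumentNullException.
    public static bool TryFind(string name, out IntPtr address)
    {
        IntPtr handle = NativeLibrary.GetMainProgramHandle();
        return NativeLibrary.TryGetExport(handle, name, out address);
    }
}
```

Unlike the direct dlsym() P/Invoke, this stays within the NativeLibrary API and therefore also compiles and runs on other platforms, although what the lookup finds there will differ.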

Configuration

$ dotnet --version
7.0.113
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
$ uname -a
Linux ubuntu 6.2.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Oct  6 10:23:26 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

I think it's not specific to this configuration, after looking at the runtime sources.

The .NET SDK used is the one packaged by Ubuntu 22.04 itself, without the .NET package repos.

Other information

dlsym() knows of two pseudo-handles which can be passed in as the first argument ("a "handle" of a dynamic loaded shared object"). These pseudo-handles are:

/* If the first argument of `dlsym' or `dlvsym' is set to RTLD_NEXT
   the run-time address of the symbol called NAME in the next shared
   object is returned.  The "next" relation is defined by the order
   the shared objects were loaded.  */
# define RTLD_NEXT      ((void *) -1l)

/* If the first argument to `dlsym' or `dlvsym' is set to RTLD_DEFAULT
   the run-time address of the symbol called NAME in the global scope
   is returned.  */
# define RTLD_DEFAULT   ((void *) 0)

(excerpt from dlfcn.h from GLIBC 2.35)

While passing (IntPtr)(-1) works fine and yields RTLD_NEXT behavior, passing IntPtr.Zero is simply treated as an error condition.

This makes sense for Win32 GetProcAddress but doesn't make sense for dlsym on Linux where various scopes for symbol lookup exist.
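The asymmetry between the two pseudo-handles can be probed with a small sketch like the following. PseudoHandleProbe and Describe are hypothetical names; the only behavior assumed here is the one reported above, namely that NativeLibrary rejects a zero handle with ArgumentNullException before dlsym() is ever reached, while any other value is passed through.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the BCL.
public static class PseudoHandleProbe
{
    // Reports how NativeLibrary.TryGetExport treats the given handle value.
    public static string Describe(IntPtr handle, string name)
    {
        try
        {
            return NativeLibrary.TryGetExport(handle, name, out IntPtr address)
                ? $"found at 0x{address.ToInt64():x}"
                : "not found";
        }
        catch (ArgumentNullException)
        {
            // Thrown by the managed argument check, not by dlsym().
            return "rejected: ArgumentNullException";
        }
    }
}
```

Calling `PseudoHandleProbe.Describe(IntPtr.Zero, "malloc")` reports the rejection, whereas `PseudoHandleProbe.Describe(new IntPtr(-1), "malloc")` forwards the RTLD_NEXT value to the native lookup; what that lookup returns depends on the platform and on library load order.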

ghost commented 11 months ago

Tagging subscribers to this area: @dotnet/interop-contrib See info in area-owners.md if you want to be subscribed.

Issue Details
Author: exoosh
Assignees: -
Labels: `area-System.Runtime.InteropServices`
Milestone: -

huoyaoyuan commented 11 months ago

https://github.com/dotnet/runtime/issues/71881#issuecomment-1179570792

exoosh commented 11 months ago

#71881 (comment)

Thanks for pointing that out! I agree that this is a viable workaround.

The linked issue has a few remarks though -- and it's closed so commenting there is impossible. One such remark is:

Given that NativeLibrary exists to abstract away platform differences, I don't think introducing a platform difference would be a good idea. [...] (on Windows it could enumerate loaded libraries and call TryGetExport on each.)

Agreed that it's there to gloss over those differences. Alas, seeing everything through Windows-tinted glasses in this case is the issue. Windows does in fact have the notion of "the module that created this process". What it does not have is the notion of a process-global symbol table. And that's where RTLD_DEFAULT and RTLD_NEXT come in. That said: the suggestion in parentheses makes little sense in the Windows context. That's no longer abstracting away platform differences, it's creating new semantics -- and arguably wouldn't be Interop anymore.

Another commenter states:

[...] I'm not convinced it is common enough [...]

And I am not sure how that's backed by data, if at all. Certainly those are valid and common use cases in the Linux world. Certain libraries outright rely on the fact that symbols are made globally available before they get loaded (see MKL in #93911).

The implementation, names like LoadLibraryCallbackStub, and issues like #11901 really don't say "abstract away platform differences"; they say "use this particular semantic on the target platform because it most closely resembles Windows behavior".

And don't get me wrong, I do appreciate many aspects of the NT platform and find myself defending its advantages against more fervent Linux aficionados on a regular basis. But abstracting platform differences in favor of one platform and at the expense of every other platform isn't exactly the spirit of cross-platform development. That's the spirit of "I'll port this fork-based Unix server to Windows" and then complaining that it doesn't scale (because process creation is more expensive, because Windows uses the proactor pattern whereas Linux traditionally used the reactor pattern, etc.) -- just in reverse.

PS: I'll leave closing this ticket to someone else, because I don't want to deprive anyone of the ability to comment.

AaronRobinsonMSFT commented 11 months ago

Will try this again for .NET 9.

Note that https://github.com/dotnet/runtime/issues/71881 was left open to wait for feedback. Very little was received, so it was closed. If we see a strong push then we can consider new APIs to reduce friction further, but for now there doesn't seem to be enough to warrant a new API.

exoosh commented 11 months ago

Will try this again for .NET 9.

Note that #71881 was left open to wait for feedback. Very little was received, so it was closed. If we see a strong push then we can consider new APIs to reduce friction further, but for now there doesn't seem to be enough to warrant a new API.

@AaronRobinsonMSFT I totally agree that from the view of a C# developer this is probably a fringe problem. And workarounds exist and are now somewhat documented publicly as well.

jkotas commented 11 months ago

The NativeLibrary APIs have been designed as the least common denominator between platforms. They do not provide access to the full set of Windows-specific options nor the full set of Unix-specific options for working with native libraries (which also differ between Unix variants). The design expects callers to manually P/Invoke the OS-specific API to get access to OS-specific functionality. It is a non-goal for .NET BCL APIs to expose all OS-specific features.

Having said that, we have the ability to expose OS-specific features as .NET BCL APIs.