Closed ChengJin01 closed 1 month ago
@zl-wang, is there any way to reach out to the AIX team to understand the details of how dlopen works in such a case?
@ChengJin01 It has always been like this on AIX: when an archive has multiple members, dlopen needs to name the specific member you want to load. I am going to look up the dlopen man page and attach it here later.
Dynamically loads a module into the calling process.
The dlopen subroutine loads the module specified by FilePath into the executing process's address space. Dependents of the module are automatically loaded as well. If the module is already loaded, it is not loaded again, but a new, unique value will be returned by the dlopen subroutine.
The dlopen subroutine is a portable way of dynamically loading shared libraries. It performs C++ static initialization of the modules that it loads, like the loadAndInit subroutine does.
The value returned by the dlopen might be used in subsequent calls to dlsym and dlclose. If an error occurs during the operation, dlopen returns NULL.
If the main application was linked with the -brtl option, then the runtime linker is invoked by dlopen. If the module being loaded was linked with runtime linking enabled, both intra-module and inter-module references are overridden by any symbols available in the main application. If runtime linking was enabled, but the module was not built enabled, then all inter-module references will be overridden, but some intra-module references will not be overridden.
If the module being opened with dlopen or any of its dependents is being loaded for the first time, initialization routines for these newly-loaded routines are called (after runtime linking, if applicable) before dlopen returns. Initialization routines are the functions specified with the -binitfini: linker option when the module was built. (See the ld command for more information about this option.)
After calling the initialization functions for all newly-loaded modules, C++ static initialization is performed. If you call the dlopen subroutine from within an initialization function or a C++ static initialization function, modules loaded by the nested dlopen subroutine might be initialized before completely initializing the originally loaded modules.
If a dlopen subroutine is called from within a binitfini function, the initialization of the current module is abandoned for other modules.
The LIBPATH or LD_LIBRARY_PATH environment variables can be used to specify a list of directories in which the dlopen subroutine searches for the named module. The running application also contains a set of library search paths that were specified when the application was linked. The dlopen subroutine searches the modules based on the mechanism that the load subroutine defines, because the dlopen subroutine internally calls the load subroutine with the L_LIBPATH_EXEC flag.
Upon successful completion, dlopen returns a value that can be used in calls to the dlsym and dlclose subroutines. The value is not valid for use with the loadbind and unload subroutines.
If the dlopen call fails, NULL (a value of 0) is returned and the global variable errno is set. If errno contains the value ENOEXEC, further information is available via the dlerror function.
Buried deep in the above, in the Flags section:
RTLD_MEMBER The dlopen subroutine can be used to load a module that is a member of an archive. The L_LOADMEMBER flag is used when the load subroutine is called. The module name FilePath names the archive and archive member according to the rules outlined in the load subroutine.
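As a rough illustration of the RTLD_MEMBER description above, here is a minimal C sketch that builds an "archive(member)" path and passes it to dlopen, printing errno and dlerror on failure as the man page suggests. The helper names format_member_path and open_archive_member are hypothetical. Adding RTLD_NOW is an assumption based on POSIX expecting the mode to include either RTLD_LAZY or RTLD_NOW. The fallback definition of RTLD_MEMBER is also an assumption, there only so the sketch compiles on non-AIX systems, where the load will simply fail.

```c
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* RTLD_MEMBER is AIX-specific. This fallback is an assumption made only so
 * the sketch compiles elsewhere; it does not provide member loading there. */
#ifndef RTLD_MEMBER
#define RTLD_MEMBER 0
#endif

/* Write "archive(member)" into buf. Returns 0 on success, -1 if it does not fit. */
int format_member_path(char *buf, size_t size,
                       const char *archive, const char *member)
{
    int n = snprintf(buf, size, "%s(%s)", archive, member);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: open one member of an archive and report why it
 * failed, using errno and dlerror(). Returns the dlopen handle or NULL. */
void *open_archive_member(const char *archive, const char *member)
{
    char path[1024];
    void *handle;

    if (format_member_path(path, sizeof(path), archive, member) != 0) {
        fprintf(stderr, "member path too long\n");
        return NULL;
    }

    errno = 0;
    handle = dlopen(path, RTLD_NOW | RTLD_MEMBER);
    if (handle == NULL) {
        int saved_errno = errno;
        const char *detail = dlerror();
        fprintf(stderr, "dlopen(\"%s\") failed: errno=%d (%s); dlerror: %s\n",
                path, saved_errno, strerror(saved_errno),
                detail != NULL ? detail : "(none)");
    }
    return handle;
}
```

A call such as open_archive_member("/usr/lib/libc.a", "shr_64.o") would request the 64-bit libc member. Keep in mind that a 64-bit member can only be loaded by a process that was itself built as 64-bit.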
The problem with libclang.a is that these symbols (required by jextract) don't belong to any member of the archive:
[304] 0x11041e898 .data EXP DS SECdef [noIMid] clang_getRemappings
[305] 0x11041e8b0 .data EXP DS SECdef [noIMid] clang_getRemappingsFr
[306] 0x11041e8c8 .data EXP DS SECdef [noIMid] clang_remap_getNumFil
[307] 0x11041e8e0 .data EXP DS SECdef [noIMid] clang_remap_getFilena
[308] 0x11041e8f8 .data EXP DS SECdef [noIMid] clang_remap_dispose
[309] 0x11041e910 .data EXP DS SECdef [noIMid] clang_getBuildSession
[310] 0x11041e928 .data EXP DS SECdef [noIMid] clang_VirtualFileOver
[311] 0x11041e940 .data EXP DS SECdef [noIMid] clang_VirtualFileOverMapping
[312] 0x11041e958 .data EXP DS SECdef [noIMid] clang_VirtualFileOverSensitivity
[313] 0x11041e970 .data EXP DS SECdef [noIMid] clang_VirtualFileOverBuffer
[314] 0x11041e988 .data EXP DS SECdef [noIMid] clang_free
[315] 0x11041e9a0 .data EXP DS SECdef [noIMid] clang_VirtualFileOver
[316] 0x11041e9b8 .data EXP DS SECdef [noIMid] clang_ModuleMapDescri
[317] 0x11041e9d0 .data EXP DS SECdef [noIMid] clang_ModuleMapDescrimeworkModuleName
[318] 0x11041e9e8 .data EXP DS SECdef [noIMid] clang_ModuleMapDescrirellaHeader
[319] 0x11041ea00 .data EXP DS SECdef [noIMid] clang_ModuleMapDescrioBuffer
[320] 0x11041ea18 .data EXP DS SECdef [noIMid] clang_ModuleMapDescrie
[321] 0x110434a20 .data EXP DS SECdef [noIMid] clang_Cursor_isNull
[322] 0x110434a38 .data EXP DS SECdef [noIMid] clang_getNullRange
[323] 0x110434a50 .data EXP DS SECdef [noIMid] clang_getNullLocation
[324] 0x110434a68 .data EXP DS SECdef [noIMid] clang_getFileLocation
[325] 0x110434a80 .data EXP DS SECdef [noIMid] clang_getCursorUSR
[326] 0x110434a98 .data EXP DS SECdef [noIMid] clang_getCString
[327] 0x110434ab0 .data EXP DS SECdef [noIMid] clang_disposeString
[328] 0x110434ac8 .data EXP DS SECdef [noIMid] clang_getTypeDeclarat
[329] 0x110434af8 .data EXP DS SECdef [noIMid] clang_getRangeStart
[330] 0x110434b10 .data EXP DS SECdef [noIMid] clang_getRangeEnd
[331] 0x110434b28 .data EXP DS SECdef [noIMid] clang_getRange
[332] 0x110434b70 .data EXP DS SECdef [noIMid] clang_defaultDiagnosttions
[333] 0x110434b88 .data EXP DS SECdef [noIMid] clang_formatDiagnosti
[334] 0x1104c32f0 .data EXP DS SECdef [noIMid] clang_install_abortinl_error_handler
[335] 0x1104c38d8 .data EXP DS SECdef [noIMid] clang_createTranslati
[336] 0x1104e2f10 .data EXP DS SECdef [noIMid] clang_Cursor_getTrans
[337] 0x1104e2f28 .data EXP DS SECdef [noIMid] clang_Range_isNull
[338] 0x1104e3048 .data EXP DS SECdef [noIMid] clang_disposeTranslat
[339] 0x1104e3078 .data EXP DS SECdef [noIMid] clang_isInvalid
[340] 0x1104e3090 .data EXP DS SECdef [noIMid] clang_isDeclaration
[341] 0x1104e30a8 .data EXP DS SECdef [noIMid] clang_isReference
[342] 0x1104e30c0 .data EXP DS SECdef [noIMid] clang_isStatement
[343] 0x1104e30d8 .data EXP DS SECdef [noIMid] clang_isExpression
[344] 0x1104e30f0 .data EXP DS SECdef [noIMid] clang_isTranslationUn
[345] 0x1104e3108 .data EXP DS SECdef [noIMid] clang_isAttribute
[346] 0x1104e3120 .data EXP DS SECdef [noIMid] clang_createIndex
......
Does dlopen handle them correctly?
That means you might need another shared library to satisfy the request. I am wondering how the executable was linked in the first place: why could it be linked successfully if there were missing symbols?
It has always been like this on AIX: when an archive has multiple members, dlopen needs to name the specific member you want to load. I am going to look up the dlopen man page and attach it here later.
I tried the following code but it ended up with a null handle.
#include <stdio.h>
#include <dlfcn.h>

int main(int argc, char **argv) {
    void *handle;

    /* Also tried: RTLD_MEMBER | RTLD_LAZY */
    handle = dlopen("/usr/lib/libc.a(shr_64.o)", RTLD_MEMBER);
    printf("handle = %p\n", handle);
    if (handle != NULL) {
        /* Only close a handle that was actually opened. */
        dlclose(handle);
    }
    return 0;
}
That means you might need another shared library to satisfy the request. I am wondering how the executable was linked in the first place: why could it be linked successfully if there were missing symbols?
These libraries (including libclang.a) are unpacked directly from https://github.com/llvm/llvm-project/releases/download/llvmorg-18.1.8/clang+llvm-18.1.8-powerpc64-ibm-aix-7.2.tar.xz (as required by jextract), which ships them all together.
Not specific to jdk23, not a new problem, not a blocker for jdk23, move it forward.
@JasonFengJ9 For 0.48, this issue will need to be resolved by the end of this week. What's the current state of this issue? Based on this issue's impact, do we need it to be fixed in 0.48 or can it be pushed to 0.49?
I have successfully built and run jextract on AIX (and Linux) for a customer (Finanz Informatik). On AIX, there is a clang bug though (allocating 2 TB of memory). For the official build, you might need to change the Gradle build script to copy/extract libclang.so from libclang.a.
The bug still exists in the latest version of clang. The OpenXL team is investigating; it is tracked here: https://github.ibm.com/compiler/wyvern/issues/20642
This is not a new problem. As per https://github.com/eclipse-openj9/openj9/issues/19930#issuecomment-2353882872, the customer has a running jextract.
Moving to 0.49.
I have successfully built and run jextract on AIX (and Linux) for a customer (Finanz Informatik).
@zl-wang is there any OpenJ9 change involved? Do we still need this issue for further investigation?
No, I don't need to change anything in OpenJ9. If this issue was opened for the purpose of building jextract on AIX, I think it is better placed in the jextract repository (in order to change the Gradle script). Otherwise, it can be closed.
Thanks @zl-wang. This issue was opened for OpenJ9 support of jextract on AIX; since it can be addressed with build script changes, closing it here.
Issue Number: 19930 Status: Closed Actual Components: comp:vm, project:panama, os:aix Actual Assignees: No one :( PR Assignees: No one :(
If this issue was opened for the purpose of building jextract on AIX, I think it is better placed in the jextract repository (in order to change the Gradle script).
Chatted with @zl-wang: the shared library (.so) was extracted from the archive file and manually renamed to libclang.so.
Will propose a script change at https://github.com/openjdk/jextract.
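For anyone who wants to sanity-check an extracted libclang.so before touching the build script, a minimal C sketch along these lines might help. It opens the library with a plain dlopen (no RTLD_MEMBER) and checks whether a symbol can be resolved with dlsym. The helper name has_symbol and the library path are hypothetical; clang_createIndex is one of the symbols listed earlier in this thread.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if symbol_name resolves in lib_path,
 * 0 if the library opens but the symbol is missing,
 * and -1 if the library itself cannot be opened. */
int has_symbol(const char *lib_path, const char *symbol_name)
{
    void *handle = dlopen(lib_path, RTLD_NOW);
    int found;

    if (handle == NULL) {
        const char *detail = dlerror();
        fprintf(stderr, "dlopen(\"%s\") failed: %s\n",
                lib_path, detail != NULL ? detail : "(no detail)");
        return -1;
    }

    /* The handle returned by dlopen is what dlsym and dlclose expect. */
    found = (dlsym(handle, symbol_name) != NULL);
    dlclose(handle);
    return found;
}
```

For example, has_symbol("./libclang.so", "clang_createIndex") should return 1 if the extracted library is usable, assuming it sits in the current directory under that name.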
The issue was detected when generating the jextract tool for FFI after the issue with native library loading (.a) was resolved (which is entirely different from the original problem being addressed in https://github.com/eclipse-openj9/openj9/issues/19344).
As explained in https://github.com/openjdk/jextract, generating the jextract tool requires the LLVM libraries to be in place, some of which (e.g. libclang.a) must be loaded by the JDK to use the native functions when building the tool. The loading failure occurs when calling dlopen (see https://github.com/eclipse-openj9/openj9/issues/19344#issuecomment-2253523393 for details), most likely in the code around https://github.com/eclipse-openj9/openj9-omr/blob/9083c8237ac215927ac55b5db256780132983136/port/aix/omrsl.c#L216.
Technically, the existing code dealing with dlopen currently only works to load a simple shared object (extremely simple format) suffixed with .a, but fails to support a complex archive file that combines many shared objects (like libc.a, or libclang.a in LLVM). To address the problem, we first need to figure out how dlopen works to load these archives, especially in the case of libc.a / libclang.a.
FYI: @TobiAjila, @pshipton, @JasonFengJ9, @zl-wang, @keithc-ca