Open mratsim opened 1 year ago
Without passing the --noMain flag we have the following result:
```c
N_LIB_PRIVATE void PreMainInner(void) {
  atmdotdotatsconstantineatsplatformsatsisaatscpuinfo_x86dotnim_Init000();
}

N_LIB_PRIVATE int cmdCount;
N_LIB_PRIVATE char** cmdLine;
N_LIB_PRIVATE char** gEnv;

N_LIB_PRIVATE void PreMain(void) {
  atmdotdotatsdotdotatsdotdotatsdotdotatsdotchoosenimatstoolchainsatsnimminus1dot6dot12atslibatssystemdotnim_Init000();
  PreMainInner();
}

N_LIB_PRIVATE N_CDECL(void, NimMainInner)(void) {
  NimMainModule();
}

N_LIB_EXPORT N_CDECL(void, ctt_init_NimMain)(void) {
  void (*volatile inner)(void);
  PreMain();
  inner = NimMainInner;
  (*inner)();
}

N_LIB_PRIVATE void NIM_POSIX_INIT NimMainInit(void) {
  ctt_init_NimMain();
}

N_LIB_PRIVATE N_NIMCALL(void, NimMainModule)(void) {
  {
  }
}
```
This is almost the wanted result. Tested and confirmed that N_LIB_PRIVATE void NIM_POSIX_INIT NimMainInit does the right thing :tm:.
The only issue is that NimMain is tagged N_LIB_EXPORT, but I don't think it should be?
```c
N_LIB_PRIVATE void PreMainInner(void) {
  atmdotdotatsconstantineatsplatformsatsisaatscpuinfo_x86dotnim_Init000();
}

N_LIB_PRIVATE int cmdCount;
N_LIB_PRIVATE char** cmdLine;
N_LIB_PRIVATE char** gEnv;

N_LIB_PRIVATE void PreMain(void) {
  atmdotdotatsdotdotatsdotdotatsdotdotatsdotchoosenimatstoolchainsatsnimminus1dot6dot12atslibatssystemdotnim_Init000();
  PreMainInner();
}

N_LIB_PRIVATE N_CDECL(void, NimMainInner)(void) {
  NimMainModule();
}

N_CDECL(void, ctt_init_NimMain)(void) {
  PreMain();
  NimMainInner();
}

int main(int argc, char** args, char** env) {
  cmdLine = args;
  cmdCount = argc;
  gEnv = env;
  ctt_init_NimMain();
  return nim_program_result;
}

N_LIB_PRIVATE N_NIMCALL(void, NimMainModule)(void) {
  {
  }
}
```
That's not what we want.
Somewhat related, a name like atmdotdotatsdotdotatsdotdotatsdotdotatsdotchoosenimatstoolchainsatsnimminus1dot6dot12atslibatssystemdotnim_Init000 is a bug.
For my use-case, I have created a loadTime macro pragma that allows a proc to be called at program or library load time. It works whether the code is compiled to an application, a dynamic library or a static library.
Note: MSVC/VCC support is still to be confirmed, and I am unsure about TCC.
```nim
import std/macros

const GCC_Compatible* = defined(gcc) or defined(clang) or
                        defined(llvm_gcc) or defined(icc)

macro loadTime*(procAst: untyped): untyped =
  ## This allows a function to be called at program or library load time
  ## Note: such a function cannot be dead-code eliminated.

  procAst.addPragma(ident"used")    # Remove unused warning
  procAst.addPragma(ident"exportc") # Prevent the proc from being dead-code eliminated

  if GCC_Compatible:
    # {.pragma: gcc_constructor, codegenDecl: "__attribute__((constructor)) $# $#$#".}
    let gcc_constructor =
      nnkExprColonExpr.newTree(
        ident"codegenDecl",
        newLit"__attribute__((constructor)) $# $#$#"
      )
    procAst.addPragma(gcc_constructor) # Implement load-time functionality

    result = procAst

  elif defined(vcc):
    warning "CPU feature autodetection at Constantine load time has not been tested with MSVC"

    # Untested branch. The proc name is passed in explicitly: the original
    # `astToStr(def)` referred to an undefined identifier `def`.
    template msvcInitSection(procName, procDef: untyped): untyped =
      procDef
      {.emit:["""
      #pragma section(".CRT$XCU",read)
      __declspec(allocate(".CRT$XCU")) static int (*p)(void) = """, procName, ";"].}

    result = getAst(msvcInitSection(procAst.name, procAst))

  else:
    error "Compiler not supported."
```
Somewhat related, a name like atmdotdotatsdotdotatsdotdotatsdotdotatsdotchoosenimatstoolchainsatsnimminus1dot6dot12atslibatssystemdotnim_Init000 is a bug.
Seems like 2 things create these kinds of proc names:
var foo {.global.} = someProc()
Followup on the discussion at: https://discord.com/channels/371759389889003530/768367394547957761/1130053134727782410
Currently, using a Nim library usually requires calling NimMain to initialize global variables and the Nim runtime.
This is extra friction, especially when we want to replicate C libraries that don't require this.
Motivating example
For example, many scientific libraries can autodetect support for CPU features either through the compiler by re-using the same function name but with different target features:
There are several ways to implement this. From Agner Fog's https://www.agner.org/optimize/optimizing_cpp.pdf, section 13.5, there are at least:
And GCC function multiversioning: https://gcc.gnu.org/wiki/FunctionMultiVersioning
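As an illustration of runtime CPU dispatching, here is a minimal C sketch of the "function pointer resolved on first call" pattern. All names (`sum_ints`, `sum_generic`, `sum_avx2`, `sum_dispatch`) are hypothetical and are not taken from any library discussed here. The AVX2 variant reuses the generic loop purely for illustration, and the sketch assumes GCC or Clang on x86 for `__builtin_cpu_supports`; on other targets it falls back to the generic version.

```c
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  #define HAVE_X86_DISPATCH 1
  /* Ask the compiler to build this one function with AVX2 enabled. */
  #define TARGET_AVX2 __attribute__((target("avx2")))
#else
  #define HAVE_X86_DISPATCH 0
  #define TARGET_AVX2
#endif

/* Baseline implementation, valid on every CPU. */
static long sum_generic(const int *xs, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) acc += xs[i];
    return acc;
}

/* Hypothetical AVX2 variant: same loop, compiled with AVX2 allowed. */
TARGET_AVX2
static long sum_avx2(const int *xs, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) acc += xs[i];
    return acc;
}

static long sum_dispatch(const int *xs, size_t n);

/* Starts at the dispatcher; rewritten to the chosen variant on first call. */
static long (*sum_impl)(const int *, size_t) = sum_dispatch;

static long sum_dispatch(const int *xs, size_t n) {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    sum_impl = __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
#else
    sum_impl = sum_generic;
#endif
    return sum_impl(xs, n);
}

/* The only symbol a library user needs to see. */
long sum_ints(const int *xs, size_t n) {
    return sum_impl(xs, n);
}
```

The CPU query happens once, on the first call, and later calls go straight through the pointer. The pointer update is not synchronized, so a real library would want an atomic store or a load-time initializer instead.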
Current situation
We assume that we only want to ask for CPU capabilities once and not at each function call. Hence we need to:
But as a library provider, this backend part is ideally hidden, and only the functions of interest to the user are exposed, like compute_matrix_multiplication or verify_cryptographic_signature.
Due to Nim globals being initialized in NimMain, this is currently not supported. Furthermore, function multi-versioning will not work IIRC, even with codegenDecl for target attributes, as Nim will not compile functions with colliding C names.
A workaround is to use an __attribute__((constructor)) function, possibly __attribute__((constructor,used)) (in case of zealous dead-code elimination by LTO), for each global a library needs to initialize. However, this is limited to globals that don't require the Nim runtime (so seqs, strings and refs are excluded).
Low-level - Unix
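The `__attribute__((constructor))` workaround mentioned above can be sketched in plain C as follows. This is a minimal illustration assuming GCC or Clang; the names `lib_init`, `lib_is_ready`, `lib_cpu_features` and `LIB_FEATURE_BASELINE` are hypothetical, and the "detection" is a placeholder rather than a real CPUID query.

```c
#include <stdint.h>

/* Hypothetical feature bit, used only for this illustration. */
#define LIB_FEATURE_BASELINE 0x1u

/* Library-private state that must be set before any public function runs. */
static uint32_t lib_cpu_features = 0;
static int lib_initialized = 0;

/* `constructor` makes the loader run this before main() (or at dlopen time
   for a shared library); `used` keeps it even if nothing references it. */
__attribute__((constructor, used))
static void lib_init(void) {
    lib_cpu_features = LIB_FEATURE_BASELINE; /* placeholder for real detection */
    lib_initialized = 1;
}

/* Public entry point: no explicit init call is required from the user. */
int lib_is_ready(void) {
    return lib_initialized && (lib_cpu_features & LIB_FEATURE_BASELINE) != 0;
}
```

Only plain C data is touched in the constructor, which matches the limitation noted above: nothing here depends on a runtime having been set up first.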
Looking at my library: https://github.com/mratsim/constantine/blob/67fbd8c/constantine/ethereum_bls_signatures.nim, compiled with --mm:arc and -d:UseMalloc --panics:on -d:noSignalHandler to ensure no runtime (allocator, exceptions, which all need an allocator, signals, ...), the NimMain-related functions are:
As mentioned in https://discord.com/channels/371759389889003530/768367394547957761/1130212409496322098, one of the motivations for the explicit call was for the old GCs to determine the stack size, I assume for stack scanning of pointers. And there are apparently other initialization routines (which?).
It's also interesting to note that nimbase.h defines
And it's supposed to be used in cgen for PosixCDllMain/NimMainInit:
but NimMainInit doesn't appear anywhere in my generated C code.
Low-level - Windows
MSVC provides a similar mechanism: https://github.com/supranational/blst/blob/f8af94a/src/cpuid.c#L47
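To show how the two mechanisms can sit behind one macro, here is a hedged C sketch. The macro `LOAD_TIME` and the functions `detect_features` and `features_detected` are hypothetical names made up for this example. Only the GCC/Clang branch is a pattern I am confident about. The MSVC branch follows the `.CRT$XCU` section approach as an assumption and is untested; in particular, the linker may drop the pointer unless it is explicitly kept.

```c
#if defined(_MSC_VER)
  /* MSVC (untested assumption): the C runtime walks the pointers placed in
     the .CRT$XCU section and calls each one before main(). */
  #pragma section(".CRT$XCU", read)
  #define LOAD_TIME(fn)                                             \
      static void fn(void);                                         \
      __declspec(allocate(".CRT$XCU")) void (*fn##_ptr)(void) = fn; \
      static void fn(void)
#else
  /* GCC/Clang: mark the function itself as a load-time constructor. */
  #define LOAD_TIME(fn)                                  \
      static void fn(void) __attribute__((constructor)); \
      static void fn(void)
#endif

/* Library-private flag set by the load-time function. */
static int detected = 0;

LOAD_TIME(detect_features) {
    detected = 1; /* placeholder for real CPU feature detection */
}

/* Public query, usable without any explicit initialization call. */
int features_detected(void) {
    return detected;
}
```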
Questions
NimMainInit
built into a library, as this would solve 1?