jnr / jffi

Java Foreign Function Interface
Apache License 2.0

Use JIT-compiled stubs instead of LibFFI where possible #37

Open DemiMarie opened 8 years ago

DemiMarie commented 8 years ago

In some cases, it is possible to avoid libffi entirely by generating a machine-code trampoline that just shuffles the parameters and tail-calls the native function. This should be faster than going through libffi on every call.

This won't work for the case of critical natives, though, unless we are able to make the JVM's dynamic lookup (via dlopen and dlsym) find the JIT-compiled code. This is because critical natives cannot be registered using RegisterNatives.

Spasi commented 7 years ago

> This won't work for the case of critical natives, though, unless we are able to make the JVM's dynamic lookup (via dlopen and dlsym) find the JIT-compiled code. This is because critical natives cannot be registered using RegisterNatives.

I was trying this a few days ago and initial testing looks promising. Sample implementation here. Used the following libraries:

Works on all three, but care must be taken when calling back to Java; anything that touches native code will deadlock the JVM.

headius commented 7 years ago

jnr-ffi already does use jnr-x86asm to generate native stubs for some cases. We could expand that, certainly. What else?

DemiMarie commented 7 years ago

I was thinking about using a simple stub code generator (much simpler than jnr-x86asm) to generate machine code stubs for other platforms. This should work, because the stubs have a very simple form:

mov    %rdx,%rdi
mov    %rcx,%rsi
mov    %r8,%rdx
mov    %r9,%rcx
mov    0x8(%rsp),%r8
mov    0x10(%rsp),%r9
mov    0x18(%rsp),%rax
mov    %rax,0x8(%rsp)
mov    0x20(%rsp),%rax
mov    %rax,0x10(%rsp)
mov    0x28(%rsp),%rax
mov    %rax,0x18(%rsp)
movabs $0x1,%rax
jmpq   0x402e80

In other words, the stub is a series of register-to-register mov instructions, followed by a series of mov instructions that shift arguments relative to the stack pointer, followed by a jmp to the native function. This can be generated with fairly simple code, especially if one knows the absolute address at which the code will be placed in memory.
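To make the "fairly simple code" claim concrete, here is a minimal sketch of a byte emitter for this kind of stub on x86-64. The class `StubEmitter` and all of its methods are hypothetical and not part of jffi or jnr-x86asm. It only produces the bytes; mapping them into executable memory is out of scope. It also assumes stack displacements fit in a signed 8-bit field, and it ends with an absolute indirect jump through %r11 rather than the relative jmpq shown in the listing, so the bytes do not depend on where the stub is placed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not a jffi API): emits x86-64 bytes for a parameter-shuffling stub. */
final class StubEmitter {
    // Hardware register numbers as used in x86-64 instruction encodings.
    static final int RAX = 0, RCX = 1, RDX = 2, RSI = 6, RDI = 7, R8 = 8, R9 = 9;

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** REX prefix with W=1; R extends the ModRM reg field, B extends the rm field. */
    private static int rex(int reg, int rm) {
        return 0x48 | ((reg >> 3) << 2) | (rm >> 3);
    }

    /** Emits "mov %src,%dst" (AT&T operand order). */
    StubEmitter movRegReg(int src, int dst) {
        out.write(rex(src, dst));
        out.write(0x89);                                  // MOV r/m64, r64
        out.write(0xC0 | ((src & 7) << 3) | (dst & 7));   // mod=11: register direct
        return this;
    }

    /** Emits "mov disp(%rsp),%dst". */
    StubEmitter loadFromStack(int disp, int dst) {
        return stackMov(0x8B, disp, dst);                 // MOV r64, r/m64
    }

    /** Emits "mov %src,disp(%rsp)". */
    StubEmitter storeToStack(int src, int disp) {
        return stackMov(0x89, disp, src);                 // MOV r/m64, r64
    }

    private StubEmitter stackMov(int opcode, int disp, int reg) {
        if (disp < 0 || disp > 127) {
            throw new IllegalArgumentException("sketch only handles 8-bit displacements");
        }
        out.write(rex(reg, 0));
        out.write(opcode);
        out.write(0x44 | ((reg & 7) << 3));               // mod=01 (disp8), rm=100 (SIB follows)
        out.write(0x24);                                  // SIB: base=%rsp, no index
        out.write(disp);
        return this;
    }

    /** Emits "movabs $target,%r11; jmp *%r11" as the final tail jump. */
    StubEmitter jumpAbsolute(long target) {
        out.write(0x49);
        out.write(0xBB);                                  // movabs imm64 into %r11
        byte[] imm = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(target).array();
        out.write(imm, 0, imm.length);
        out.write(0x41);
        out.write(0xFF);
        out.write(0xE3);                                  // jmp *%r11
        return this;
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Mirrors the start of the listing above; the target address is a placeholder.
        byte[] code = new StubEmitter()
                .movRegReg(RDX, RDI)
                .movRegReg(RCX, RSI)
                .movRegReg(R8, RDX)
                .movRegReg(R9, RCX)
                .loadFromStack(0x08, R8)
                .loadFromStack(0x10, R9)
                .loadFromStack(0x18, RAX)
                .storeToStack(RAX, 0x08)
                .jumpAbsolute(0x402e80L)
                .toByteArray();
        StringBuilder hex = new StringBuilder();
        for (byte b : code) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Each instruction form takes only a few lines because the stub uses just three shapes of mov plus one jump. A generator for another architecture would need its own encodings but could keep the same structure.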

ghost commented 7 years ago

@DemiMarie Would be good if it can be implemented since jnr-x86asm hasn't been updated in years.

headius commented 7 years ago

@DemiMarie Well it sounds like you have some idea how this might be done. Care to assist?

Techcable commented 7 years ago

I'd look into this, which uses DynASM to compile Lua call stubs at runtime. Alternatively, you could try a higher-level library like GNU lightning to handle the code generation for you (although I'm not sure about its support for the Windows calling convention).