This PR is not fully ready yet, but I am opening it anyway to make room for discussion, whether about the concept, the design, or anything else!
Motivation
jnigen makes it easy to write JNI code in order to integrate libraries. This PR tries to take this a step further by automatically generating the Java jnigen code for a header file, while also adding support for pointer types, structs, unions, and callbacks.
The focus for development was put on:
Correctness/Portability
Thinness
Avoiding reflection
Inlinability, static linking and avoidance of symbol exposure
Interface design
Methods
Methods are bound by simply generating jnigen wrapper code. The C function int testFunc(int test); would be generated as:
public static int testFunc(int test) {
    return testFunc_internal(test);
}
Callbacks
Callbacks are implemented as functional interfaces. They look something like:
public interface methodWithCallbackFloatArg extends Closure {
    void methodWithCallbackFloatArg_call(float arg0);
}
To pass them to the native side, a ClosureObject needs to be constructed around them. Calling ClosureObject.fromClosure(your_callback); will create one.
A method expecting a closure looks like this:
public static void call_methodWithCallback(ClosureObject<methodWithCallback> fnPtr)
IMPORTANT: Closures have a manual memory life cycle and need to be freed manually. More on that later.
Enum
C enums are implemented as Java enums that carry an ID (id != ordinal). Note that C enums can have duplicate IDs, while Java enums can't. To solve this, enum constants with the same ID are merged into one constant, with their names joined by an "_".
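As a rough illustration of this mapping, assume a C enum `enum Color { RED = 1, GREEN = 4, LIME = 4 };`. The enum itself, the merged constant name, and the `getID`/`getByID` helpers are hypothetical; the code the generator actually emits may look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical binding for: enum Color { RED = 1, GREEN = 4, LIME = 4 };
enum Color {
    RED(1),
    // GREEN and LIME share the ID 4 in C, so they become one Java constant.
    GREEN_LIME(4);

    private static final Map<Integer, Color> BY_ID = new HashMap<>();

    static {
        for (Color c : values()) {
            BY_ID.put(c.id, c);
        }
    }

    private final int id;

    Color(int id) {
        this.id = id;
    }

    // The C value; deliberately independent of ordinal().
    int getID() {
        return id;
    }

    // Maps a value coming from native code back to the Java constant.
    static Color getByID(int id) {
        Color c = BY_ID.get(id);
        if (c == null) {
            throw new IllegalArgumentException("Unknown Color id: " + id);
        }
        return c;
    }
}
```

Here `Color.RED.ordinal()` is 0 while `Color.RED.getID()` is 1, which is why the ID has to be stored separately.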
Struct/Union (The term "StackElement" will refer to both of them)
StackElements are implemented as classes. They are only passed by value. They have getter and setter methods that are named after the field. They can have either a manual or an automatic life cycle, more on that later. Unions cannot be passed in closures.
Pointer
There are a lot of pointer types. What they all have in common is that they can have manual or automatic memory management and that they all point to something. Their address can be retrieved with getPointer().
The basic ones are FloatPointer, DoublePointer and VoidPointer. They are pretty self-explanatory.
CSizedIntPointer is a pointer to a C integer. A CSizedIntPointer has a backing CType that defines the type it points to, e.g. new CSizedIntPointer("int"); or new CSizedIntPointer("char");. This is needed to correctly calculate the size of the CSizedIntPointer on a specific machine. The CType needs to be set to the type the pointer is going to be used as: for an int*, use new CSizedIntPointer("int");. jnigen will do bounds and type checks to ensure correctness.
Every Enum/StackElement has an inner class that is its pointer type.
A StackElement can be converted to a pointer in two ways:
StackElement#asPointer reinterprets the StackElement as a pointer. Every change made through the pointer will be reflected in the StackElement.
Creating a Pointer and calling set copies the StackElement into it.
The same goes for converting back: asStackElement reinterprets the address, and get copies the StackElement.
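The difference between reinterpreting and copying can be pictured with plain `java.nio.ByteBuffer`s. This is only an analogy built from standard Java classes, not the generated jnigen API: `reinterpret` stands in for `asPointer`/`asStackElement` (shared memory) and `copy` stands in for `set`/`get`. The class and method names are made up for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class AliasVersusCopy {
    // Shares the memory of the source, like reinterpreting an address.
    static ByteBuffer reinterpret(ByteBuffer source) {
        // duplicate() resets the byte order, so carry it over explicitly.
        return source.duplicate().order(source.order());
    }

    // Copies the bytes into fresh memory, like set/get.
    static ByteBuffer copy(ByteBuffer source) {
        ByteBuffer target = ByteBuffer.allocateDirect(source.capacity()).order(source.order());
        ByteBuffer view = source.duplicate();
        view.clear();       // read from position 0 up to capacity
        target.put(view);
        target.clear();     // rewind the copy for the caller
        return target;
    }

    public static void main(String[] args) {
        ByteBuffer struct = ByteBuffer.allocateDirect(4).order(ByteOrder.nativeOrder());
        struct.putInt(0, 1);

        ByteBuffer alias = reinterpret(struct);
        ByteBuffer snapshot = copy(struct);

        struct.putInt(0, 42);
        System.out.println(alias.getInt(0));    // sees the change: 42
        System.out.println(snapshot.getInt(0)); // still the old value: 1
    }
}
```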
Lastly, we have PointerPointer<T extends Pointing>. This class is used for every pointer that goes deeper than one layer, like void**. When creating a PointerPointer, you need to pass a supplier that describes how to create the dereferenced pointer object.
For a float** this would look like: new PointerPointer(FloatPointer::new). An int** can be created with: new PointerPointer<>(CSizedIntPointer.pointerPointer("int"));
The API gets a bit cumbersome for a depth of 3+, but this is very rare.
GC
Every C element, except Enums, is bound to dynamic memory management.
Closures always need manual memory management: if you don't need them anymore, you need to call ClosureObject#free.
All others follow this rule: if they are created by Java code and you don't explicitly set them to manual management, they will be freed by the GC. If they come from native code (even if they may have originated in Java code), then they are under manual management and it is your responsibility to free them, if needed.
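One way to picture the two modes is with `java.lang.ref.Cleaner` (Java 9+). This is a sketch and not the PR's implementation: `NativeBlock`, `FreeAction`, and the `freedByGC` flag are hypothetical names, and the native free call is replaced by a counter so the example runs without native code.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

final class NativeBlock {
    private static final Cleaner CLEANER = Cleaner.create();
    // Stand-in for the native free(): counts how many blocks were released.
    static final AtomicInteger FREED = new AtomicInteger();

    // Must not reference the NativeBlock itself, or it could never become unreachable.
    private static final class FreeAction implements Runnable {
        private final long address;

        FreeAction(long address) {
            this.address = address;
        }

        @Override
        public void run() {
            // A real implementation would release the memory at `address` here.
            FREED.incrementAndGet();
        }
    }

    private final long address;
    private final Cleaner.Cleanable cleanable; // null means manual management
    private boolean freed;

    NativeBlock(long address, boolean freedByGC) {
        this.address = address;
        this.cleanable = freedByGC ? CLEANER.register(this, new FreeAction(address)) : null;
    }

    long getPointer() {
        return address;
    }

    // Explicit free; required for manual blocks, optional for GC-managed ones.
    void free() {
        if (freed) {
            throw new IllegalStateException("already freed");
        }
        freed = true;
        if (cleanable != null) {
            cleanable.clean(); // runs FreeAction at most once
        } else {
            FREED.incrementAndGet();
        }
    }
}
```

With `freedByGC` set to true, the block is released once the object becomes unreachable and the cleaner runs; with false, nothing happens until `free()` is called, which matches the rule for objects handed out by native code.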
Exceptions
C++ exceptions are caught and surfaced on the Java side as a CXXException. The native code should be compiled with -fexceptions for this to work properly.
Implementation details
About functions
The native part is implemented like this:
So let's dissect this.
First of all, this design is meant for static linking.
However, one major caveat is that a "jint" is not guaranteed to have the same size as an "int", so a plain downcast may not work. To address this, the generator is supposed to always pick the Java type that is guaranteed to hold the C type in all cases (the generator is therefore supposed to run on a 64-bit machine). On the C side, we introduce a runtime check:
CHECK_AND_THROW_C_TYPE(env, int, test, 0, return 0);
This is a mostly compile-time macro that checks whether a number fits within the bounds of a C type. If the check fails, it throws a Java exception.
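The bounds check itself is easy to picture. The following is a Java rendering of the idea only; the real check lives in the C macro, and `CTypeBounds`, `fits`, and `checkOrThrow` are hypothetical names. It assumes two's-complement integers of a known byte size.

```java
final class CTypeBounds {
    // Returns true if value is representable in a C integer of byteSize bytes.
    static boolean fits(long value, int byteSize, boolean signed) {
        if (byteSize < 1 || byteSize > 8) {
            throw new IllegalArgumentException("Unsupported size: " + byteSize);
        }
        if (byteSize == 8) {
            // A Java long covers every signed 64-bit value. Unsigned 64-bit values
            // above Long.MAX_VALUE cannot be expressed in a long at all, so this
            // sketch simply rejects negative inputs for the unsigned case.
            return signed || value >= 0;
        }
        int bits = byteSize * 8;
        long min = signed ? -(1L << (bits - 1)) : 0L;
        long max = signed ? (1L << (bits - 1)) - 1 : (1L << bits) - 1;
        return value >= min && value <= max;
    }

    // Mirrors the "throw a Java exception on failure" behaviour described above.
    static void checkOrThrow(long value, int byteSize, boolean signed, String cType) {
        if (!fits(value, byteSize, signed)) {
            throw new IllegalArgumentException(
                    "Value " + value + " does not fit into C type " + cType);
        }
    }
}
```

For example, on a platform where `int` is 4 bytes, `CTypeBounds.fits(2147483648L, 4, true)` is false, which is the case the wider Java type on the calling side makes possible in the first place.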
Then we have HANDLE_JAVA_EXCEPTION_START/END(). This is just a C++ try/catch that does three things:
If a closure throws a Java exception, it is converted to a JavaExceptionMarker C++ exception. When we then encounter a JavaExceptionMarker, we set the backing Java exception and swallow the C++ exception. This allows throwing Java exceptions through C code.
If we encounter a std::exception, we create a CXXException based on exception::what().
If we encounter anything else that was thrown, we just report it as an unknown error on the Java side.
If a Java exception is converted to a JavaExceptionMarker, the stack trace has to be read to fill exception::what(). This can be expensive and can be disabled with CHandler.setDisableCXXExceptionMessage(true);
About FFITypes.java
Every binding process will generate an FFITypes class. The purpose of this class is to map CTypes to libFFI types. The generator will emit a compile-time macro for every C type it encounters. This way, the exact size details for the current platform can be retrieved at runtime. FFITypes also handles struct FFI types.
About Closures
Closures are implemented with libFFI closures. They define a signature that is retrieved from the FFITypes. Below is an example:
A function call from C -> Java works as follows:
The FFI closure does the argument packing and calls callbackHandler in the CHandler class. There we go over all arguments in the args array. If we encounter a number, we copy it. If we encounter a struct, we allocate new memory, copy the struct into it, and put the resulting pointer into the new args array.
Now we allocate a DirectByteBuffer that wraps the pointer and pass it to Java with CHandler#dispatchCallback, where it is unpacked into Java values.
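The Java half of this hand-off could look roughly like the following sketch. The `FloatCallback` interface mirrors the shape of `methodWithCallbackFloatArg` from above, but the `dispatch` helper, its signature, and the assumed buffer layout (a single `float` at offset 0, in native byte order) are illustrative assumptions, not the actual `CHandler#dispatchCallback` code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical callback type taking one float argument.
interface FloatCallback {
    void call(float arg0);
}

final class CallbackDispatchSketch {
    // Unpacks the packed argument buffer and invokes the Java callback.
    static void dispatch(FloatCallback callback, ByteBuffer packedArgs) {
        // Native code writes in the platform's byte order, so read it the same way.
        ByteBuffer args = packedArgs.duplicate().order(ByteOrder.nativeOrder());
        callback.call(args.getFloat(0));
    }

    public static void main(String[] unused) {
        // Simulates what the native side would have written into the buffer.
        ByteBuffer packed = ByteBuffer.allocateDirect(4).order(ByteOrder.nativeOrder());
        packed.putFloat(0, 1.5f);
        dispatch(value -> System.out.println("callback received " + value), packed);
    }
}
```

Reading through a direct buffer keeps the number of JNI calls per callback small, since all arguments cross the boundary in one object.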
About CHandler#reExportSymbolsGlobally
On Unix, System#load calls dlopen to open a shared library. On Linux, this defaults to RTLD_LOCAL, on macOS to RTLD_GLOBAL. To make symbol resolution easier when opening dependent shared libraries, we explicitly re-export the symbols on Linux with RTLD_GLOBAL.
That is all for the moment; if I have forgotten something, I will append it. If there are any questions left, please ask!
Outstanding tasks
[x] Add CI builds for all targets
[ ] Test on 32-bit machines and other platforms
[ ] Test it by binding actual libraries
[ ] Variadic functions are unsupported
[ ] Performance tests