snowkit / hxcpp-guide

A guide for the Haxe CPP target build systems, CFFI and APIs

CFFI / CFFI Prime : everything I know #1

Open larsiusprime opened 8 years ago

larsiusprime commented 8 years ago

This is a scratch pad issue. Just a brain dump of everything I know. Organization comes later.

There are three basic methods for getting Haxe to directly interoperate with C++.

  1. CFFI (legacy)
  2. CFFI Prime
  3. C++ Externs

See this answer for C++ externs: https://stackoverflow.com/questions/34168457/creating-a-haxe-hxcpp-wrapper-for-a-c-library

and LINC is a good example: http://snowkit.github.io/linc/#native-extern-tips

C++ Externs are probably the easiest and simplest thing to set up; the disadvantage is that you are limited to C++ targets. If that's not a problem for you, you probably want C++ Externs. If you need other targets like neko and NodeJS, you might consider CFFI / CFFI Prime.


CFFI (legacy) and CFFI Prime are ways to get native C++ code to talk directly to the Haxe layer of your code, on the C++ target (and neko, and perhaps other targets as well).

Both are used in the SteamWrap library, if you want a practical example: https://github.com/larsiusprime/steamwrap

Today we'll discuss CFFI and CFFI PRIME

Both of these are C Foreign Function Interfaces that allow the C++ and Haxe sides to communicate. The old version of CFFI (referred to as just plain "CFFI" from here on) differs from CFFI PRIME in that it is slower and relies on boxed dynamic typing. CFFI PRIME's advantage is that it does not rely on boxing values -- you can take the same value type and pass it transparently between C++ and Haxe; this is not only faster but also avoids extra allocations, which can cause GC activity. Also, for non-C++ targets such as neko, CFFI PRIME is backwards-compatible and transparently degrades into the legacy CFFI behavior, so for a large subset of cases CFFI PRIME is probably better. In practice there are some cases where you still might want to use regular CFFI, but otherwise CFFI PRIME is recommended wherever it will work.

CFFI PRIME builds on top of the concepts of CFFI, so we'll discuss CFFI first.

1. CFFI

So say you have this Haxe class:

class Foo
{
    public var ForeignFunction:Dynamic;
    public function new() {}
}

Right now, calling ForeignFunction() on a Foo instance does nothing useful: the field is never assigned, so it is null and the call will crash. Instead, we would like to connect it to a C++ function.

So over in C++ land we create another file:

#include <hx/CFFI.h>

class Foreign
{
    value CPP_ForeignFunction(value haxeVal)
    {
        int intVal = val_int(haxeVal);
        printf("CPP: intVal = %d\n",intVal);
        intVal++;
        value returnVal = alloc_int(intVal);
        return returnVal;
    }
    DEFINE_PRIM(CPP_ForeignFunction, 1);
}

This function will receive an int, print out its value, add one to it, and return it.

Let me explain the code.

First, we're including the HXCPP header file for the CFFI class. Next, we're defining our C++ function for Haxe to talk to.

value CPP_ForeignFunction(value haxeVal)

You'll notice that the return type here is "value", and it also takes 1 parameter of type "value", and "value" is not a native C++ type. This is a special CFFI type that can (basically) hold anything, it's like a Haxe Dynamic. Think of this as a box that has to be unwrapped before you can use it in C++, and has to have something put into it before you can hand it back to Haxe.

CFFI comes with several functions for boxing and unboxing "value" types:

PLEASE CORRECT IF WRONG: val_int val_string val_object val_bool val_float (etc)

There are also several utility functions for checking if a value type is one of these types underneath, since calling val_int() on an underlying string could cause issues.

PLEASE CORRECT IF WRONG: val_is_int val_is_string val_is_object val_is_bool val_is_float (etc)
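
To make the "check before you unbox" idea concrete, here is a tiny standalone C++ model of a boxed value. It does not use hxcpp at all: Box, boxInt, boxString, boxIsInt and unboxInt are hypothetical stand-ins for value, alloc_int, alloc_string, val_is_int and val_int, and the real types are certainly implemented differently. Treat it as a sketch of the pattern only.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a boxed value; NOT how hxcpp implements `value`.
enum class BoxKind { Int, String };

struct Box {
    BoxKind kind;
    int intValue;
    std::string stringValue;
};

// Stand-ins for alloc_int / alloc_string: put a native value into a box.
Box boxInt(int v) { return Box{BoxKind::Int, v, ""}; }
Box boxString(const std::string &s) { return Box{BoxKind::String, 0, s}; }

// Stand-in for val_is_int: inspect the box without unwrapping it.
bool boxIsInt(const Box &b) { return b.kind == BoxKind::Int; }

// Stand-in for val_int: only meaningful when the box really holds an int.
int unboxInt(const Box &b) {
    assert(boxIsInt(b));
    return b.intValue;
}

// Same shape as the CFFI function above: check, unbox, work, box again.
// Returning -1 for a wrong type is an arbitrary choice for this sketch.
Box addOneChecked(const Box &in) {
    if (!boxIsInt(in)) {
        return boxInt(-1);
    }
    return boxInt(unboxInt(in) + 1);
}
```

Passing boxInt(1) yields a box holding 2, while passing boxString("x") takes the guarded path instead of misreading the contents.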

So when you want to get a value out of your boxed types from haxe, you do:

int intVal = val_int(haxeVal);

And when you want to pass something back, you call one of these functions to box it back up:

PLEASE CORRECT IF WRONG: alloc_int alloc_bool alloc_string alloc_object alloc_float (etc)

    value returnVal = alloc_int(intVal);
    return returnVal;

I split that into two lines so you can clearly see that the type of the returnVal here is "value". Even if your original C++ type was an int, or a string, or a bool, it all goes back to Haxe as a "value."

Finally, look at this line at the bottom of the C++ function:

DEFINE_PRIM(CPP_ForeignFunction, 1);

That defines a "primitive" for Haxe to be able to recognize. You pass it the function name (NOTE: not a string representation of the identifier, just the identifier itself) and how many parameters it should receive from Haxe. In this case, it will receive one parameter.

Okay, great, we set up this function that's totally ready to talk to Haxe, but we still haven't bridged the gap.

Let's go back to our Haxe class:

class Foo
{
    public var ForeignFunction:Dynamic;
    public function new() {}
}

We add a few things:

package foo;
import cpp.Lib;

class Foo
{
    public var ForeignFunction:Dynamic;
    public function new()
    {
        ForeignFunction = cpp.Lib.load("foo", "CPP_ForeignFunction",1);
    }
}

We've done three things here --

  1. Add a package to our class (NOTE: someone tell me why/if this is necessary)
  2. Import cpp.Lib from the Haxe standard library
  3. In the constructor, load the C++ function into our "ForeignFunction" member variable

This is the meat of it:

        ForeignFunction = cpp.Lib.load("foo", "CPP_ForeignFunction",1);

The cpp.Lib.load command takes three things --

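  1. The name of the native library to load (the compiled .ndll, without the extension), not your class' package name; here the two just happen to both be "foo"
  2. The name of your function on the C++ side, exactly as passed to DEFINE_PRIM
  3. The number of parameters the function takes, matching the count given to DEFINE_PRIM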
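
A note on why the loader wants both a name and a parameter count: my understanding (an assumption, I am not quoting the hxcpp header here) is that DEFINE_PRIM exports a C-linkage accessor whose symbol is the function name with the count appended, and the loader looks that symbol up. The standalone sketch below imitates that mechanism. DEMO_DEFINE_PRIM and callThroughAccessor are made-up names, and int stands in for value so that it compiles without hxcpp.

```cpp
#include <cassert>

// Stand-in for a primitive; the real one would take and return `value`.
int CPP_ForeignFunction(int v) { return v + 1; }

// Hypothetical demo macro modelled on my reading of DEFINE_PRIM: it pastes
// the argument count onto the function name and exports, with C linkage,
// an accessor that returns the function's address.
#define DEMO_DEFINE_PRIM(func, nargs) \
    extern "C" void *func##__##nargs() { return reinterpret_cast<void *>(&func); }

// Expands to an accessor named CPP_ForeignFunction__1.
DEMO_DEFINE_PRIM(CPP_ForeignFunction, 1)

// What a loader would do after finding the "name__count" symbol:
// call the accessor, cast the result back, then invoke the function.
int callThroughAccessor(int arg) {
    typedef int (*Prim1)(int);
    Prim1 fn = reinterpret_cast<Prim1>(CPP_ForeignFunction__1());
    assert(fn != nullptr);
    return fn(arg);
}
```

If that reading is right, asking for the same name with a different count would look for a symbol that was never exported, which is why the count on the Haxe side has to agree with the C++ side.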

And that's it! You're done. Now you can do this:


var myFoo = new Foo();
var result = myFoo.ForeignFunction(1);     //prints "CPP: intVal = 1"
trace(result);                             //outputs "2"

THINGS THAT SHOULD BE CONFIRMED/ADDED/BLAH BLAH:

RANDOM NOTE: Strings

Strings are something that are important not to get confused about. There are at least three types of strings:

I've noticed that strings are often passed back to Haxe from CFFI in this manner:

// from a string constant
alloc_string("STRING CONSTANT");

// from an existing C string
const char * myCString = getCStringSomehow();
alloc_string(myCString);

// from a string built up with a std::ostringstream
std::ostringstream myStream;
myStream << "DO " << "SOME " << "BUSINESS " << "LOGIC " << someVal << otherVal;
return alloc_string(myStream.str().c_str());

So it seems that, with CFFI at least, you pass back a const char * rather than a std::string or std::ostringstream.
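
One thing worth spelling out about the ostringstream pattern: str() returns a temporary std::string, so the pointer from c_str() is only good until the end of that statement. The pattern is safe only if the receiving function copies the characters right away, which I assume alloc_string does. A standalone sketch, with copyCString as a hypothetical stand-in for alloc_string and buildMessage as an invented helper:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for alloc_string: copies the characters immediately,
// so it does not matter that the source buffer disappears afterwards.
std::string copyCString(const char *s) {
    assert(s != nullptr);
    return std::string(s);
}

std::string buildMessage(int someVal, int otherVal) {
    std::ostringstream stream;
    stream << "DO " << "SOME " << "LOGIC " << someVal << otherVal;

    // stream.str() is a temporary; its c_str() pointer is valid only until
    // the end of this full expression, so the copy has to happen here.
    return copyCString(stream.str().c_str());
}
```

Storing the const char * in a variable first and using it on a later line would leave a dangling pointer, so keep the str().c_str() call inside the same expression as the function that consumes it.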

2. CFFI PRIME

CFFI PRIME works very much like CFFI except that it has performance benefits and can communicate more directly because it doesn't have to box its values. In practice this means there are some trickier edge cases -- I've generally found CFFI to be a bit more flexible. The "E" in "PRIME" doesn't stand for anything, as far as I can tell it's a pun on the "DEFINE_PRIM" call, indicating that CFFI PRIME is better ;P

So back to our previous example, we had this in Haxe:

package foo;
import cpp.Lib;

class Foo
{
    public var ForeignFunction:Dynamic;
    public function new()
    {
        ForeignFunction = cpp.Lib.load("foo", "CPP_ForeignFunction",1);
    }
}

and this in C++:

#include <hx/CFFI.h>

class Foreign
{
    value CPP_ForeignFunction(value haxeVal)
    {
        int intVal = val_int(haxeVal);
        printf("CPP: intVal = %d\n",intVal);
        intVal++;
        value returnVal = alloc_int(intVal);
        return returnVal;
    }
    DEFINE_PRIM(CPP_ForeignFunction, 1);
}

Let's start by changing some stuff in Haxe:

package foo;
import foo.helpers.Loader;

class Foo
{
    public var ForeignFunction = Loader.load("CPP_ForeignFunction","ii");
    public function new(){}
}

This is a lot simpler to write. Instead of typing ForeignFunction as Dynamic, and loading it up at runtime, here we load it up at compile time, and I believe it will wind up as strongly typed, to boot.

Similarly to CFFI, you give the name of the C++ side function you want, but instead of providing the number of parameters, you provide the number and the type encoded in a special string:

public var ForeignFunction = Loader.load("CPP_ForeignFunction","ii");

So "ii" here means -- takes an int, returns an int. CFFI doesn't specify anything about the return type, but CFFI Prime does, and requires explicit type information.

You build your type string one character per type: the argument types in order, followed by the return type.

So if you have a function that takes 3 integers and returns 1 float, it would be "iiif"; one that takes a float, an integer, and a string, and returns an object, would be "fico" (assuming you are passing the string to C++ as a const char *).
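
Here is a small standalone C++ sketch that decodes such a signature string, just to pin down the "arguments first, return type last" rule. typeNameFor and describeSignature are invented for illustration. The codes i, f, c and o come from the examples above; d, b, s and v are from my memory of the format and should be treated as assumptions until someone checks them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Map one signature character to a readable type name.
std::string typeNameFor(char code) {
    switch (code) {
        case 'i': return "int";
        case 'f': return "float";
        case 'c': return "const char *";
        case 'o': return "object";
        case 'd': return "double";  // assumption, not from the post
        case 'b': return "bool";    // assumption, not from the post
        case 's': return "string";  // assumption, not from the post
        case 'v': return "void";    // assumption, not from the post
        default:  return "?";
    }
}

// Every character except the last is an argument; the last is the return type.
std::string describeSignature(const std::string &sig) {
    assert(!sig.empty());
    std::string out = "(";
    for (std::size_t i = 0; i + 1 < sig.size(); ++i) {
        if (i > 0) {
            out += ", ";
        }
        out += typeNameFor(sig[i]);
    }
    out += ") -> ";
    out += typeNameFor(sig[sig.size() - 1]);
    return out;
}
```

With this, describeSignature("iiif") gives "(int, int, int) -> float" and describeSignature("fico") gives "(float, int, const char *) -> object".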

(this is apparently a "java style signature or something?" someone else can fill this in)

Of course, this requires the help of a macro class ("Loader"), which we'll define here:

package foo.helpers;

#if macro
import haxe.macro.Expr;
#end

class Loader
{
   #if cpp
   public static function __init__()
   {
      cpp.Lib.pushDllSearchPath( "" + cpp.Lib.getBinDirectory() );
      cpp.Lib.pushDllSearchPath( "ndll/" + cpp.Lib.getBinDirectory() );
      cpp.Lib.pushDllSearchPath( "project/ndll/" + cpp.Lib.getBinDirectory() );
   }
   #end

   public static inline macro function load(inName2:Expr, inSig:Expr)
   {
      return macro cpp.Prime.load("foo", $inName2, $inSig, false);
   }
}

(It seems to me that this sort of macro should be standard; it seems a little cumbersome to have to provide it yourself. Also, I'm not sure what the syntax is without the helper macro. Perhaps it's just hooked too deeply into the build process to be of non-framework-specific use; anyway, I'll let someone else fill this detail in.)

Now let's go to the C++ side and make some changes:

#include <hx/CFFIPrime.h>

class Foreign
{
    int CPP_ForeignFunction(int intVal)
    {
        printf("CPP: intVal = %d\n",intVal);
        intVal++;
        return intVal;
    }
    DEFINE_PRIME1(CPP_ForeignFunction);
}

First of all, we changed the include to CFFIPrime.h. Second, we removed the "value" type and exchanged for the strong types -- in this case int for both the return and argument value. This must match the signature supplied on the Haxe side or you will get a compile-time error (CFFI did not enforce this, so you could send an int, and then try to read it as a string, and possibly cause run time errors).

Notice we do not have to unbox anything. We can just use the value directly. Also, we don't need to box anything up to return it -- we just pass it right back to Haxe.

The "DEFINE_PRIM" has now become "DEFINE_PRIME1" -- instead of putting the number of parameters received as an argument to "DEFINE_PRIME()" you instead call a variant of "DEFINE_PRIME" that includes the number of parameters, in this case "DEFINE_PRIME1()".

NOTE: I believe if you have a CFFI PRIME function that returns void you need to add a "v" to the end of "DEFINE_PRIME" so, "DEFINE_PRIME1v" for "take one parameter, return void" -- this needs checking.
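
Putting the naming rule into code, as a standalone sketch: the macro name is "DEFINE_PRIME" plus the argument count, plus "v" if the function returns void. primeMacroFor is a made-up helper that works from the Haxe-side signature string; the "v" suffix rule is the unchecked belief from the note above, and using 'v' as the signature code for void is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Derive the DEFINE_PRIME variant from a signature string such as "ii".
// The last character is the return type, so the argument count is size - 1.
std::string primeMacroFor(const std::string &sig) {
    assert(!sig.empty());
    const std::size_t argCount = sig.size() - 1;
    std::string name = "DEFINE_PRIME" + std::to_string(argCount);
    if (sig[sig.size() - 1] == 'v') {  // assumed code for a void return
        name += "v";
    }
    return name;
}
```

So "ii" maps to DEFINE_PRIME1, and a one-argument function returning void ("iv" under the assumption above) maps to DEFINE_PRIME1v.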

Now we go back to Haxe:

var myFoo = new Foo();
var result = myFoo.ForeignFunction.call(1);     //prints "CPP: intVal = 1"
trace(result);                                 //outputs "2"

The only difference in this case is you need to add ".call()" to invoke your CFFI PRIME function. (I think there's a way to avoid this with an abstract or a macro but I don't know how yet.)

THINGS I NOTICED BUT DON'T TOTALLY UNDERSTAND ABOUT CFFI PRIME YET:

ghost commented 8 years ago

Good job! Thank you very much.

jcward commented 8 years ago

Hmm, I was trying to realize your simple CFFI example, and building it has me tripped up. How do you actually compile the cpp, and how do you link the output of the cpp step with the output of the Haxe compile? I suppose I'll have to dig into your steamwrap example.

ruby0x1 commented 8 years ago

Probably built using a Build.xml file

jcward commented 8 years ago

Thanks for the link, and good thing, 'cause my memory of us talking about that is fading. :)

Hmm, I get the same error message using a Build.xml: http://hastebin.com/kehetatili.avrasm

Ok, so I got this compiling with help from the old cffi tutorial -- the example above needed three changes: drop the class wrapper, define IMPLEMENT_API before the include, and wrap the function in extern "C":

#define IMPLEMENT_API
#include <hx/CFFI.h>

extern "C"
{

  value CPP_ForeignFunction(value haxeVal)
  {
    int intVal = val_int(haxeVal);
    printf("CPP: intVal = %d\n",intVal);
    intVal++;
    value returnVal = alloc_int(intVal);
    return returnVal;
  }
  DEFINE_PRIM(CPP_ForeignFunction, 1);

}

jcward commented 8 years ago

Hey guys, I have the CFFI sample working. Are you interested in it? I can put it in this repo if you add me, or in a gist or something?

ruby0x1 commented 8 years ago

Yea sure, a PR could work since it would be easier to review the changes and clean up commits before merging, if you don't mind that workflow?

jcward commented 8 years ago

Works for me. I'll send it later today.

ruby0x1 commented 8 years ago

CFFI is a later chapter btw, so just stick it under work-in-progress/cffi/examples/simple in the meantime. I've been drafting up an index for the guide so we know where things will slot in, but don't have it yet. Once we have the overview we can slot things in better.

larsiusprime commented 8 years ago

@jcward: is anything needed to update / correct my brain-dump post on CFFI stuff? The build process was the bit I was unclear about. Once it's no longer missing anything and has been properly fact-checked it can be turned into a proper chapter or whatever.

ruby0x1 commented 8 years ago

Yea, it's fine to have all the content in the work in progress folder and we'll collate it as needed when that's better defined.

jcward commented 8 years ago

Ok, left a PR #5 -- lots of items to review but it's a start. Cheers.

ruby0x1 commented 7 years ago

I've stubbed in the CFFI folder and wrapped up the build parts so long. Ideally what would happen for the CFFI examples is that they build on top of the groundwork of the build section. This section goes through how to invoke a build, make a build.xml file, specify targets, build statically etc as well as configure the bin paths, and naming options typically found in ndlls.

From there, it's a short step to making the dynamic link explicit: the c++ code part gets changed out, and a hx file and build.hxml are added in two parts. It should ideally also follow the convention of project/ ndll/ and the haxe code in its package folder.

If nobody gets to it by the time I'm done with my current stuff (cpp, cppia, haxe->dll example), I'll get around to taking what Lars has written into the files and restructuring it, but all I'm doing now is noting that the cffi space is carved out, the build section is rounded out, and the existing examples are a stone's throw from a simple cffi prime example.

As you may notice in the md files, it seems less ideal to put cffi legacy stuff as the main workflow, we should prioritize the prime stuff since that's the more efficient and modern approach. Any thoughts on that?