sinkingsugar / fragments

Collection of pure Nim utilities
MIT License

ffi: comparison with Calypso, feature set, limitations? #1

Open timotheecour opened 6 years ago

timotheecour commented 6 years ago

Just heard of fragments through https://t.me/nim_lang; I'm curious to what extent it fulfills what I was looking for in https://github.com/nim-lang/Nim/issues/8327?

design discussions

scratch below here

minor

jwollen commented 6 years ago

Hi! Thanks for the feedback. This is still very much work in progress and grows based on our own use cases. We will improve comments and code quality much more!

We also used libclang quite a bit to generate wrappers before. For simple C APIs that's fine. The major downside is that each codebase needs some tweaking, and dealing with #if/#ifdef blocks is sometimes just not possible.

what limitations are there in terms of understanding/parsing C++ code? what limitations are there in terms of mapping C++ concepts to nim?

Code is only emitted, not parsed, so it's mostly a matter of coming up with a Nim syntax. A downside is the lack of linting/auto-completion in Nim. For diagnostics we have to rely on the C++ compiler.

can it handle C++ stdlib? maybe show an example that would wrap std::vector (note: should provide a view over data, not a mere copy)?

In general the stdlib will work fine. Templates are still a bit cumbersome and instantiations have to be declared manually. I'd love to end up with something like type IntVec = std/vector[int] though. There are some things that will need a bit more thought too, like bridging Nim/C++ allocators, threading etc.

can it handle complex things like wrapping opencv?

We were working with a few big code bases, including LLVM, but there is still a lot of manual work involved. The main obstacles right now would be templates and inheriting from C++ types.

We would like to get to a point where namespaces, types, enum-members, etc. can all be accessed through a syntax like My/Sub/Namespace.MyEnum.SomeValue. Currently this has to be done in defineCppType for each type, has to explicitly name headers, etc.

what overhead is there compared to directly calling a C++ function/constructor, with and without compiler optimizations?

There is no overhead. All calls directly emit C++ code using {.importcpp.}. There are no wrapper types or functions.
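As a rough illustration of what "directly emit C++ code" means, here is a minimal sketch using plain {.importcpp.} bindings from the Nim manual rather than the fragments API. The names StdString, initStdString, size and cStr are made up for this example. It has to be compiled with the C++ backend (nim cpp).

```nim
# Compile with the C++ backend: nim cpp -r direct_call.nim
# Plain {.importcpp.} bindings for illustration only; not the fragments API.

type
  StdString {.importcpp: "std::string", header: "<string>".} = object

# `@` expands to the argument list, so a call emits `std::string(s)`.
proc initStdString(s: cstring): StdString {.
  importcpp: "std::string(@)", header: "<string>", constructor.}

# `#` expands to the next argument, so a call emits `s.size()`.
proc size(s: StdString): csize_t {.importcpp: "#.size()", header: "<string>".}

# The cast drops `const`, because Nim's cstring maps to a plain `char *`.
proc cStr(s: StdString): cstring {.
  importcpp: "((char *)#.c_str())", header: "<string>".}

let s = initStdString("fragments")
echo s.size()
echo s.cStr()
```

Each pattern is pasted into the generated C++ at the call site, so no Nim-side wrapper function or proxy object sits between the caller and the C++ method.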

support for C++ templates [...] without needing cpp source to instantiate type

Templates are not yet supported. For our use cases we imported template classes/functions manually so far. This is very possible though and mostly about coming up with a nice syntax that looks like Nim generics.
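For reference, importing a template class by hand with plain Nim generics can look like the sketch below. It follows the std::map example in the Nim manual and is not the fragments syntax. StdVector, pushBack and IntVec are names chosen for this example.

```nim
# Compile with the C++ backend: nim cpp -r vector_example.nim
# Manual import of a C++ template as a Nim generic; illustrative names only.

type
  StdVector[T] {.importcpp: "std::vector", header: "<vector>".} = object

proc pushBack[T](v: var StdVector[T]; x: T) {.
  importcpp: "#.push_back(#)", header: "<vector>".}
proc len[T](v: StdVector[T]): csize_t {.
  importcpp: "#.size()", header: "<vector>".}
proc `[]`[T](v: StdVector[T]; i: csize_t): T {.
  importcpp: "#[#]", header: "<vector>".}

# The instantiation is still spelled out as a concrete Nim type.
type IntVec = StdVector[cint]

var v: IntVec            # emitted as `std::vector<int> v;`
v.pushBack(cint(10))
v.pushBack(cint(20))
echo v.len
echo v[csize_t(1)]
```

Here `[]` returns the element by value, so this is a copy and not the view over the data asked about earlier in the thread.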

catch / rethrowing C++ exceptions?

When using nim cpp, Nim exceptions translate to C++ exceptions, so this would be trivial to implement!
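A hedged sketch of what this can look like with the importcpp exception support described in the Nim manual (available in recent Nim versions, and only with nim cpp). The type and proc names below mirror the manual's example and are not part of fragments.

```nim
# Compile with the C++ backend: nim cpp -r exceptions_example.nim
# Raising and catching an imported C++ exception type from Nim.

type
  CStdException {.importcpp: "std::exception", header: "<exception>",
                  inheritable.} = object
  # std::runtime_error has no default constructor, hence `requiresInit`.
  CRuntimeError {.requiresInit, importcpp: "std::runtime_error",
                  header: "<stdexcept>".} = object of CStdException

proc what(e: CStdException): cstring {.importcpp: "((char *)#.what())".}
proc initRuntimeError(msg: cstring): CRuntimeError {.
  importcpp: "std::runtime_error(@)", constructor.}

proc demo(): string =
  try:
    raise initRuntimeError("boom")
  except CRuntimeError as e:
    # Copy the message into a Nim string while the exception is still alive.
    result = $e.what()

echo demo()
```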

why not split out ffi to its own nimble package? would keep issue tracking etc cleaner

We would love to turn the things in this repo into standalone/stdlib modules once they are more mature!

explicit enumeration for different number of arguments

We had some issues with varargs[untyped] here I believe. We will revisit it!

.to(void) => ugly

This will stay necessary for now. All "dynamic" method calls return a Nim type that doesn't correspond to a C++ type. Without `to`, the compiler would emit a local variable with that "fake" type, and even discarding the result doesn't help. This can maybe be fixed in the future by returning a different type from each call, imported with some decltype magic. That would also allow assigning those return values without `to`, e.g.

var local = myCppObject.someMethod()
echo local.to(cint)

converter toShort*(co: CppProxy): int16 {.used, importcpp:"(#)".} + friends => convertTo(int16)

I remember issues with generic converters...

timotheecour commented 6 years ago

thanks for all the answers!

I remember issues with generic converters...

Templates are not yet supported. For our use cases we imported template classes/functions manually so far. This is very possible though and mostly about coming up with a nice syntax that looks like Nim generics.

Here's how Calypso handles mapping C++ templates to D templates (example, for std::vector): https://github.com/Syniurge/Calypso/blob/master/tests/calypso/libstdc%2B%2B/vector.d, e.g. `auto v1 = vector!char;`. I really hope we'd likewise be allowed to use `let v1 = vector[char]()` or `let v1 = vector(char)` in Nim.

in module use_opencv.nim, ie user code

from opencv import nil
let val = opencv.My::Sub::Namespace::MyEnum.SomeValue

- [ ] NOTE: Nim's operator precedence could perhaps be tweaked to support this use case, so that `::` has highest precedence (higher than `.`)

jwollen commented 6 years ago

Some template magic in the emitted C++ should be able to take care of distinguishing between fields, types, etc. Generics and subtypes don't look too hard.

I didn't have a good idea for namespaces yet though. :: is not a valid operator and the ./.() operators are the only ones with custom matching rules. (I would prefer std.vector[int](5) anyway and unify ::, . and -> into dots).

When using a single operator for both, distinguishing on the C++ side is tricky though. While it's probably possible to template the emitted code depending on whether the right side is a type, instance/static field, etc., I didn't come up with a way to "overload" the emitted code for namespaces yet...

sinkingsugar commented 6 years ago

First of all thank you for this motivating discussion.

The drive behind this C++ FFI has always been productivity, at the price of precision and correctness: linting is suboptimal (jsffi has the same issue), the proper Nim types have to be specified by hand, and templates have to be qualified as a concrete type. Yet for me those downsides were extremely minimal. Being able to completely skip wrapping, binding, etc. a library is priceless. Another priceless thing is that it is a pure Nim module, so only the real dependency has to be dealt with.

Like @jwollen said, we experimented with libclang a lot, but the results were not optimal. C++ is very hard, and some complex libraries, LLVM itself for example, use templates in every possible way. Properly supporting everything carries too high a maintenance cost.

In the future I see us trying to add more compile-time magic, especially keeping in mind that C++ has compile-time magic as well. Smartly combining Nim and C++ metaprogramming might reach the best results with a minimal time investment compared to maintaining a full LLVM compiler pass.

template usage example

defineCppType(ATensors, "std::vector<at::Tensor>", "vector")

Basically, we qualify each template instantiation as a concrete Nim type. Ideally we would use Nim templates for this, but for now this works nicely.

can full support of complex C++ libraries be done without nim-lang/Nim#7449 ?

As soon as we implement a namespace macro/template yes, without any particular issue at all.