bytedeco / javacpp

The missing bridge between Java and native C++

Improve Parser: Use the Clang API #51

Open Arcnor opened 8 years ago

Arcnor commented 8 years ago

I'm in the process of doing this right now. Currently, the following issues exist with the approach I'm taking:

saudet commented 8 years ago

BTW, we probably want to use the C++ API of Clang for this. It is not currently mapped by the presets, so as initial work, we would either have to:

  1. Code the new parser temporarily in C++, or
  2. Create the presets for the C++ API of Clang, using the current Parser.

Either way is fine with me. Thanks for your interest in this project and let me know how I can help!

Arcnor commented 8 years ago

Sorry, I haven't had the time to work on this lately.

My final changes allowed me to parse a lot, but some things missing from the C++ API prevented me from finishing it IIRC, so I think option 2 should help us get there (or, as you said, code it in C++, but it's non-trivial :D)

Arcnor commented 8 years ago

I've continued a bit on this, and for now, I've decided to write the Clang bindings manually, as I think the surface we need from Clang is not that big (I might be wrong, though).

Once we have a working parser, we can generate proper bindings for Clang itself and use them in the generator, closing the circle ;)

I have some doubts about how to implement some of the bindings, though, so I'll hit the forums in a few days with my questions :)

saudet commented 6 years ago

@Arcnor Any progress with this?

libclang seems to be getting pretty good for that sort of thing, for example: https://github.com/rust-lang-nursery/rust-bindgen https://rust-lang-nursery.github.io/rust-bindgen/ So maybe we won't need to use the C++ API after all...

Arcnor commented 6 years ago

Hi Samuel,

No, unfortunately I haven't had the time to continue, not enough incentive for me to do so right now (the project(s) that were using JavaCPP all stopped for one reason or another, mostly priorities).

So libclang is getting better, eh? That's great news! I've had a quick look at the API again, and it seems to contain some goodies I don't remember from 2 years ago, so yeah, maybe now it's enough for our purposes.

If they kept the same names on the AST I might even be able to reuse some of the code I made years ago that parsed the unstable AST output of Clang (https://github.com/Arcnor/objc2robovm/blob/master/src/main/java/com/arcnor/objcclang/parser/CLangHandler.java for example).

Anyway, unless somebody else is working on this, I'll try to give it another look if it doesn't look too complex to interact with it, as time is limited :).

saudet commented 6 years ago

Looks good! No one else is looking into this AFAIK, so please do continue to check it out! The C API is pretty stable BTW. Thanks

Arcnor commented 6 years ago

I'm going to need some help to generate the bindings it seems. I'll put my question(s) here as they are related, but if you need me to use the forum I'll go there instead:

The bindings have the following code:

typedef struct CXVirtualFileOverlayImpl *CXVirtualFileOverlay;

CXVirtualFileOverlay clang_VirtualFileOverlay_create(unsigned options);
...

How can I rename CXVirtualFileOverlayImpl to CXVirtualFileOverlay? I've tried with javaNames but that's obviously not for that.

Arcnor commented 6 years ago

So besides that small problem, the whole API seems to work (well, at least compile) with very few manual mappings, which is cool.

I'll take more time tomorrow to actually figure out if the stuff I couldn't do ~2 years ago is now possible :).

saudet commented 6 years ago

Sounds good! BTW, the bindings for libclang are already available here: https://github.com/bytedeco/javacpp-presets/blob/master/llvm/src/main/java/org/bytedeco/javacpp/clang.java AFAIK, we just need to use those.

If there's anything to fix about those though, please send pull requests against the presets config: https://github.com/bytedeco/javacpp-presets/blob/master/llvm/src/main/java/org/bytedeco/javacpp/presets/clang.java Thanks!!

Arcnor commented 6 years ago

Ahh, I didn't realize that. Thanks, I'll start using those.

The only changes I'd like are generating them against 5.0.0, if they're not already, and making some functions return String instead of BytePointer. I'll open a PR later when I get the time.

saudet commented 6 years ago

They already work with LLVM 5.0.0 yes: https://github.com/bytedeco/javacpp-presets/tree/master/llvm

All const char * should already get mapped to String as well as BytePointer, but if there are char * that should also be mapped to String, yes please, do let me know! Thanks

Arcnor commented 6 years ago

They get converted to both, yeah. Except in the return value. I changed all return values to be String instead of BytePointer.

The most obvious use is for the getCString method to convert a CXString, but I'm sure there are others.

saudet commented 6 years ago

Right, the problem with return values is that when we need a Pointer, we can't get one from the String, but we can get a String from a BytePointer with getString()...

Arcnor commented 6 years ago

Yeah, I understand, I'm just hoping that all "const char*" returns are for Strings. Of course, I might be wrong, but it should be more or less straightforward to check, I'll do that later.

saudet commented 6 years ago

If not, we can add to CXString a helper String getString() { return getCString().getString(); } method :)

saudet commented 6 years ago

Actually, after calling clang_getCString() we need to call clang_disposeString(), so simply returning a String isn't that convenient. I've added the helper function I talked about in the commit above.
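
The read-then-dispose pattern described here can be sketched without libclang. In the sketch below, NativeString is a hypothetical stand-in for CXString, and cString() and dispose() stand in for clang_getCString() and clang_disposeString(). None of these names come from the presets, so this only illustrates the ordering (copy first, release second), not the actual helper.

```java
// Hypothetical stand-in for a native string handle such as CXString.
// cString() plays the role of clang_getCString(), dispose() that of clang_disposeString().
class NativeString {
    private String value;      // stands in for memory owned by the native library
    private boolean disposed;

    NativeString(String value) { this.value = value; }

    String cString() {
        if (disposed) throw new IllegalStateException("string already disposed");
        return value;
    }

    void dispose() { disposed = true; value = null; }

    boolean isDisposed() { return disposed; }

    // Copy the contents into a Java String first, then release the native resource.
    // The finally block guarantees the release even if reading fails.
    String getString() {
        try {
            return cString();
        } finally {
            dispose();
        }
    }

    public static void main(String[] args) {
        NativeString s = new NativeString("complex.h");
        System.out.println(s.getString());   // prints "complex.h"
        System.out.println(s.isDisposed());  // prints "true"
    }
}
```

A plain String return value would hide the dispose step from the caller, which is why a helper that does both is more convenient than changing the return type.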

Arcnor commented 6 years ago

Ah, you're right, I didn't read that properly.

Ok, you win this one ;)

Arcnor commented 6 years ago

I've finally checked this properly, and it seems this method was the only one that made sense to have as BytePointer, as it has the dispose. As far as I can see, the others (except clang_EvalResult_getAsStr which also has special disposing) only make sense as String (like CXUnsavedFile.Filename() or clang_getTUResourceUsageName())

Anyway, for now I'll continue as it is, we can always improve things later without many changes.

Arcnor commented 6 years ago

I'm now getting crashes (randomly, like 1 for every 5 executions or so) like this one:

Stack: [0x000070000836e000,0x000070000846e000],  sp=0x000070000846cf30,  free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
j  org.bytedeco.javacpp.Loader.offsetof(Ljava/lang/Class;Ljava/lang/String;)I+11
j  org.bytedeco.javacpp.Pointer.offsetof(Ljava/lang/String;)I+16
j  org.bytedeco.javacpp.Pointer.sizeof()I+24
j  org.bytedeco.javacpp.Pointer$DeallocatorReference.<init>(Lorg/bytedeco/javacpp/Pointer;Lorg/bytedeco/javacpp/Pointer$Deallocator;)V+29
j  org.bytedeco.javacpp.Pointer$NativeDeallocator.<init>(Lorg/bytedeco/javacpp/Pointer;JJ)V+3
j  org.bytedeco.javacpp.Pointer.init(JJJJ)V+44
v  ~StubRoutines::call_stub
V  [libjvm.dylib+0x2ee9aa]
V  [libjvm.dylib+0x325b59]
V  [libjvm.dylib+0x31b166]
C  [libjniclang.dylib+0x1804]  _ZL19JavaCPP_initPointerP7JNIEnv_P8_jobjectPKvxPvPFvS5_E+0x74
C  [libjniclang.dylib+0x13ef2]  Java_org_bytedeco_javacpp_clang_00024CXCursorVisitor_allocate+0x62
j  org.bytedeco.javacpp.clang$CXCursorVisitor.allocate()V+0
j  org.bytedeco.javacpp.clang$CXCursorVisitor.<init>()V+5
j  com.arcnor.javacpp.Main$Visitor.<init>()V+1
j  com.arcnor.javacpp.Main$Visitor.<init>(Lcom/arcnor/javacpp/Main$1;)V+1
j  com.arcnor.javacpp.Main.visit(Lorg/bytedeco/javacpp/clang$CXTranslationUnit;)V+11
j  com.arcnor.javacpp.Main.main([Ljava/lang/String;)V+22
v  ~StubRoutines::call_stub
V  [libjvm.dylib+0x2ee9aa]
V  [libjvm.dylib+0x3257c2]
V  [libjvm.dylib+0x31e539]
C  [java+0x3931]  JavaMain+0x9c4
C  [libsystem_pthread.dylib+0x393b]  _pthread_body+0xb4
C  [libsystem_pthread.dylib+0x3887]  _pthread_body+0x0
C  [libsystem_pthread.dylib+0x308d]  thread_start+0xd
C  0x0000000000000000

Visitor is a class I created that looks exactly like this:

private static class Visitor extends CXCursorVisitor {
  @Override
  public int call(CXCursor cursor, CXCursor parent, CXClientData client_data) {
    return CXChildVisit_Continue;
  }
}

...and I'm just instantiating by calling new Visitor(). I'm not sure if there are any extra considerations to take when instantiating FunctionPointer classes like this one?

saudet commented 6 years ago

Have you disabled "crash recovery"?

∗ In the case of Clang, we might need to disable crash recovery with the LIBCLANG_DISABLE_CRASH_RECOVERY=1 environment variable to prevent clashes with the JVM's own signal handlers.

https://github.com/bytedeco/javacpp-presets/tree/master/llvm
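
The variable has to be in the environment before the JVM that loads libclang starts, since Java offers no portable way to change its own environment. Usually that just means exporting it in the shell. As a rough sketch of doing the same from Java, the code below prepares a child JVM with the variable set. The class name CrashRecoveryLauncher and the main class and classpath passed to it are hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

class CrashRecoveryLauncher {
    // Builds (but does not start) a child JVM whose environment disables
    // libclang's crash recovery, so it does not clash with JVM signal handlers.
    static ProcessBuilder childJvm(String mainClass, String classPath) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> command = Arrays.asList(javaBin, "-cp", classPath, mainClass);
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("LIBCLANG_DISABLE_CRASH_RECOVERY", "1");
        pb.inheritIO(); // forward the child's stdout/stderr to this process
        return pb;
    }

    public static void main(String[] args) {
        // "com.example.Main" and "app.jar" are placeholders for the real program.
        ProcessBuilder pb = childJvm("com.example.Main", "app.jar");
        System.out.println(pb.command());
        System.out.println(pb.environment().get("LIBCLANG_DISABLE_CRASH_RECOVERY")); // prints "1"
        // Call pb.start() to actually launch the child JVM.
    }
}
```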

Arcnor commented 6 years ago

Ahh, nice, will try that. I've had a good run of ~10 without crashes though, so it will be difficult to prove if it worked (unless I get it again :D)

saudet commented 6 years ago

I've added getString() helper methods for CXTUResourceUsageKind and CXEvalResult as well, as per the commit above, but for things like contents and filenames, the encoding can change at runtime. AFAIK there's no pretty way to make String work under those conditions so we might as well just make sure users call BytePointer.getString(). If you have any good ideas though, let me know. Thanks!

saudet commented 6 years ago

Let me know if there's anything else missing from the API that would prevent you from making progress. Thanks!!

Arcnor commented 6 years ago

Sorry for the delay, I read this over Christmas but forgot about it later.

I haven't worked on this much lately, and what I have so far is a separate program that takes a header and tries to parse it using libclang. So far, everything seems to work, and the last thing I did was start working on structure generation.

I'll try to upload it to GitHub at some point, so even if I stop somebody else can continue or use it as reference, although it's Kotlin right now instead of Java.

saudet commented 6 years ago

Sounds great! Thanks

Does it look like we can do all that we need to do with the C API? Or do you already see things that can only be done through the C++ API?

Arcnor commented 6 years ago

For now, it seems we can do everything with this one, although I think I've had to hack a few things but it might just be my memory failing me. I'll let you know as soon as I finish my "demo" (parsing LibTCOD headers is the first thing I do when writing a generator like this. It's C only but it's a good non-trivial example)

Arcnor commented 6 years ago

I need some help with some examples I'm using to stress the API. I hope you don't mind me posting stuff here directly:

This is a contrived example, and it's probably something that is not a good way of doing things, but as it compiles & works correctly when tested, I'm guessing it should be supported on JavaCPP as well.

  struct StructTwo;

  struct StructOne {
     int *ptr2;
     int val1;

     struct StructTwo *ptr3;
  } StructOneVar;

  struct StructTwo {
     int *ptr2;
     int val1;

     struct StructOne *ptr3;
  } StructTwoVar;

  void passStruct(struct StructOne *s1);

What JavaCPP is generating for that right now is:

@Namespace("StructOne") @Name("StructOneVar.ptr2") public static native IntPointer StructOneVar_ptr2(); public static native void StructOneVar_ptr2(IntPointer StructOneVar_ptr2);
@Namespace("StructOne") @Name("StructOneVar.val1") public static native int StructOneVar_val1(); public static native void StructOneVar_val1(int StructOneVar_val1);

@Namespace("StructOne") @Name("StructOneVar.ptr3") public static native StructTwo StructOneVar_ptr3(); public static native void StructOneVar_ptr3(StructTwo StructOneVar_ptr3);

@Namespace("StructTwo") @Name("StructTwoVar.ptr2") public static native IntPointer StructTwoVar_ptr2(); public static native void StructTwoVar_ptr2(IntPointer StructTwoVar_ptr2);
@Namespace("StructTwo") @Name("StructTwoVar.val1") public static native int StructTwoVar_val1(); public static native void StructTwoVar_val1(int StructTwoVar_val1);

@Namespace("StructTwo") @Name("StructTwoVar.ptr3") public static native StructOne StructTwoVar_ptr3(); public static native void StructTwoVar_ptr3(StructOne StructTwoVar_ptr3);

public static native void passStruct(StructOne s1);

Obviously this won't work, because there is no StructOne or StructTwo defined (because they're not types but anonymous structures), but in C you can still do passStruct(&StructOneVar) and it will compile and work correctly.

Is this something that cannot be modelled in JavaCPP and a workaround must be found, or something we can still support?

Thanks!

saudet commented 6 years ago

No, that should work. It should produce a class definition + a variable declaration for each. That it doesn't means there's a bug somewhere...

saudet commented 6 years ago

@Arcnor How is it looking? Have you encountered any other issues?

Arcnor commented 6 years ago

It's been mainly a general lack of time as always. IIRC I can fully parse a C header now except for a few things:

  1. I still need to solve that question about how anonymous structs with defined instances should be generated.
  2. I can't seem to find a way of getting enough information about C function pointers from the Clang C API (not sure if the C++ API is better, or I'm simply looking in the wrong place). For example, the names of the arguments seem to be lost, and I was fighting to get the real types instead of the resolved ones, but this is all from memory, I need to check again to see what the status was.

saudet commented 6 years ago

Thanks for the insights! Would you have code someplace where I and others could take a look?

Arcnor commented 6 years ago

My code is just a very big hack right now, so I don't want to put that publicly yet :).

I can upload & give you access on Gitlab, are you saudet there?

saudet commented 6 years ago

Don't worry about making it public. JavaCPP is a pretty big hack, and everything else out there is too. :)

In any case, yes, I'm also "saudet" on GitLab. Thanks!

Arcnor commented 6 years ago

Well, you'll see what I mean when you check out the code :)

I've given you access to the repo. Let me know what you think, and if you can help with the issues I'm having, please do! :)

saudet commented 6 years ago

Thanks! As for anonymous structs, there are some instances of that in FFmpeg and OpenCV, for example: https://github.com/bytedeco/javacpp-presets/blob/master/ffmpeg/src/main/java/org/bytedeco/javacpp/avutil.java#L5655

        @Name("data.ptr") public native @Cast("uchar*") BytePointer data_ptr(); public native CvMat data_ptr(BytePointer data_ptr);
        @Name("data.s") public native ShortPointer data_s(); public native CvMat data_s(ShortPointer data_s);
        @Name("data.i") public native IntPointer data_i(); public native CvMat data_i(IntPointer data_i);
        @Name("data.fl") public native FloatPointer data_fl(); public native CvMat data_fl(FloatPointer data_fl);
        @Name("data.db") public native DoublePointer data_db(); public native CvMat data_db(DoublePointer data_db);

https://raw.githubusercontent.com/bytedeco/javacpp-presets/master/opencv/src/main/java/org/bytedeco/javacpp/opencv_core.java

Arcnor commented 6 years ago

Yeah, that works, but only because they don't reference the anonymous struct inside itself, and that's the issue I presented above. If data is of type MyStruct and you add another field of type MyStruct* inside its struct, then the generation is wrong.

So the question is, should we (always?) generate a class for the anonymous struct?

saudet commented 6 years ago

In the example that you gave above, the types are not anonymous, so we can allocate them and they should have a wrapper class, yes. Anonymous structs cannot be allocated independently because they have no name.

Arcnor commented 6 years ago

I'm probably using the wrong terminology, yeah. So this requires a "wrapper" to be created, fine.

Then the next question is, what should this wrapper be named? I'm just asking because the type of struct M {} is not M, but struct M, and there can exist another type called M that is a completely different thing. Should we prefix all structs with Struct for example? (I'd normally check JavaCPP for current behavior, but as I said, right now no classes are being generated in such cases)

saudet commented 6 years ago

JavaCPP is following C++, so just "M", and when there is a conflict, the user can rename it with an Info in the InfoMap.

saudet commented 6 years ago

Hey, I'm finally able to start working on this a bit. Sorry it took so long. I looked at your repository, it's a great start!! Thanks

I've tried to figure out how to get information about the argument names for function pointers, but I haven't found much of anything either. I'm sure there are going to be a lot of cases like that, so we need something general enough to work around those issues. The API gives us a way to get the extents of the declarations from inside the source files themselves. It looks like this:

 val extent = clang_getCursorExtent(cursor)
 val start = clang_getRangeStart(extent)
 val end = clang_getRangeEnd(extent)

 val file1 = CXFile(); val line1 = IntArray(1); val column1 = IntArray(1); val offset1 = IntArray(1)
 val file2 = CXFile(); val line2 = IntArray(1); val column2 = IntArray(1); val offset2 = IntArray(1)
 clang_getSpellingLocation(start, file1, line1, column1, offset1);
 clang_getSpellingLocation(end, file2, line2, column2, offset2);
 println("${clang_getFileName(file1).string} ${line1[0]} ${column1[0]} ${offset1[0]}")
 println("${clang_getFileName(file2).string} ${line2[0]} ${column2[0]} ${offset2[0]}")
 val bytes = Files.readAllBytes(Paths.get(clang_getFileName(file1).string))
 println(String(bytes, offset1[0], offset2[0] - offset1[0]))

And the output looks like this for that function pointer in a typedef:

src/main/resources/complex.h 10 1 97
src/main/resources/complex.h 10 90 186
typedef value_t (*parser_custom_t)(value_t *lex, void *listener, int str, char *propname)
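
The last step of the snippet above, cutting the declaration out of the file by byte offset, needs nothing from libclang. Here is a Java sketch of just that step. The class name ExtentSlice is made up, and the file contents are faked with 97 bytes of padding so that the offsets 97 and 186 from the output above line up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExtentSlice {
    // Returns the source text between two byte offsets, such as the offsets
    // reported by clang_getSpellingLocation() for the start and end of an extent.
    static String slice(byte[] source, int startOffset, int endOffset) {
        return new String(source, startOffset, endOffset - startOffset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String decl = "typedef value_t (*parser_custom_t)"
                + "(value_t *lex, void *listener, int str, char *propname)";
        // Fake header: 97 bytes of padding stand in for whatever precedes the typedef.
        char[] padding = new char[97];
        Arrays.fill(padding, ' ');
        byte[] file = (new String(padding) + decl + ";\n").getBytes(StandardCharsets.UTF_8);

        // The end offset is exclusive, so the trailing ';' is not part of the slice.
        System.out.println(slice(file, 97, 186));
    }
}
```

The offsets are in bytes, not characters, so the slice is taken on the raw byte array before decoding.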

Pretty clean, with all the names and everything. We can then go on parsing that manually, possibly with a regex or in some other ad hoc manner, such as stripping out the "typedef" part and passing that back to Clang as a function prototype. In any case, it shouldn't be too hard in general given that Clang has usually already done most of the hard work at this point. What do you think of this approach, not just for function pointers but for any other corner case that we encounter?
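
As a rough illustration of the regex route, the sketch below pulls the parameter names out of a function pointer typedef like the one printed above. The class name and both patterns are assumptions made up for this example, and they only cover simple cases: unnamed parameters, array parameters, and nested function pointers would need more care.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionPointerTypedef {
    // Matches: typedef <return type> (*<name>)(<parameters>)
    private static final Pattern TYPEDEF = Pattern.compile(
            "typedef\\s+(.+?)\\s*\\(\\s*\\*\\s*(\\w+)\\s*\\)\\s*\\((.*)\\)\\s*;?\\s*");
    // The last identifier of a parameter is assumed to be its name.
    private static final Pattern TRAILING_NAME = Pattern.compile("(\\w+)\\s*$");

    static List<String> parameterNames(String typedef) {
        Matcher m = TYPEDEF.matcher(typedef);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a function pointer typedef: " + typedef);
        }
        List<String> names = new ArrayList<>();
        String params = m.group(3).trim();
        if (params.isEmpty() || params.equals("void")) {
            return names; // no parameters at all
        }
        // Splitting on ',' breaks on nested function pointers; fine for a sketch.
        for (String param : params.split(",")) {
            Matcher n = TRAILING_NAME.matcher(param);
            if (n.find()) {
                names.add(n.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String decl = "typedef value_t (*parser_custom_t)"
                + "(value_t *lex, void *listener, int str, char *propname)";
        System.out.println(parameterNames(decl)); // prints "[lex, listener, str, propname]"
    }
}
```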

Arcnor commented 6 years ago

Hey, sorry for the delay.

Good that you can finally work on this! I've been more and more busy lately, so I haven't been able to continue on it (also I'm not using it for the time being due to different reasons).

The parsing sounds good, although it's obviously not as good as having it as part of the API. Maybe we should open some ticket on their tracker to see if it can be done, or to ask for the capability to be added?

In any case, parsing by regex should be fine as long as the output you get from that code is what Clang has processed and not directly from the file (which I guess it is), although even in that case I think it's more foolproof to have a real parser (maybe Clang itself after removing the first part of the output so it thinks it's a normal function declaration?). I'm just worried about edge cases and whatnot, like having some sort of default parameter value or similar (I know this is not possible, but just so you get the idea of what I'm worried about).
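
To make the "let Clang think it's a normal function declaration" idea concrete, here is a sketch of the text rewrite only. It turns the typedef into a prototype string that could then be handed back to Clang, for example as an unsaved file. The class name and pattern are made up for this example, and the rewritten prototype still depends on types such as value_t being declared in whatever Clang parses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TypedefToPrototype {
    // Matches: typedef <return type> (*<name>)(<parameters>)
    private static final Pattern TYPEDEF = Pattern.compile(
            "typedef\\s+(.+?)\\s*\\(\\s*\\*\\s*(\\w+)\\s*\\)\\s*\\((.*)\\)\\s*;?\\s*");

    // Rewrites the typedef as an ordinary function declaration that reuses
    // the typedef name as the function name.
    static String toPrototype(String typedef) {
        Matcher m = TYPEDEF.matcher(typedef);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a function pointer typedef: " + typedef);
        }
        return m.group(1) + " " + m.group(2) + "(" + m.group(3) + ");";
    }

    public static void main(String[] args) {
        String decl = "typedef value_t (*parser_custom_t)"
                + "(value_t *lex, void *listener, int str, char *propname)";
        System.out.println(toPrototype(decl));
        // prints "value_t parser_custom_t(value_t *lex, void *listener, int str, char *propname);"
    }
}
```

With this route the regex only has to locate the three parts of the typedef, while Clang does the real parsing of the parameter list.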

I also hope this is the only place where we cannot really use the API, but I can't remember how far my code was from generating usable bindings.

saudet commented 3 years ago

One possibility for the future is to somehow use jextract from Panama for this, but they have no plans to support anything related to C++, and they haven't been able to get any performance gains over JNI, yet, and it's not supported on Android either, so it's not clear how this could be useful for JavaCPP, at this moment. In any case, let's keep this on the radar and see where that goes.

GavinRay97 commented 3 years ago

Just in case it is useful here, some years ago a team at Oracle managed to port the full C++ API of Clang and LLVM to Java. They also wrote a working C++ to Java translator that could handle full programs.

They give a really great LLVM meetup talk about it here:

Clank: Java-port of C/C++ compiler frontend - Vladimir Voskresensky, Oracle & Petr Kudryavtsev, Oracle https://www.youtube.com/watch?v=EpFJlARXO74

From what I understood watching, they tried a bunch of approaches, including converting C++ LLVM bitcode to JVM bytecode, but what ended up working was re-implementing the entirety of the C++ std library, and then writing a C++ -> Java AST tool that preserved semantics.

The RecursiveASTVisitor is here: https://raw.githubusercontent.com/java-port/clank/bcdf3389cd57185995f9ee9c101a4dfd97145442/modules/org.clang.ast/src/org/clang/ast/RecursiveASTVisitor.java

/*template <typename Derived> TEMPLATE*/
//<editor-fold defaultstate="collapsed" desc="clang::RecursiveASTVisitor">
@Converted(kind = Converted.Kind.MANUAL/*a lot of regexp-based replacements, see instruction at the end of file + compilation fixes*/,
 source = "${LLVM_SRC}/llvm/tools/clang/include/clang/AST/RecursiveASTVisitor.h", line = 140,
 FQN="clang::RecursiveASTVisitor", NM="_ZN5clang19RecursiveASTVisitorE",
 cmd="jclank.sh -java-options=${SPUTNIK}/modules/org.clang.ast/llvmToClangType ${LLVM_SRC}/llvm/tools/clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp -nm=_ZN5clang19RecursiveASTVisitorE")
//</editor-fold>
public interface/*class*/ RecursiveASTVisitor</*typename*/ Derived extends RecursiveASTVisitor<?>>  {
/*public:*/
  /// A queue used for performing data recursion over statements.
  /// Parameters involving this type are used to implement data
  /// recursion over Stmts and Exprs within this class, and should
  /// typically not be explicitly specified by derived classes.
  /// The bool bit indicates whether the statement has been traversed or not.
  /*typedef SmallVectorImpl<llvm::PointerIntPair<Stmt *, 1, bool> > DataRecursionQueue*/
//  public final class DataRecursionQueue extends SmallVectorImpl<PointerBoolPair<Stmt /*P*/ > >{ };

  /// \brief Return a reference to the derived class.
  //<editor-fold defaultstate="collapsed" desc="clang::RecursiveASTVisitor::getDerived">
  @Converted(kind = Converted.Kind.AUTO,
   source = "${LLVM_SRC}/llvm/tools/clang/include/clang/AST/RecursiveASTVisitor.h", line = 151,
   FQN="clang::RecursiveASTVisitor::getDerived", NM="_ZN5clang19RecursiveASTVisitor10getDerivedEv",
   cmd="jclank.sh -java-options=${SPUTNIK}/modules/org.clang.ast/llvmToClangType ${LLVM_SRC}/llvm/tools/clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp -nm=_ZN5clang19RecursiveASTVisitor10getDerivedEv")
  //</editor-fold>
  public default/*interface*/ Derived /*&*/ getDerived() {
    return Native.$star(((/*static_cast*/Derived /*P*/ )(this)));
  }

This is all still functional. I emailed the project lead for this when it was in development at Oracle, and this is what he said:

Hello Gavin,
Yeah it was very cool project.

Before leaving Oracle our team was able to open source it, but not the JConverter itself, so there were no any work after 2017.

just open clank.suite and it should be buildable without any extra settings.
The mentioned env variables:
LLVM_SRC=/path/to/llvm/repo/with/clang3.9
and SPUTNIK=/path/to/repo/with/clank

Were used by one nice plugin which allowed to have navigation between converted sources and original one. As well as reconvert classes/methods from context menu in editor.

Best regards,
Vladimir.

It's a daunting task no doubt, but the fact that it exists is inspiration and proof that it can work 😃

HGuillemet commented 2 years ago

We will have a bootstrapping problem if we use a JavaCPP preset in the Parser used to build presets, won't we?

I have started to play with the C API of Clang bound by Panama with jextract and it seems to do the job. Preprocessor directives and comments are available. It even parses Doxygen-like syntaxes.

I suggest rewriting the parser using this API, first to reproduce the current behavior of the parser, as a preliminary step to issue #402. Then we could try to change the parser and generator so that C++ classes are mapped to Java classes that use FMA instead of Pointer.

What do you think of this plan?

saudet commented 2 years ago

> We will have a bootstrapping problem if we use a JavaCPP preset in the Parser used to build presets, won't we?

Not really, the JavaCPP Presets for LLVM also essentially map the C API only. That's not the problem, the problem is that jextract was designed to work only with C, not C++. It fails miserably at anything that even remotely looks like C++. I think that would be the first thing to "fix" before going forward with that idea.

> What do you think of this plan?

@mcimadamore @sundararajana might have some more recent insights into what they looked at, why it doesn't work, etc.

HGuillemet commented 2 years ago

I meant to use jextract to bind the C API of Clang only. Then Clang can be used to parse C++. Where is the limitation due to jextract?

saudet commented 2 years ago

jextract also already maps the C API of Clang: https://github.com/openjdk/panama-foreign/tree/foreign-jextract/src/jdk.incubator.jextract/share/classes/jdk/internal/clang

jextract doesn't support C++, period. It never has and probably never will.

HGuillemet commented 2 years ago

Sure, but the C-API of Clang can parse C++.

saudet commented 2 years ago

Yeah, but it's not going to be any better than the JavaCPP Presets for LLVM. You'll get the exact same thing. The only reason you may want to use jextract is to potentially get support from Oracle...

HGuillemet commented 2 years ago

And the bootstrapping? Once we have switched to the new parser, how would you build the LLVM preset on a new platform?