compiler-research / CppInterOp

A Clang-based C++ Interoperability Library
Other
38 stars 20 forks source link

Support multiple interpreter instance? #302

Open Gnimuc opened 4 weeks ago

Gnimuc commented 4 weeks ago

https://github.com/compiler-research/CppInterOp/blob/ecbffafe0016022e40b7dea6e7197ee062dde38f/lib/Interpreter/CppInterOp.cpp#L58-L73

The interpreter is global and unique at the moment. However, clients do not always agree on the same compiler configuration. I think we need to add more APIs that take compat::Interpreter as argument. For example,

const char* GetResourceDirFromInterpreter(const compat::Interpreter* interp) {
    return interp->getCI()->getHeaderSearchOpts().ResourceDir.c_str();
}
vgvassilev commented 4 weeks ago

We have thought about that and initially this was part of the design. However, this makes the interfaces quite ugly and we moved away from that approach.

Perhaps, we could have some list of interpreters and we could change the "active" interpreter. Would that work for your use case?

Gnimuc commented 4 weeks ago

That could be a solution as well. At least, we need a setInterp() API to allow clients to switch the current interpreter. But it's tricky to do it right in a multi-threading context...

vgvassilev commented 4 weeks ago

I agree... Do you have an implementation in mind?

vgvassilev commented 4 weeks ago

I suspect we want to replace the static interpreter instance with an user-provided callback or something like that where the user is responsible for maintaining the top of the interpreter stack?

Gnimuc commented 3 weeks ago

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

https://github.com/compiler-research/CppInterOp/blob/6a5f5649d2549337456391f672749213118a5b50/lib/Interpreter/CppInterOp.cpp#L2353

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

vgvassilev commented 3 weeks ago

I agree... Do you have an implementation in mind?

No. I'm still experimenting with the library. I'm wondering how cppyy solves this problem.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure. It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

Another way is to expose a C wrapper over the internal CppInterOp Interpreter, so users can manage those instances on their own. We only need to maintain a stable C API. The CppInterOpInterpreter.h can still be internal.

Yes, right now CppInterOp is not in a good place in that respect. The intent has been that these interfaces to be C compatible and to not change a lot. However, it is really challenging to provide C-compatible interfaces when we deal with collections (std::vectors) or provide storage.

The following C++ APIs are just helper functions or wrappers over CppInterOpInterpreter.h's public methods. They are simple and short, so the C API version(with the ugly interpreter argument) can be easy to reimplement.

void AddSearchPath(const char *dir, bool isUser, bool prepend);
const char* GetResourceDir();
void AddIncludePath(const char *dir);
int Declare(const char* code, bool silent);
int Process(const char *code);
intptr_t Evaluate(const char *code, bool *HadError/*=nullptr*/);
std::string LookupLibrary(const char* lib_name);
bool LoadLibrary(const char* lib_stem, bool lookup);
void UnloadLibrary(const char* lib_stem);
std::string SearchLibrariesForSymbol(const char* mangled_name, bool search_system /*true*/);
std::string ObjToString(const char *type, void *obj);
void Destruct(TCppObject_t This, TCppScope_t scope, bool withFree /*=true*/);
TCppFuncAddr_t GetFunctionAddress(const char* mangled_name);
CPPINTEROP_API JitCall MakeFunctionCallable(TCppConstFunction_t func);

I think it is really better to provide a C interface. That's why we will be safe.

However, these two APIs are not trivial to reimplement.

intptr_t GetVariableOffset(TCppScope_t var); 
bool InsertOrReplaceJitSymbol(const char* linker_mangled_name, uint64_t address);

I think we could do what MakeFunctionCallable did here:

https://github.com/compiler-research/CppInterOp/blob/6a5f5649d2549337456391f672749213118a5b50/lib/Interpreter/CppInterOp.cpp#L2353

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I agree. Once you are beyond the experimental part of this we can discuss about what are the needs. One thing I'd like to point out is that in an ideal world a lot of this code will land in llvm-project/clang/lib/Interpreter and perhaps there is where we could use C style interfaces, too...

wlav commented 3 weeks ago

Preface with a note that the multi-interpreter case is a common one, especially if they could be spun up quickly, or "forked" from the state of a parent interpreter. The latter would be awesome for writing C++ unit tests.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure.

Well, it worked for 3 applications so far. :) CPyCppyy, PyPy/_cppyy, and the D language bindings. There have been a few others that I'm aware of, e.g. an NLP application, but AFAIK, these were one-offs grad student projects.

Just that I have misgivings about it b/c it's been grown organically and even as that hasn't limited functionality, there are several inefficiencies that I don't think are necessary.

It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

No, CPyCppyy isn't part of it. The layerings are clingwrapper and ROOT/meta. For the former, the above message applies, the latter, well ... :}

We only need to maintain a stable C API. [..] it is really challenging to provide C-compatible interfaces when we deal with collections (std::vectors) or provide storage.

Right, that's how clingwrapper started as well: to bootstrap C++ bindings into PyPy, we had to start with C bindings and that's where several of the inefficiencies stem from. Eg., as you say, it's hard to share a container with C (esp. since the C binding in PyPy is ABI-based), so that's in the current implementation, there are several places where it loops with an external index over an internal data structure.

In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I did the same later: re-implemented the C interfaces on top of the C++ ones. However, it's still mostly 1-to-1, the C wrappers taking care of some amount of memory handling.

wlav commented 3 weeks ago

Preface with a note that the multi-interpreter case is a common one, especially if they could be spun up quickly, or "forked" from the state of a parent interpreter. The latter would be awesome for writing C++ unit tests.

Cppyy is not know to have good interfaces to communicate with the underlying infrastructure.

Well, it worked for 3 applications so far. :) CPyCppyy, PyPy/_cppyy, and the D language bindings. There have been a few others that I'm aware of, e.g. an NLP application, but AFAIK, these were one-offs grad student projects.

Just that I have misgivings about it b/c it's been grown organically and even as that hasn't limited functionality, there are several inefficiencies that I don't think are necessary.

It wraps them into multiple layers via the CPyCppyy and the cling-backend. Let's ping @wlav for a better take on this.

No, CPyCppyy isn't part of it. The layerings are clingwrapper and ROOT/meta. For the former, the above message applies, the latter, well ... :}

    We only need to maintain a stable C API.
    [..]
    it is really challenging to provide C-compatible interfaces when we deal
    with collections (std::vectors) or provide storage.

Right, that's how clingwrapper started as well: to bootstrap C++ bindings into PyPy, we had to start with C bindings and that's where several of the inefficiencies stem from. Eg., as you say, it's hard to share a container with C (esp. since the C binding in PyPy is ABI-based), so that's in the current implementation, there are several places where it loops with an external index over an internal data structure.

    In this way, the C++ interfaces can be kept concise and clean. The new C interfaces are born to be ugly. :)

I did the same later: re-implemented the C interfaces on top of the C++ ones. However, it's still mostly 1-to-1, the C wrappers taking care of some amount of memory handling.