fungos / cr

cr.h: A Simple C Hot Reload Header-only Library
https://fungos.github.io/cr-simple-c-hot-reload/
MIT License
1.54k stars 101 forks source link

RFC: Re-loadable plugin callbacks and nested plugins. #39

Open Neopallium opened 5 years ago

Neopallium commented 5 years ago

I have started a branch symbols in my repos: https://github.com/Neopallium/cr/tree/symbols

It has added support for:

  1. Re-loadable custom plugin symbols cr_symbol. It is an opaque structure that holds a pointer to a symbol from the plugin. When the plugin reloads, all re-loadable symbols are updated.
  2. Support for safely calling plugin callbacks. cr_closure
  3. Nested setjmp protected calls (Allows nesting of plugins). This is only needed on linux/macOS, windows uses __try/__except which doesn't use a global and "should" be nesting safe?

Limits:

  1. Don't call into the same plugin more then once in a call chain (i.e. Plugin A -> Plugin B -> Plugin A "crash", would cause Plugin A to possible reload while it is still on the call stack.)

TODOS:

New public API:

struct cr_symbol *cr_plugin_get_symbol(struct cr_plugin *ctx, const char *name)

Returns a reference to a re-loadable plugin symbol.

struct cr_closure *cr_plugin_new_closure(struct cr_symbol *symbol, void *userdata)

Used to wrap the plugin's callback and userdata, so it can be safely unwrapped and called from host-side "trampoline" function.

Example (See working example in repos: samples/fake_gui.*, samples/cb_host.*, samples/cb_guest.c):

///// Normal host-side code follows.....

- Guest plugin callback
```c
static struct cr_symbol * CR_STATE g_plugin_mouse_cb = NULL;
static void register_mouse_events(struct cr_plugin *ctx, void *userdata) {
  if (!g_plugin_mouse_cb) {
    // Only need to initialize `g_plugin_mouse_cb` on the first `CR_LOAD`
    g_plugin_mouse_cb = cr_plugin_get_symbol(ctx, "plugin_mouse_events");

    // We can create any number of `cr_closure` as needed and re-use `cr_symbol`
    struct cr_closure *closure = cr_symbol_new_closure(g_plugin_mouse_cb, userdata);
    fake_register_mouse_events(host_trampoline_mouse_events, closure);
  }
}

// Plugin callback for mouse events.
CR_EXPORT int plugin_mouse_events(void *userdata, int x, int y, int buttons) {
  struct cr_plugin *ctx = (struct cr_plugin *)userdata;
  fprintf(stdout, "Guest: mouse event: (%d, %d, 0x%x): ver=%d\n", x, y, buttons);
  return 0;
}

CR_EXPORT int cr_main(struct cr_plugin *ctx, enum cr_op operation) {
  if (operation == CR_LOAD) {
    // Reload/rollback unsafe register of mouse events callback
    //fake_register_mouse_events(plugin_mouse_events, ctx);
    // Reload/rollback safe.
    register_mouse_events(ctx, ctx);  // We are just using `ctx` as our plugin userdata.
  }
  return 0;
}
Neopallium commented 5 years ago

@fungos I have opened this issue to get feedback on the API. Once the TODOs are done and more examples/testing have been done. I will send a PR.

fungos commented 5 years ago

It looks pretty useful and as it does not break backwards compatibility we can consider it, but I would also like the input of other users before taking any commitment to this idea.

A first look at your implementation, it seems pretty reasonable. So, after closing your TODOS and cleaning it up, I would really like to see some imgui sample using the full potential of this feature. I would ask you to implement a sample imgui host with two guests were each guest deals with a imgui window rendering, input, states via closures.

Also, about limits I think this can be checked with safety guards in the ctx in someway.

rokups commented 5 years ago

This is a terrific work!

Call nesting is not exactly optional though. Event systems do cause rather unpredictable callstacks, especially in complex situations.

Did you explore implementing this without adding new APIs? It also introduces need for static storage that probably should be avoided if possible. Or maybe it all could be wrapped under single new api like cr_plugin_make_reloadable_closure() (name sucks, i know) which returns a callback that user can then safely invoke and expect reloading to work? Not really sure, just throwing ideas around.

Writing trampoline functions would be tedious. But that part probably can be automated with some c++ template magic. Or maybe universal trampoline could be written by using va_list.

Neopallium commented 5 years ago

@rokups I have an idea for allowing nesting. Each protected call into the plugin would increment a counter. If counter > 0 skip rollback/reload check. The plugin callback that crashed could be marked as bad until the plugin is rolled back, if it is called again the trampoline would just return.

Static storage is not required, it was just the easiest place to hold the cr_symbol and cr_closure values in the example. Normally I would store those values with the plugin state.

Implementing cr_plugin_make_reloadable_closure() would be tricky. Might be possible, just not easy.

  1. Different callbacks have different function signatures. Even trying to make a universal trampoline would be difficult, because the userdata parameter can be at different places (even stored in a struct).
  2. We can't just load the latest symbol from the plugin and jump to it. Need to setup the crash handling (setjmp/__try) before calling the plugin's callback. Also check for rollback/reload.

It might be possible to dynamically generate a trampoline function for each plugin callback. Either using a simple JIT lib or doing our own code. Using a JIT lib would make it easier to support different arch/ABI.

A generated trampoline would need to do the following:

  1. Save parameters.
  2. Load cr_symbol <--- This value changes for each plugin/callback
  3. Check for rollback/reload -- Just call a helper function for this.
  4. Setup crash handler setjmp/__try
  5. Restore parameters.
  6. Call plugin callback.
  7. Catch logic for crashes.

I will think about it some. But I don't want to make cr to complex.

Neopallium commented 5 years ago

If you didn't want crash handling/reload/rollback for plugin callbacks, then the trampoline would be very simple to generate:

  1. Load cr_symbol <--- embed address in assembly of generated trampoline.
  2. Load address of plugin callback from cr_symbol.
  3. Jump to callback.

A template could be made for each cpu arch/ABI. Just change the cr_symbol and write to a block of executable memory.

Neopallium commented 5 years ago

Some useful links about generating closures:

  1. C closure as a library
  2. See section 'Moving to thunks'
Neopallium commented 5 years ago

Possible solutions to generate safe wrappers for plugin callbacks:

  1. Use an external JIT library: libjit, nanojit, asmjit, xbyak - header only
  2. Use assembly templates for generating trampoline function to wrap each cr_symbol and save parameters. Then trampoline would then call/jump to a C function that would handle crash/rollback/reload, it would then call an assembly function to restore parameters & jump to the plugin callback.

Using a JIT would add an external dependency. I think solution 2 would require the least amount of code. I will see if I can do it with a minimal amount of assembly code.

fungos commented 5 years ago

The closures may be possible with some macro magic, otherwise, its out of the scope of cr making that any easier.

Neopallium commented 5 years ago

@fungos Yes. I found a way to do it without JIT/assembly. I created n_closure to test the idea. Right now it only wraps a function to test if it worked. I have tested on 32/64 linux with gcc, it should work fine on Windows and MacOS. n_closure has a lot of tests for different number of parameter (2, 4 and 6) and different parameter types mixed (char, short, int, long, void *, float, double). n_closure.h is only 79 lines + 100 lines for the slot functions. I can get the line count down for the slot functions.

Limits:

  1. Compile time limit on the number of closures (X trampolines are generated at compile time). cr only needs one closure per plugin symbol to wrap the cr_symbol value.
  2. Will not work for ABI that pass parameters on the stack left-to-right (Pascal calling convention). All C calling conventions that I have found pass stack parameters right-to-left.
  3. Right now it handles up to 6 int/pointer + 6 double parameters (for register passing ABI 64bit) or 12 parameters for stack passing ABI (32bit x86).

Macros are used to compile-time generate slot functions (right now I am generating 100 slots). We need void * and double parameters for register passing ABI like x86_64 to handle double parameters. The order and number of parameters doesn't matter.

typedef void *(*slot_func)(void *p1, void *p2, void *p3, void *p4, void *p5, void *p6, double d1, double d2, double d3, double d4, double d5, double d6);

/* macros used to generated 100 of these functions. */
static void *slot_template_XX(void *p1, void *p2, void *p3, void *p4, void *p5, void *p6, double d1, double d2, double d3, double d4, double d5, double d6) {
  int slot = XX; // <-- Each slot function is given a unique number here.
  return slot_handler(slot, p1,p2,p3,p4,p5,p6,d1,d2,d3,d4,d5,d6);
}

static slot_func g_slot_funcs[MAX_SLOTS] = {
  slot_template_0,
  ....
  slot_template_99,
};
n_closure *g_closure_slots[MAX_SLOTS];

static void *slot_handler(int slot, void *p1, void *p2, void *p3, void *p4, void *p5, void *p6, double d1, double d2, double d3, double d4, double d5, double d6) {
  n_closure *closure = g_closure_slots[slot];
  if (closure && closure->func) {
    return closure->func(p1,p2,p3,p4,p5,p6,d1,d2,d3,d4,d5,d6);
  }
}
Neopallium commented 5 years ago

The n_closure idea works on Linux/OSX/Windows, 32bit and 64bit. Win64 ABI is a pain, while it is working right now there is a restriction on the trampoline function. For Win64 ABI the trampoline function can not do any floating-point ops if a float/double is being passed to the callback. Some assembly can be used to save/restore the XMM0-3 registers, to make it safer.

I have created a new branch closures to test out this idea with cr https://github.com/Neopallium/cr/tree/closures

Right now it doesn't compile on Windows, because the plugins are trying to access exported symbols from the host (which requires special handling on windows). I should be able to change the design so plugins do not need directly link to any symbol in the host.

It should be possible to use macros to mark/wrap plugin functions as reloadable callbacks. Then registering the callback with a GUI library would be just as easy as non-reloadable callbacks.

fungos commented 5 years ago

Awesome work @Neopallium. Your closure lib is something very valuable in itself, very nice to have it standalone.

I'm worried about how complex this is and issues like the one on Win64, this can really create a lot of trouble and support demand.

I'm really fine with simpler and less flexible macros wrapping functions and generating closures without all the work involved in dealing with ABI. Wouldn't something simple like #define CR_CLOSURE(func, args....) be possible, generating the code as in your host_trampoline_mouse_events ?

Neopallium commented 5 years ago

One problem with generating host_trampoline_mouse_events is that it has to be in the host and the plugin will need to get the address when registering callbacks. On linux/mac that isn't a problem, but on Windows the plugin will need to be linked to the host (dll linked to exe).

I am thinking about keeping the "protected closure" logic separate from cr. Only add the basic support needed for re-loadable symbols cr_symbol to cr. Then anyone that needs protected plugin callbacks can include an extension header cr-nclosure.h which can be in a different project.

Neopallium commented 5 years ago

@fungos I am not sure the best way to deal with the DLL linking issues (plugins calling host functions). Since I want to let plugins to get a re-loadable cr_symbol, they need to be able to call at least one host function cr_plugin_get_symbol(). One way to do it would be to store a set of function pointers cr_host_api in the cr_plugin structure, but this is only needed on Windows the other platforms don't need this hack.

fungos commented 5 years ago

Even if that works on other platforms, it does not feel right to do it. The way to do it would be the host to call a register function passing all symbols data to the plugin on the first moment just after reloading.

pixelherodev commented 5 years ago

Ooh! I can actually help with this one! :P

If you use DLLEXPORT on functions in the host, and DLLIMPORT them in the plugin, it should work. I have a separate plugin system that uses this, but I'm rebuilding it over CR.

On that note, if there's absolutely anything I can do to help with this just let me know: it is very useful for me (I'm currently using a slightly patched version of upstream which essentially just adds a get_symbol function), so anything I can do to help with this is more than worth the effort for me.

pixelherodev commented 5 years ago

On Linux, nothing is needed, and on OS X, the -undefined dynamic_lookup flag is required IIRC.

Neopallium commented 5 years ago

@pixelherodev Don't you still need to link the plugins to the host?

pixelherodev commented 5 years ago

I don't think so, but I can double check; I'm working on a project utilizing this right now (plugin calling host functions)

pixelherodev commented 5 years ago

Nope, at least on Linux you don't need to link to the host.

Edit: same goes for Emscripten.

99% sure it's true for Windows too, but completely unsure about OS X (never actually had a chance to test it as I don't have a mac but I was able to build it using Travis without issues).

pixelherodev commented 5 years ago

Suggestion: since any symbol can now be called, maybe add an option to disable automatically calling cr_main (or a renamed version using the #define) in cr_update? That way, ... oh wait, you added cr_check. Never mind, nice work!

Using your branch, and so far so good!

pixelherodev commented 5 years ago

For consistency, shouldn't CR_CLOSURE_CALL_END be defined as CR_CLOSURE_CALL_END()? That way, it's used as a function, which makes it look a bit cleaner, as it's invoked via CR_CLOSURE_CALL_END(); instead of CR_CLOSURE_CALL_END.

pixelherodev commented 5 years ago

Another issue with this is that closures have to be the only thing inside of a function and have to return some sort of integer.

You can't, with this, do

if (...) {
   CLOSURE...
}
else {
   ...
}

... do more using info retrieving through whatever method

because if the closure fails there's no way to recover and it instantly returns -1.

Maybe instead of returning -1, define a value (bool cb_failed) that's set to true so that error handling is easier and closures become a bit more capable?

pixelherodev commented 5 years ago

Regarding the A calls B calls A crashes problem, how about detecting attempts to call a plugin that's already on the stack and refusing unless explicitly enabled? That might be encroaching on "too much overhead for little gain" territory though.