fungos / cr

cr.h: A Simple C Hot Reload Header-only Library
https://fungos.github.io/cr-simple-c-hot-reload/
MIT License
1.54k stars 101 forks source link

Rollback is unsafe in complex applications #27

Closed rokups closed 4 years ago

rokups commented 5 years ago

cr is built around a notion that plugin may crash only when executing cr_main(). Thing is any complex system with plugin does not work that way.

For example imagine we have a GUI application where plugin provides some GUI controls. When plugin is loaded it would somehow register itself with the system in place, then system runs it's main loop and calls into plugin code all without need of cr_main(CR_STEP) step. This works good so far.

Problem arises when plugin code crashes. Crash handler will try to jump to context saved by last plugin's cr_main(CR_STEP) invocation. This is not correct and environment may be no longer valid (ctx may be gone for example).

This somehow needs to be addressed. In my own fork i added CR_ROLLBACK macro which set to 0 simply disables rollback mechanism and allows application to crash. This is quick and dirty solution. I am interested if this can be solved proper. Any ideas?

Neopallium commented 5 years ago

@rokups Do you mean that the plugin registers a callback (event handler) with the GUI system and the crash happens in side the plugin's callback?

For rollback to work the GUI event loop/polling would need to happen inside the cr_main(CR_STEP) call. You could pass a HostAPI to the plugin and have the plugin "step" the GUI event loop. This will work if the app only has one plugin with callbacks.

rokups commented 5 years ago

This will work if the app only has one plugin with callbacks.

This is exactly the problem. Even more complex scenarios may arise. For example what if plugin A implements ControlA, plugin B makes use of ControlA and triggers faulty event. Event if it were somehow possible to implement this the right way - CR_STEP of plugin B would cause crash in plugin A. Wrong plugin would be rolled back.

Neopallium commented 5 years ago

I think it would be difficult to handle crash recovery/rollback with multiple plugins. The simple solution is to disable crash recovery & rollback like you did. Just let the whole application crash.

cr can still be used for plugin loading and live reloading. Just not safe for live code editing of multiple plugins.

One way to do crash recovery from plugin callbacks is to catch the crash inside the callback.

/* Each plugin would need it's own static ctx variable. */
static cr_plugin * plugin_ctx = NULL;

int plugin_x_gui_callback(gui_event *evt, void *userdata) {
  cr_plugin *ctx = plugin_ctx; /* Or get the plugin ctx from the callback's `userdata` */
  CR_PLUGIN_TRY(ctx);
  /* do callback work. */
  CR_PLUGIN_CATCH(ctx);
  return 0;
}

CR_EXPORT int cr_main(struct cr_plugin *ctx, enum cr_op operation) {
  switch(operation) {
  case CR_LOAD:
    plugin_ctx = ctx;
    break;
  case CR_UNLOAD:
  case CR_CLOSE:
    plugin_ctx = NULL;
    break;
  }
}

CR_PLUGIN_TRY(ctx) and CR_PLUGIN_CATCH(ctx) would use either setjmp/longjmp (unix) or try/catch (windows). The cr_plugin struct is only allocated once when the plugin is first loaded, so the pointer doesn't change. You could also use CR_STATE on the static variable and let cr save/restore it.

fungos commented 5 years ago

Yes, trying to automatically handling crashes in complex environments like this is complex and maybe not really practical inside the plugins. It is way simpler to do this in the application level in a centralized way. @Neopallium suggestion looks like a valid alternative.

From cr perspective, I think we need to keep things really simple. The multiple plugin support was a bit out of scope, but I conceded it is really useful to have overall. With that in mind, I think the best here is really to have an option to disable the handling like you did, @rokups.

If you're willing to do a simple PR with this, it would be awesome.

Neopallium commented 5 years ago

@rokups I have working code for safely handling callbacks from multiple plugins. See issue #39 and the code at https://github.com/Neopallium/cr/tree/symbols

It might do what you need.

The only limit is that a crashing/reloading plugin can only be in the call stack once. It is not possible to handle a crash if the call stack is like Plugin A -> Plugin B -> Plugin A (crash/reload here). Since Plugin A is on the call stack twice. The only way I see to handle this would be to keep a counter in cr_plugin for how many protected plugin calls are currently open. When the plugin crashes, then reload/rollback would be blocked until the counter is zero. Also new calls into the crashed plugin could be blocked.

fungos commented 4 years ago

Closing due inactivity, thank you.