graalvm / sulong

Obsolete repository. Moved to oracle/graal.

longjmp/setjmp Not Supported #51

Closed mrigger closed 5 years ago

mrigger commented 8 years ago

We cannot directly call the native implementations of longjmp/setjmp since they implement non-local jumps which manipulate the stack. We will have to implement them through Java intrinsifications.

I refrained from implementing them since they would probably make the interpreter more complicated. However, since longjmp/setjmp can be used to implement exceptions, we still have to support them in the long term.
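
To illustrate the "exceptions" use case mentioned above, here is a minimal sketch of the pattern in plain C. All names (`handler`, `fail`, `parse_digit`, `sum_digits`, `try_sum_digits`) are hypothetical and chosen for illustration only; this is not Sulong code and says nothing about how Sulong would intrinsify the calls.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical example: one global "catch" site saved by setjmp. */
static jmp_buf handler;

/* "throw": unwind to the saved handler with a nonzero error code. */
static void fail(int code) {
    longjmp(handler, code);
}

/* A callee below the handler that may "throw". */
static int parse_digit(char c) {
    if (c < '0' || c > '9') {
        fail(1); /* does not return */
    }
    return c - '0';
}

static int sum_digits(const char *s) {
    int sum = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        sum += parse_digit(s[i]);
    }
    return sum;
}

/* Returns the digit sum, or -1 if a non-digit caused a "throw". */
int try_sum_digits(const char *s) {
    if (setjmp(handler) == 0) {
        return sum_digits(s); /* "try" */
    }
    return -1; /* "catch": reached via longjmp */
}
```

Under these assumptions, `try_sum_digits("123")` returns 6 and `try_sum_digits("12x")` returns -1. The `longjmp` skips the frames of `parse_digit` and `sum_digits`, which is the non-local jump that a native call cannot perform on the interpreter's stack.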

mulle-nat commented 6 years ago

Just out of interest: can this be done with Java exceptions?

rschatz commented 6 years ago

There are two separate issues with longjmp/setjmp. One part is unwinding the stack. That can easily be implemented with Java exceptions. But in addition we need to make sure the function that called setjmp is in a consistent state, i.e. all locals have the expected values. This is hard to do in a fully general way.

Newer versions of LLVM have special bitcode instructions for C++ exceptions, and we already support these (using Java exceptions). So while supporting longjmp/setjmp would definitely be nice, this issue currently has lower priority, unless we find another important use-case.

mulle-nat commented 6 years ago

I thought C doesn't really guarantee that the locals are intact after a longjmp, and that in general one uses volatile to access locals after a longjmp. But then I read the setjmp wiki and read that: "Similarly, C99 does not require that longjmp preserve the current stack frame." which to me seems even more lenient: Locals can be corrupted. Or?

pekd commented 6 years ago

"Similarly, C99 does not require that longjmp preserve the current stack frame." which to me seems even more lenient: Locals can be corrupted.

No, that's a different thing. You can save an environment with setjmp, and later call longjmp to return to that point, but that's only valid if longjmp is called from within the same function/a child function. The Wiki text refers to the case where setjmp establishes an environment in a function, and longjmp is performed from a parent function. This is not required to preserve the stack frame → it may not work, because the stack frame of the saved function is already destroyed. In the real world, setjmp saves the program counter, stack pointer and general purpose registers and longjmp just restores them.

volatile has to do with optimizations: if there is a volatile, the memory read/write access cannot be optimized away. Without volatile, the optimizer might decide that some variables are dead and reuse these slots for something different. The optimizer might also decide that it's better to store some locals in registers. After longjmp, these locals would be corrupt or the wrong values, if they were changed in the meantime.
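
A small sketch of the volatile point, with hypothetical names (`env`, `jump_back`, `attempts_until_done`). The local is modified between the `setjmp` and the `longjmp`, so it is declared `volatile`; per the C standard, a non-volatile automatic variable changed in that window would have an indeterminate value after the jump.

```c
#include <setjmp.h>

/* Hypothetical example, not taken from Sulong. */
static jmp_buf env;

static void jump_back(void) {
    longjmp(env, 1); /* return to the setjmp below */
}

int attempts_until_done(void) {
    /* volatile forces every access to go through memory, so the value
       written before longjmp is the value read after it. */
    volatile int attempts = 0;

    setjmp(env); /* execution resumes here after each longjmp */

    attempts++;
    if (attempts < 3) {
        jump_back();
    }
    return attempts;
}
```

With `volatile`, `attempts_until_done()` returns 3. Without it, an optimizer is free to keep `attempts` in a register that `longjmp` restores to its value at `setjmp` time, which is the corruption described above.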

[…] this issue currently has lower priority, unless we find another important use-case.

The problem is that many interesting and slightly bigger real world C programs use setjmp/longjmp somewhere. Examples include many interactive programs (e.g. ed, ex/vi, busybox ash, bash, xterm, …), language interpreters (e.g. perl), compilers (e.g. GNU as), coreutils (e.g. ls, test), man-db, wget and many more.

An experimental implementation for setjmp/longjmp in Sulong already exists.

mulle-nat commented 6 years ago

I don't read it that way. The Wiki text (and my OS X manpage, for what it is worth) specifically prohibits calling longjmp from a parent function, so it can't refer to that.

WIKI:

If the function in which setjmp was called returns, it is no longer possible to safely use longjmp with the corresponding jmp_buf object.

rschatz commented 6 years ago

Similarly, C99 does not require that longjmp preserve the current stack frame. This means that jumping into a function which was exited via a call to longjmp is undefined. However, most implementations of longjmp leave the stack frame intact, allowing setjmp and longjmp to be used to jump back-and-forth between two or more functions—a feature exploited for multitasking.

I think this doesn't refer to the stack frame of the setjmp, but to the current stack frame at the time of the longjmp, i.e., the stack before the jump. The "however" seems to say that on some implementations, you can jump out of a function, and then later jump in again, because the unwound stack is still there.

mulle-nat commented 5 years ago

Does closed mean it did/will happen or it won't happen? I am curious.

mrigger commented 5 years ago

Sorry for not commenting on closing the issue. We are migrating open issues to Sulong's new repo at https://github.com/oracle/graal/. I've opened an issue for setjmp/longjmp at https://github.com/oracle/graal/issues/776.