e4c does not support OpenMP

GoogleCodeExporter commented 9 years ago

Simple OpenMP examples will crash:

e4c_using_context(E4C_FALSE) {
  #pragma omp parallel for
  for (int i = 0; i < 10; i++) {
    try {} finally {};
  }
}

The problem is likely that e4c relies on the pthreads API to manage 
environments per thread, but this API does not seem to be supported by OpenMP. 
This is too bad, because OpenMP is a very popular API!

Note that e4c_reusing_context won't work in OpenMP, either.

Original issue reported on code.google.com by tal.liron on 31 May 2014 at 2:34

GoogleCodeExporter commented 9 years ago

Actually, there is a workaround of sorts... you need to disable thread-safe 
support in e4c. Unfortunately, this is not so easy if pthreads is enabled at 
the compiler level, because e4c.h will throw an error, demanding that you 
enable E4C_THREADSAFE. Hacking e4c.h to remove this error message allows me to 
use OpenMP to an extent. But intrinsic support in e4c would of course be much 
better.

Original comment by tal.liron on 31 May 2014 at 2:48

GoogleCodeExporter commented 9 years ago

Yep, that's right; currently exceptions4c supports pthreads only. However, it 
shouldn't be difficult to add OpenMP support (no need to disable thread-safe 
support).

When E4C_THREADSAFE is defined, the library uses these macros:

Thread-related macros:

 * THREAD_TYPE
 * THREAD_CURRENT
 * THREAD_SAME
 * THREAD_CANCEL_CURRENT
 * THREAD_EXIT

Mutex-related macros:

 * MUTEX_DEFINE
 * MUTEX_LOCK
 * MUTEX_UNLOCK

At this time, they are defined based on pthreads; we can redefine them based on 
OpenMP (provided that OpenMP has similar functionality).

I will do some research and try to implement this enhancement :)

Original comment by guillermocalvo on 31 May 2014 at 3:03

Added labels: Type-Enhancement
Removed labels: Type-Defect

GoogleCodeExporter commented 9 years ago

OpenMP has some of this functionality, but not all of it:

You can find the current thread ID using omp_get_thread_num(). But you can't 
otherwise explicitly manipulate threads via API calls.

For mutexes, see omp_init_lock() and omp_init_nest_lock().

The API is really very bare! It's designed specifically to be very high level, 
allowing compilers freedom to do their own implementations.
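
To make the comparison with the e4c macros concrete, here is a minimal sketch of the OpenMP calls mentioned above. The helper names `current_thread_id` and `locked_sum` are made up for illustration, and the serial fallbacks are an assumption so that the file also builds without `-fopenmp`; none of this is e4c code.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp. */
typedef int omp_lock_t;
static int  omp_get_thread_num(void)        { return 0; }
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Rough counterpart of THREAD_CURRENT: a small integer, not a thread handle. */
int current_thread_id(void) {
    return omp_get_thread_num();
}

/* Sums 0..n-1 in parallel, guarding the shared total with an OpenMP lock. */
int locked_sum(int n) {
    omp_lock_t lock;
    int total = 0;

    omp_init_lock(&lock);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        omp_set_lock(&lock);     /* plays the role of MUTEX_LOCK   */
        total += i;
        omp_unset_lock(&lock);   /* plays the role of MUTEX_UNLOCK */
    }
    omp_destroy_lock(&lock);
    return total;
}
```

Note that nothing here corresponds to THREAD_CANCEL_CURRENT or THREAD_EXIT.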

Original comment by tal.liron on 31 May 2014 at 3:11

GoogleCodeExporter commented 9 years ago

I see... so how is the program supposed to cancel/exit the current thread? Simply 
by calling exit()?

Original comment by guillermocalvo on 31 May 2014 at 3:23

GoogleCodeExporter commented 9 years ago

Simply by reaching the end of the code. :) A "goto" would do the trick.

I suggest you take a look at what OpenMP does; you'll realize the concept of 
execution threads is a bit different from what you expect:

http://bisqwit.iki.fi/story/howto/openmp/

Original comment by tal.liron on 31 May 2014 at 3:30

GoogleCodeExporter commented 9 years ago

Oh, then we must find a way to mix OpenMP/try-catch semantics...

I mean, when you throw an exception, you expect the program/thread to stop if 
the exception is not caught, don't you?

How could we do that if the only way to stop the normal execution control flow 
is by reaching the end of the code? Kind of hard, isn't it?

Original comment by guillermocalvo on 31 May 2014 at 3:40

GoogleCodeExporter commented 9 years ago

I agree. There might be a need for another macro at the end of the code 
segment, to allow for a "goto":

e4c_using_context(E4C_FALSE) {
  #pragma omp parallel for
  for (int i = 0; i < 10; i++) {
    try { throw(RuntimeException, "Oops"); } finally {};
    e4c_omp_end();
  }
}

Original comment by tal.liron on 31 May 2014 at 3:44

GoogleCodeExporter commented 9 years ago

Even so, you wouldn't be able to prevent "foo(10)" from executing here:

int foo(int bar){

  if(bar == 0){
    throw(RuntimeException, "Oops");
  }

  return(1);
}

...
e4c_using_context(E4C_FALSE) {
  #pragma omp parallel for
  for (int i = 0; i < 10; i++) {
    int foobar = foo(100) && foo(0) && foo(10); /* foo(10) should never be executed */
    e4c_omp_end();
    printf("This should never be printed.\n");
  }
}
...

There should be another way...

Original comment by guillermocalvo on 31 May 2014 at 3:59

GoogleCodeExporter commented 9 years ago

I can't think of one... OpenMP has intrinsic C++ support, where it of course 
supports C++'s exception unwinding. But for C there is simply no equivalent.

There is a chance that OpenMP support for exceptions would not be possible 
without a very awkward syntax.

However, I'm hoping that at least try/finally can be made to work.

Original comment by tal.liron on 31 May 2014 at 4:30

GoogleCodeExporter commented 9 years ago

Some more thoughts...

A solution can be to support a new "nested" e4c context:

e4c_using_context(E4C_FALSE) {
  #pragma omp parallel for
  for (int i = 0; i < 10; i++) {
    e4c_using_nested_context() {
      try { throw(RuntimeException, "Oops"); }
      catch { ... };
    }
    // code here will be executed even if an exception was uncaught above
  }
}

What the macro could do is create a special e4c context, which is unrelated to 
the "normal" e4c context that wraps the above code. (This will allow you to 
nest the OpenMP context without breaking regular e4c code: note that OpenMP 
will be using the current thread as part of its workload.)

Instead of ending the thread, you can jump to the end of the 
e4c_using_nested_context() macro. Thus, that internal context follows the 
familiar e4c semantics. If there's code after that macro, of course it will 
continue to be executed, but the user would expect that because the context is 
closed.

The throw() will recognize if it's in a special nested context and simply use 
that environment instead of "normal" e4c environment.
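
A rough sketch of the "jump to the end of the nested block" idea, using plain `setjmp`/`longjmp`. Everything here (`struct nested_context`, `nested_throw`, `risky`, `run_loop`) is hypothetical and only stands in for what such a macro might expand to; it is not how e4c is implemented.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical per-block context: a jump target plus the pending message. */
struct nested_context {
    jmp_buf exit_point;
    /* volatile: written between setjmp and longjmp, read afterwards */
    const char *volatile message;   /* NULL while nothing has been thrown */
};

/* Hypothetical throw: unwinds to the end of the enclosing nested block. */
static void nested_throw(struct nested_context *ctx, const char *message) {
    ctx->message = message;
    longjmp(ctx->exit_point, 1);
}

/* Stand-in for user code that throws on even indices. */
static void risky(struct nested_context *ctx, int i) {
    if (i % 2 == 0) nested_throw(ctx, "Oops");
}

/* Returns how many iterations ended with an exception. */
int run_loop(int n) {
    int failures = 0;

    #pragma omp parallel for reduction(+:failures)
    for (int i = 0; i < n; i++) {
        struct nested_context ctx = { .message = NULL };  /* one per iteration */

        if (setjmp(ctx.exit_point) == 0) {
            risky(&ctx, i);   /* may longjmp back to the setjmp above */
        }
        /* Control always reaches this point, thrown or not. */
        if (ctx.message != NULL) failures++;
    }
    return failures;
}
```

The jump never leaves the loop iteration, so the thread itself is not terminated; only the nested block is abandoned.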

Now, there is one tricky problem: what to do with exceptions that are not 
caught in the nested context? Example:

e4c_using_context(E4C_FALSE) {
  e4c_context_handle parent = e4c_get_current_context_handle();
  #pragma omp parallel for
  for (int i = 0; i < 10; i++) {
    e4c_using_nested_context(parent) {
      try { throw(RuntimeException, "Oops"); }
      catch { ... };
    }
  }
  e4c_throw_nested();
}

As you can see, this requires some kind of API to expose the parent context so 
that it can be shared between all OpenMP threads. Uncaught nested exceptions 
will be stored in a special thread-safe list in the parent (if it's provided).

Now, we need to get these exceptions back to the parent context. That's what 
the e4c_throw_nested() call is for: it will go through these one by one and 
throw them as usual! So, the nested exception will be supported as usual in the 
"normal" context. (What to do if there's more than one uncaught nested 
exception? I'm not sure: we can throw just the first one and provide an API for 
users to get all nested exceptions if they want to traverse them.)

I know it's a bit clumsy to use, but I think this can work, and may actually 
allow all kinds of specialized "nested" mechanisms.
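
The "thread-safe list in the parent" could look roughly like the sketch below. The names `pending_list`, `pending_add`, `pending_first` and `collect` are invented for this illustration, and a fixed-size array with a named `critical` section is just one assumed way to do it.

```c
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical parent-side store for exceptions that escaped nested contexts. */
struct pending_list {
    const char *messages[MAX_PENDING];
    int         count;
};

/* Called by a worker when its nested context ends with an uncaught exception. */
static void pending_add(struct pending_list *list, const char *message) {
    #pragma omp critical(pending_list)
    {
        if (list->count < MAX_PENDING) list->messages[list->count++] = message;
    }
}

/* Returns the first stored message, or NULL if there is none.
   A real e4c_throw_nested() would rethrow it in the parent context. */
const char *pending_first(const struct pending_list *list) {
    return list->count > 0 ? list->messages[0] : NULL;
}

/* Runs n iterations; every third one pretends to leave an uncaught exception. */
int collect(struct pending_list *list, int n) {
    list->count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (i % 3 == 0) pending_add(list, "Oops");
    }
    return list->count;
}
```

Which exception ends up "first" depends on thread scheduling, which is one reason an API for traversing all of them would be useful.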

Original comment by tal.liron on 31 May 2014 at 6:10

GoogleCodeExporter commented 9 years ago

On closer look, there can't be a generalized nested context macro because it 
requires thread-local storage to work, and the API differs per implementation. 
In OpenMP you need to use the "threadprivate" directive, which applies to 
global vars:

e4c_context nested_context;
#pragma omp threadprivate(nested_context)

As usual, OpenMP is very cool in that it will automatically generate 
thread-local code for this everywhere. But this can't easily be generalized for 
other threading libraries, which would likely require API calls. So I suggest 
feeding the macro a function that would return the nested context:

e4c_context get_omp_nested_context() {
  return nested_context; // omp will automatically make this thread-local due to the pragma
}

e4c_using_nested_context(parent, get_omp_nested_context) {
...

Now you can feed it a different function to use in different kinds of nested 
situations.
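
A small compilable sketch of the getter idea follows. `struct nested_ctx`, `context_getter` and `mark_uncaught` are hypothetical stand-ins, and the getter returns a pointer rather than a copy (an assumption on my part) so that changes land in the calling thread's own instance.

```c
/* Hypothetical context type; the real e4c_context is not shown in this thread. */
struct nested_ctx { int uncaught; };

static struct nested_ctx omp_ctx;
#pragma omp threadprivate(omp_ctx)

/* OpenMP flavour of the getter: each thread gets the address of its own copy. */
struct nested_ctx *get_omp_nested_context(void) {
    return &omp_ctx;
}

/* Generic code only sees the getter, never the storage mechanism. */
typedef struct nested_ctx *(*context_getter)(void);

/* Records one uncaught exception in whatever context the getter returns. */
int mark_uncaught(context_getter get) {
    struct nested_ctx *ctx = get();
    ctx->uncaught++;
    return ctx->uncaught;
}
```

A pthreads or SDL build would supply a different getter with the same signature.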

Original comment by tal.liron on 1 Jun 2014 at 5:09

GoogleCodeExporter commented 9 years ago

I've been reading up on OpenMP and found out that this library does not get 
along very well with C++ exception system either: 
<http://bisqwit.iki.fi/story/howto/openmp/#SomeSpecificGotchas>

In my opinion, OpenMP is designed as a simple API which provides concurrency 
without worrying about thread details. I think the most coherent solution would 
be to use OpenMP and **exceptions4c lightweight version** together.

I've modified the lightweight version to provide concurrent exception contexts 
in a very simple way. This version can be configured by defining 
`E4C_MAX_CONTEXTS` (maximum number of concurrent exception contexts supported) 
and `E4C_CONTEXT_NUMBER` (a macro that retrieves the current context number).
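
For readers without the attachment, the idea can be sketched as an array of contexts indexed by the current context number. This is only an assumed illustration: `struct context`, `CURRENT_CONTEXT` and `touch_contexts` are made-up names, not the lightweight version's actual code.

```c
#ifdef _OPENMP
#include <omp.h>
#else
static int omp_get_thread_num(void) { return 0; }  /* serial fallback */
#endif

#ifndef E4C_MAX_CONTEXTS
#define E4C_MAX_CONTEXTS 4
#endif
#ifndef E4C_CONTEXT_NUMBER
#define E4C_CONTEXT_NUMBER omp_get_thread_num()
#endif

/* Hypothetical stand-in for the library's per-context state. */
struct context { int frames; };

static struct context contexts[E4C_MAX_CONTEXTS];

/* Picks the slot that belongs to the calling thread. */
#define CURRENT_CONTEXT (&contexts[E4C_CONTEXT_NUMBER])

/* Has every thread of a team touch its own slot; returns the team size. */
int touch_contexts(void) {
    int used = 0;
    #pragma omp parallel num_threads(E4C_MAX_CONTEXTS) reduction(+:used)
    {
        CURRENT_CONTEXT->frames = 1;   /* each thread writes only its own slot */
        used += 1;
    }
    return used;
}
```

This only works if the context number never reaches `E4C_MAX_CONTEXTS`, which is why the team size is capped here.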

The attached example works with the following compiler options:

 * -fopenmp
 * -DE4C_MAX_CONTEXTS=4
 * -DE4C_CONTEXT_NUMBER=omp_get_thread_num()

Please check it out and let me know what you think :)

Original comment by guillermocalvo on 1 Jun 2014 at 4:08

Attachments:

GoogleCodeExporter commented 9 years ago

Well, for me robust OpenMP is critical. So, what I've done is rolled my own 
exception library... It's in many ways much simpler than e4c, but it has all 
the features I need. (Also not very portable: requires c99, and probably gcc, 
too.) And it does support OpenMP entirely, though with a few extra macros, such 
as those I mentioned in my suggestions above.

Let me try to stabilize it, and I'll be happy to send you the code! However, 
the structure is very different from e4c: indeed, I designed it from the start 
exactly to be able to support OpenMP. (Though it also supports pthreads and SDL 
threads, and allows you to "relay" from one storage scope to another.)

Original comment by tal.liron on 1 Jun 2014 at 4:17

GoogleCodeExporter commented 9 years ago

I thought it might be useful for you, in the meantime, to see how my code looks 
(and works):

static int main_thread(void *) {
  with_exceptions(sdl) {
  try {
    try {
      sub();
    }
    catch(10) {
      printf("caught: %s, %s, %d\n", exception.message, exception.location.file, exception.location.line);
    }
    finally {
      printf("finally1\n");
    }
    printf("outside1\n");
  }
  finally {
    printf("finally2\n");
  }
  printf("outside2\n");
  printf("uncaught exceptions: %d\n", gluon_exception_count());
  }

  return 0;
}

static void sub() {
  with_exceptions_relay(openmp, sdl) {
  #pragma omp parallel for
  for (int i = 0; i < 5; i++) {
    capture {
      if (i % 2 == 0)
        throwf(10, "error2 in loop %d", i);
      printf("loop %d\n", i);
    }
    printf("outside3\n");
  }
  throw_captured();
  }
}

int main(int argc, char *argv[]) {
  initialize_exceptions(posix);
  initialize_exceptions(sdl);

  int r = 0;
  SDL_Thread *thread1 = SDL_CreateThread((SDL_ThreadFunction) main_thread, "Main1", null);
  SDL_Thread *thread2 = SDL_CreateThread((SDL_ThreadFunction) main_thread, "Main2", null);
  SDL_Thread *thread3 = SDL_CreateThread((SDL_ThreadFunction) main_thread, "Main3", null);
  SDL_WaitThread(thread1, &r);
  SDL_WaitThread(thread2, &r);
  SDL_WaitThread(thread3, &r);

  shutdown_exceptions(global);
  shutdown_exceptions(posix);
  shutdown_exceptions(openmp);

  return r;
}

The big difference between my code and e4c is that you always need to use 
"with_exceptions" in the local scope, even if you just want to throw(). The 
reason for this is that the macro creates a scope variable locally through 
which it can access the exception context, but indeed each technology has its 
own way of retrieving the context, and there's simply no way to find this out 
dynamically.

So, for example you see that main_thread() has "with_exceptions(sdl)", meaning 
that the execution context will be retrieved according to an SDL thread-local 
storage, guaranteeing a unique context per thread. Same with POSIX and OpenMP.

One interesting feature is in sub(): there you see 
"with_exceptions_relay(openmp, sdl)". This means that the block instead uses 
OpenMP thread-local storage to get the context, however at the end of the block 
it will relay all uncaught exceptions to the SDL-retrieved context.

More on topic, "capture" gathers all uncaught exceptions and stores them in the 
local scope. This means that exceptions from all the OpenMP threads in the 
"#pragma omp parallel" section will be there after the for-loop completes. And 
then throw_captured() makes sure to throw them (actually, it will only throw 
the first one it finds, because each thread can create an uncaught exception, 
but our semantics only know how to handle one).

I wanted to point out the printf("outside3\n"). As you can see, it's outside 
the "capture", so that code is executed even if an exception is thrown. This is 
relevant to our earlier discussion on how to exit OpenMP threads: I'm not 
exiting them. Actually, I'm not exiting from POSIX or SDL threads, either: 
uncaught exceptions are left dangling. If the user wants to handle all 
exceptions, there must be an uppermost "catch" somewhere. (There is also an API 
to see if there are any uncaught exceptions in the current context.)

Anyway, my API is probably too cumbersome for what you want with e4c: the 
requirement of adding "with_exceptions" everywhere is surely too much for a 
general-purpose library. However, this example might give you some more ideas 
on how to implement OpenMP support and integrate it with other threading libraries.

Original comment by tal.liron on 1 Jun 2014 at 6:36

GoogleCodeExporter commented 9 years ago

It turns out that OpenMP support for the lightweight version was much simpler 
than I had thought:

    #pragma omp threadprivate(e4c)

That's it! Everything works as expected now.

This directive gives each thread its own copy of the "global" exception context 
(e4c).

Most likely I will update exceptions4c (standard version) in order to provide 
OpenMP support, using a similar solution.

Original comment by guillermocalvo on 5 Jun 2014 at 6:06

Changed state: Started

GoogleCodeExporter commented 9 years ago

Unfortunately, it's not so simple. :)

It's true that it will give individual OpenMP threads their own context, but 
the way OpenMP is used is quite different: you usually have certain sections 
within single-threaded code that are marked as parallel. Your usage does allow 
for these sections to have their internal try/catch, but exceptions thrown 
inside the parallel section cannot be "relayed" to the main context without some 
trickery.

Also note that OpenMP implementations use the current thread as one of the 
threads for the parallel section, so that one thread would actually be using 
the context of the main section outside of the parallel code.
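
The point about the current thread joining the team can be checked with a tiny sketch. The variable `depth` and the function `master_sees_own_copy` are hypothetical and stand in for a threadprivate exception context.

```c
/* Hypothetical per-thread state, standing in for an exception context. */
static int depth = 0;
#pragma omp threadprivate(depth)

/* Returns the value thread 0 of the team sees for its threadprivate copy. */
int master_sees_own_copy(void) {
    int seen_by_thread0 = -1;

    depth = 7;   /* set in serial code, i.e. in the main section's copy */

    #pragma omp parallel
    {
        #pragma omp master
        {
            /* Thread 0 is the thread that entered the region, so this is
               the very same copy that was written above. */
            seen_by_thread0 = depth;
        }
        /* The other threads each see their own, separate copy. */
    }
    return seen_by_thread0;
}
```

So whatever thread 0 does to its context inside the parallel section is also visible to the surrounding serial code.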

I really don't think you can get around it without a specialized mechanism...

I'll be publishing my own code soon (MIT-license) and will be more than happy 
to contribute any of my solutions to e4c, if they are applicable. Still dealing 
with some bugs regarding rethrown exceptions... thinking in jump points is 
hard. :)

Original comment by tal.liron on 5 Jun 2014 at 6:12

GoogleCodeExporter commented 9 years ago

I don't believe this solution (based on threadprivate) is cumbersome, really:

#define KEEP_CALM(EXCEPTION) (exception = *EXCEPTION, 1)

void foo(){

    volatile e4c_exception exception = {0}; /* last exception thrown (if any) */
    volatile int error_flag = 0;            /* prevents thread from working */
    volatile int number;                    /* iteration index */

    printf("BEGIN{\n");

    #pragma omp parallel for private(number) firstprivate(error_flag) num_threads(4)
    for(number = 0; number < 20; number++){

        if(error_flag) continue;

        /* each thread gets its own exception context */
        e4c_reusing_context(error_flag, E4C_ON_FAILURE(KEEP_CALM) ){

            foobar(number); /* foobar might throw exceptions */
        }
    }

    printf("}END\n");

    /* check if an exception was thrown */
    if(exception.name){

        printf("Last error was: %s (%s)\n", exception.name, exception.message);
    }
}


This code is currently working on my test version of the library. Not that 
"tricky", is it?

Advantages:

 - Each thread gets its own exception context, for its entire lifetime.
 - You can make sure that no exceptions will "slip out" of the thread by "reusing" the context.
 - Keeping track of the last exception thrown is simple, and has a small overhead.

Of course, supporting pthreads, SDL and OpenMP, all at once, is a different 
story and probably comes at a cost (in terms of simplicity and portability) 
that my little library is unlikely to be able to afford.

Anyway I'm looking forward to your library, I'm pretty certain that it will 
give me a deeper insight into these systems and I hope to provide better 
support for them in exceptions4c. So, please, keep me posted :)

Original comment by guillermocalvo on 5 Jun 2014 at 8:08

Attachments:

main.c

GoogleCodeExporter commented 9 years ago

It's not too bad, I agree, and that's pretty much the recommended method to use 
OpenMP with C++ exceptions. Still, in my opinion it's not very elegant: the 
whole point of using exceptions is that we want to get rid of antique error 
flags. :)

I just published my library, you can see how I attempted to make things a bit 
more transparent:

https://github.com/tliron/exceptional-c-exceptions

Original comment by tal.liron on 7 Jun 2014 at 8:52

GoogleCodeExporter commented 9 years ago

Hey, I just want to warn you to look closely at this:

#pragma omp threadprivate(e4c)

Actually, I was trying something similar and it seemed to work some of the time 
... but sometimes I got segfaults. The issue is that threadprivates are 
initialized with arbitrary data, and OpenMP offers no facility for initializing 
them. Even worse, there's no way to destroy them when your program ends. So, 
actually they are quite different from TLS in other threading libraries. They 
end up being far less useful than first appears.

A solution that works for proper initialization/destruction is using a regular 
"private" inside a "parallel" section that would wrap all your code:

http://stackoverflow.com/a/2353129/849021

So, it gets more complicated quickly ... I'm still trying to find the best 
solution to this problem.

Original comment by tal.liron on 8 Jun 2014 at 8:28

GoogleCodeExporter commented 9 years ago

Ah, so I found a stable solution:

1) You can put a self-contained "omp parallel" at the beginning and end of your 
usage to initialize and destroy the threadprivates for every thread in the team. 
For e4c, you may want to encapsulate these into e4c_using_context.

2) I discovered an interesting quirk: in gcc/linux, at least, a separate OpenMP 
thread team is created per POSIX thread. So if you're mixing OpenMP and POSIX, 
you have to make sure to initialize/destroy the team within the thread 
function.
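
A minimal sketch of point 1, with made-up names (`scratch`, `contexts_init`, `contexts_destroy`, `work`). It assumes the team size stays the same between parallel regions (dynamic thread adjustment off), since threadprivate values are only guaranteed to persist across regions under that kind of condition.

```c
#include <stdlib.h>

/* Hypothetical per-thread resource that needs explicit setup and teardown. */
static int *scratch = NULL;
#pragma omp threadprivate(scratch)

/* Runs once on every thread of the team before the real work. */
void contexts_init(void) {
    #pragma omp parallel
    {
        scratch = malloc(sizeof *scratch);
        if (scratch) *scratch = 0;
    }
}

/* Matching teardown, again executed by every thread of the team. */
void contexts_destroy(void) {
    #pragma omp parallel
    {
        free(scratch);
        scratch = NULL;
    }
}

/* The actual parallel work; counts iterations that found an initialized copy. */
int work(int n) {
    int total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        if (scratch) { *scratch += 1; total += 1; }
    }
    return total;
}
```

When POSIX threads are mixed in, per point 2, the init/destroy pair would have to be called inside each thread function rather than once in `main`.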

I'm running a rather large application that does a lot of mixing of POSIX 
(well, via SDL wrappers) and OpenMP sections, and this solution is working 
great! Exceptions thrown in a parallel section get unwound all the way to the 
main function.

Original comment by tal.liron on 8 Jun 2014 at 10:58

GoogleCodeExporter commented 9 years ago

Thanks for the heads-up!

So far I have not stumbled upon any segfaults. In fact, I was expecting 
`threadprivate` variables to be *properly* initialized. According to the specs, 
the threadprivate directive must do that:

> Each copy of a threadprivate variable is initialized once, in the manner 
> specified by the program, but at an unspecified point in the program prior to 
> the first reference to that copy.

I believe that, as long as I don't leave the "global" exception context 
uninitialized, every thread gets its own exception context ready to work:

#pragma omp threadprivate(e4c)
...
struct e4c_context e4c = {0};
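
One way to test that expectation is a sketch like the following, where every thread checks that its copy starts with the values from the static initializer. `struct context`, `ctx` and `count_uninitialized` are hypothetical names, not part of exceptions4c.

```c
/* Hypothetical context with a non-trivial static initializer. */
struct context { int frames; int ready; };

static struct context ctx = { 0, 1 };
#pragma omp threadprivate(ctx)

/* Counts the threads whose private copy does NOT match the initializer. */
int count_uninitialized(void) {
    int bad = 0;
    #pragma omp parallel reduction(+:bad)
    {
        if (ctx.frames != 0 || ctx.ready != 1) bad += 1;
    }
    return bad;
}
```

A nonzero result would point to the arbitrary-data problem described earlier in the thread.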


Regarding `threadprivate` destruction, the specs state:

> The storage of all copies of a threadprivate variable is freed according to 
> how static variables are handled in the base language, but at an unspecified 
> point in the program.

As for the lightweight version of exceptions4c, I don't need to destroy the 
exception context (since no dynamic allocation is used), so it shouldn't be an 
issue.

Anyhow I will spend more time on testing to make sure `threadprivate` variables 
work the way they're supposed to.

By the way, I haven't yet had time to check your library thoroughly, but it 
looks really good :)

Original comment by guillermocalvo on 8 Jun 2014 at 2:36

GoogleCodeExporter commented 9 years ago

Ah, I see, I indeed didn't realize that the lightweight version is so different.

Please don't feel that I created my library to compete with yours. :) I was and 
am happy to contribute to e4c, and indeed e4c served me very well for a long 
time. But I had in my mind a concept of how to support OpenMP that would 
require major refactoring of e4c, and indeed change its usage too much...

I see no problem with multiple solutions existing in the world of free and open 
source software! Indeed, it's why I much prefer C to C++: I get to decide how 
to implement the features I want. C++ does a lot of stuff for us, but not 
always in the way that we want.

Original comment by tal.liron on 8 Jun 2014 at 2:45

GoogleCodeExporter commented 9 years ago

Sure, I have no problem whatsoever with other libraries :)

I'm glad if mine served and somewhat inspired you to create your own. I see 
your point and what "Exceptional C Exceptions" aims for. It has specific goals, 
and they are perfectly valid.

I'm looking forward to our mutual collaboration and I also would like to thank 
you for helping me improve exceptions4c.

Original comment by guillermocalvo on 8 Jun 2014 at 6:20
