Open srirajpaul opened 5 years ago
@srirajpaul some preliminary bullet points on this:
thread is actually a GCC keyword (not something in the C specification). While our only other explicitly supported compiler is clang, a quick google of "clang thread support" suggests it isn't really complete: https://www.google.com/search?safe=strict&ei=X4lhXMObH4vSvgTWi6Mo&q=clang+__thread+support&oq=clang+__thread+support&gs_l=psy-ab.3..35i39j0i22i30l2.262.890..990...0.0..0.295.508.2-2......0....1..gws-wiz.......0i71.BhbzlL46ZFY. It looks like at least some version of icc supports __thread as well though. Still, don't necessarily love the idea of adding in a non-spec part of the language to the code base, even one that is now semi-spec given it's supported by multiple major compilers.
There is a thread-local extension in the C11 spec, but we don't require C11 support today (just C++11). https://en.cppreference.com/w/c/thread/thread_local
So, we'd need to choose between relying on __thread and possibly losing some compiler flexibility, or going with the C11 standard and possibly losing some older compilers.
@agrippa C11 support is not needed today, but the Makefile adds the c11 flag while compiling HClib. Therefore using _Thread_local storage class would not cause any disruption for whoever is using HClib. Since __thread is not a part of C standard, we can ignore it.
Good point about c11, I hadn't realized that. I've added some text to the README explicitly calling std=c11 out as a requirement, and submitted a pull request. Could you take a quick look? It's a trivial change.
So... are you volunteering to make this change? :)
I will make this change.
boost context routines don't guarantee the correctness of thread local variables. They mentioned in their document about this issue. When I tested for my previous work using boost context, thread local is OK in Linux but not on MacOS. As we can see, context switching routines are written in assembly for each OS and architecture. We can choose to use thread local for x86 / linux and pthread routines for others.
Seonmyeong,
You make a good point. On a related note, I think it's fine if we limit the number of platforms supported by HClib to a smaller number that we are confident of supporting well. While MacOS support is a convenience, it need not be a requirement. This issue has also come up when trying to make HClib builds cmake-compatible.
Best,
Vivek
Vivek Sarkar, http://vsarkar.cc.gatech.edu Professor, School of Computer Science, Klaus 2332, +1-404-385-5989 Stephen Fleming Chair for Telecommunications, College of Computing, Georgia Institute of Technology Co-Director, Center for Research into Novel Computing Hierarchies (CRNCH) Assistant: Kenya Payton, kenya.payton@cc.gatech.edumailto:kenya.payton@cc.gatech.edu, Klaus 2115, +1-404-385-6440
On Jun 5, 2019, at 1:23 AM, Seonmyeong Bak notifications@github.com<mailto:notifications@github.com> wrote:
boost context routines don't guarantee the correctness of thread local variables. They mentioned in their document about this issue. When I tested for my previous work using boost context, thread local is OK in Linux but not on MacOS. As we can see, context switching routines are written in assembly for each OS and architecture. We can choose to use thread local for x86 / linux and pthread routines for others.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/habanero-rice/hclib/issues/70?email_source=notifications&email_token=AAP3LYFSVN2OZPV72LQUIATPY5EUZA5CNFSM4GWN7UDKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODW6UHKQ#issuecomment-498942890, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAP3LYGQSDWANIZVI7JDBH3PY5EUZANCNFSM4GWN7UDA.
@sbak5 That is a very good point. We will be careful during our implementation to make sure thread local variables/pthread thread specific data not lost during context switch/and back. Currently we save the required data from thread specific storage before boost context switch and restore them when it switches back. So if the concern with boost context is that the continuation might run on a different thread/worker and hence data stored in thread_local is not valid, that problem won't happen. Still we will keep both thread_local and pthread apis and allow to switch between them using preprocessor macros.
@srirajpaul Currently, I think HClib only supports task-level parallelism, where all the closures are executed once as OpenMP tasks. So, it's fine currently because the tasks doesn't keep its status and context when they're switched back. When we extend HClib to SPMD and user-level threads which have stack and do context switching as in pthread, this can be an issue. When M:N mapping is done (mapping ULT onto kernel level thread workers), User-level threads behave as kernel level threads while it is running on the certain kernel level thread. When they're switched back, all the values in the thread local variables should be also switched back.
I don't know the exact reason but when ULTs using boost context are switched onto kernel level threads, thread local variable values are not maintained even when they're managed explicitly. (weird garbage value is stored after some number of context switching happens.)
@vsarkar I'm not sure if it works on IBM Power with thread local variables. I guess it's better to use some wrappers where either pthread_set/getspecific and thread local variables are used depending on the underlying OS and architecture.
pthread_get/getspecific is used to save and retrieve the state of the worker. We can use thread storage class specifier to do the same which is supported by all the major compilers. Also, thread is much more efficiently implemented by the compiler.