Closed dongahn closed 6 years ago
This might also be due to the security patch for the local root stack smashing vulnerability in position-independent executables.
https://www.google.com/amp/s/www.theregister.co.uk/AMP/2017/09/28/linux_kernel_vuln/
It seems this CI error is something different. Could @simoatze or @jprotze take a quick look at it?
I am still chasing the problem at LLNL caused by this security patch.
@dongahn @jprotze Doing a "sudo pip install --upgrade pip" solves the problem, but the compilation of archer fails because some definitions have changed, such as "ompt_task_undeferred". @jprotze I noticed you made the last commit to archer on the master branch; did you mean to make it on the "towards_tr4" branch? It looks like that's the reason why it is failing now. Let me know. I can move it to the towards_tr4 branch and maybe create a stable branch and start merging towards_tr4 with the master.
FYI -- it turned out that the failures on LC TOSS systems were caused by a security risk mitigation measure applied before the actual security patch went in. We applied:
vm.legacy_va_layout=1
Now that we have the security patch applied, we will revert this to vm.legacy_va_layout=0
and Archer/Tsan should work again.
@jprotze: you mentioned you saw a similar problem on a CEA machine. Could you try
sysctl --all | grep legacy_va_layout
and see what value is set for the vm.legacy_va_layout parameter?
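For anyone who wants to script the check above, here is a minimal sketch in Python. It assumes a Linux system where the sysctl is also exposed at /proc/sys/vm/legacy_va_layout; the function names are hypothetical and not part of Archer or any existing tool. It either parses `sysctl --all` style output or reads the value directly from procfs.

```python
from pathlib import Path
from typing import Optional

# Assumption: on Linux, vm.legacy_va_layout is exposed at this procfs path.
PROC_PATH = Path("/proc/sys/vm/legacy_va_layout")


def parse_sysctl_value(output: str, key: str = "vm.legacy_va_layout") -> Optional[int]:
    """Return the integer value for `key` in `sysctl --all` style output, or None."""
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return int(value.strip())
    return None


def read_legacy_va_layout(path: Path = PROC_PATH) -> Optional[int]:
    """Read the setting directly from procfs; None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_legacy_va_layout()
    if value is None:
        print("vm.legacy_va_layout could not be read on this system")
    elif value == 1:
        # Per this thread, the legacy layout is what broke Archer/TSan.
        print("vm.legacy_va_layout = 1 (legacy layout; Archer/TSan may fail)")
    else:
        print(f"vm.legacy_va_layout = {value}")
```

A value of 1 would match the mitigation setting described above; 0 is the setting under which Archer/Tsan is expected to work again.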
We had a little bit of a messy situation. The master wasn't working anymore because it had changes related to the current version of the OpenMP runtime. So I fixed all the branches and created a "stable-tr4" branch, which works with the "stable-tr4" branch of the OpenMP runtime; this is also the one related to the release. Now we have the branch "last_patch", which includes Joachim's last commit and the fix for Travis CI. When I run the tests, about 12 of them fail, mostly related to OMPT, which means that we probably have to update some definitions to the current version of the OpenMP runtime. I'll fix those and merge everything on master, so we'll have:
I'll also update the README to reflect all the changes. Once I am done with this I'll do the tests on rzmanta again to see what's still wrong with the LOMP runtime.
Great. Thank you so much @simoatze!
My account was closed in the meantime, but I asked for the value and indeed:
vm.legacy_va_layout = 1
I am a little bit confused. The OMPT version in LOMP is completely different from the one in LLVM OpenMP. The definitions in ompt.h don't match, so I can't make archer compile. Also, LOMP hasn't changed in the last two versions (Sep and Oct). Both our "stable-tr4" and "towards_tr4" branches are based on the LLVM OpenMP runtime, so what's IBM's plan, and when are they going to catch up? Otherwise we need to keep two versions of archer, one for LLVM and one for LOMP.
@simoatze: any way we can get "green" for Travis sooner rather than later?
@dongahn it should be green now. I had forgotten to update the travis.xml file with the upstream OpenMP runtime.
Thanks @simoatze! This is really helpful as we will soon cover PRUNERS in various venues.
Just noticed that the logo of our Travis CI test shows an error. I didn't have the time to look at this in any detail.