Closed sslattery closed 4 years ago
@junghans would appreciate your input on this as I probably messed stuff up. I know the install is not currently working.
@sslattery you don't want to merge the cajita history?
I looked into the merge unrelated history option. It seemed to me that it would try to merge both together at the top level directory which would create more of a mess than I wanted with the build. Is this true or can I use that option to merge it in as a subdirectory?
I'm not too attached to the idea of keeping around that old history.
Yes, you will have to merge the cajita source into a subdirectory before the merge.
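A minimal sketch of that approach, using two throwaway repositories as stand-ins (the names `cajita-src`, `host`, and `Grid.hpp` are hypothetical and not taken from this PR). The source tree is first moved into a subdirectory within its own history, and only then merged with `git merge --allow-unrelated-histories`, so nothing lands at the top level of the host repository:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Wrapper so commits work even without a global git identity configured.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Stand-in for the repository whose history should be imported.
git init -q cajita-src
cd cajita-src
echo "// grid" > Grid.hpp
g add Grid.hpp
g commit -q -m "cajita: add Grid.hpp"

# Step 1: move the sources into a subdirectory *before* the merge.
mkdir cajita
git mv Grid.hpp cajita/
g commit -q -m "cajita: move sources into cajita/"
cd ..

# Stand-in for the host repository.
git init -q host
cd host
echo "# top-level build" > CMakeLists.txt
g add CMakeLists.txt
g commit -q -m "host: initial commit"

# Step 2: fetch the other history and merge it despite the unrelated roots.
git fetch -q ../cajita-src
g merge -q --allow-unrelated-histories -m "Merge cajita history" FETCH_HEAD

# Both histories are now reachable, and the imported files sit under cajita/.
git log --oneline --graph
```

Because the move is an ordinary commit in the imported history, `git log --follow` on a file under `cajita/` should still trace back through its pre-move commits.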
With the one config change and adding back ArborX on jenkins everything built and passed tests.
I forgot we had pulled ArborX from Jenkins. I'm seeing example build errors on Travis, likely due to the bad install. Did you resolve those as well?
@sslattery now it has all the cajita history with 2c1e6b9 and 0b974e1 being identical contentwise.
@junghans awesome work - now looks like I have some tests to clean up
The CUDA build is passing and I have recreated the HIP errors on Jenkins in docker so working on that now. No progress on the stack smashing as my valgrind came back clean.
For the stack smashing stuff, I'm pretty sure GCC has flags to try and help you find it. Some documentation here: https://wiki.osdev.org/Stack_Smashing_Protector
Basically:
-fstack-protector: Check for stack smashing in functions with vulnerable objects. This includes functions with buffers larger than 8 bytes or calls to alloca.
-fstack-protector-strong: Like -fstack-protector, but also includes functions with local arrays or references to local frame addresses.
-fstack-protector-all: Check for stack smashing in every function.
Some operating systems have extended their compiler with more relevant options:
-fstack-shuffle: (Found in OpenBSD) Randomize the order of stack variables at compile time. This helps find bugs.
There's also apparently a tool called Mudflap, that can be used to find some stack smashing stuff: "adds runtime error checking for pointers that are typically the cause for many programming errors" (http://www.qnx.com/developers/docs/6.5.0/index.jsp?topic=%2Fcom.qnx.doc.ide.userguide%2Ftopic%2Fdebug_UsingMudflapInIDE_.html)
I'll see if I can re-create this locally with the flags, and let you know.
I'm having a hard time recreating the stack smashing either locally or on our cluster. I guess the best way forward is to either change Travis versions or add flags to the build there and hope...
OK HIP build is working. @dalg24 what do you make of the Jenkins CUDA errors? They have no real info that I can discern and I did not get those errors when I built with the CUDA docker image on my machine.
retest this please
OK now the `BovWriter` test is failing on CUDA, which leads me to believe there is a problem with that test.
Was not able to reproduce this error on CADES-condo with P100s. We might need to see if we can get on the test system to check. I did add a fence but that wouldn't explain the OpenMP issue.
Now getting another CUDA failure on Jenkins - seems random to me
I'm tempted to give up on the BovWriter, make it experimental, and disable the test or something.
I would support making BovWriter experimental. I have tried to reproduce the error and could not.
I think setting this aside is wise - don't want it to hold this PR up. (PS - having trouble calling in to the meeting so contributing any comments where I can)
Adds the Cajita source and attempts a preliminary unification of the build. Cajita now sits in a separate package in the `cajita/` directory. Everything sits in its own namespace for now - we can clean things up once the initial merge is complete.
One of the preliminary unification mechanisms is the handling of dependencies. Dependencies are now automatically checked for but a user can require them if they desire. We had discussed doing it this way before but if there are objections we should discuss. I could maybe see a situation where a dependency was in a system path but you didn't want to build against it because it didn't work or something? Perhaps the workaround here is an explicit enable/disable for dependencies.
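One possible shape for the explicit enable/disable, sketched as a configure wrapper. The function name `configure_cabana` and the option `Cabana_REQUIRE_MPI` are assumptions for illustration, not names confirmed by this PR. `CMAKE_DISABLE_FIND_PACKAGE_<PackageName>` is a standard CMake variable that turns off non-REQUIRED `find_package()` calls for that package; whether it applies here depends on the package name the build actually passes to `find_package()`.

```shell
# Hypothetical wrapper: require one dependency, explicitly skip another.
configure_cabana() {
    src_dir=$1
    shift
    # Cabana_REQUIRE_MPI is an assumed option name: fail the configure
    # step if MPI cannot be found instead of silently building without it.
    # CMAKE_DISABLE_FIND_PACKAGE_ArborX is standard CMake: skip the optional
    # ArborX lookup even if a copy is visible in a system path.
    cmake -S "$src_dir" -B build \
        -D Cabana_REQUIRE_MPI=ON \
        -D CMAKE_DISABLE_FIND_PACKAGE_ArborX=ON \
        "$@"
}
```

Usage would be something like `configure_cabana /path/to/source -D CMAKE_BUILD_TYPE=Release`, with any extra arguments forwarded to `cmake`.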