SCOREC / rhel7-dev

dotfiles, scripts, and documentation useful for developers using the newer RHEL7 systems at RPI
GNU General Public License v3.0

PETSC failure on some RHEL machines #3

Open jacobmerson opened 5 years ago

jacobmerson commented 5 years ago

PETSc's VecAYPX is segfaulting on Jenga. The same code seems to work fine on the RHEL7 builds we have for Professor Picu's lab.

Note: I am using petsc/3.9.3-int32-real-c-ceyty2w
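For anyone putting together a small reproducer: VecAYPX(y, alpha, x) computes y = x + alpha*y. The sketch below is a pure-Python reference of that update, not PETSc code, and the function name `aypx` is made up for illustration. It is only meant for checking expected values, under the assumption that both vectors have the same length.

```python
def aypx(y, alpha, x):
    """Return x + alpha * y elementwise (the update VecAYPX applies to y).

    Hypothetical helper for checking expected values; it does not call PETSc.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [xi + alpha * yi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    x = [10.0, 20.0, 30.0]
    print(aypx(y, 2.0, x))
```

If a reproducer produces these values on one machine and crashes on another, that points at the install rather than the calling code.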

jacobmerson commented 5 years ago

@wrtobin mentioned that he was going to run the petsc tests as part of the spack build. It would be useful to know if those tests ended up passing.

Also, as we discussed, petsc is currently being built with dynamic libs. We may want to have only static libs so that our build process stays compatible with BGQ.

wrtobin commented 5 years ago

The PETSc spack build currently does not complete when the tests are configured to run. I don't know why yet; it is possibly a bug. I may be able to look into it more later this week after the fusion meeting Thurs morning.

jacobmerson commented 5 years ago

It seems like the superlu config may be part of the issue.

wrtobin commented 5 years ago

Yeah, this is still an issue. I looked at this more closely and ran 'make test' myself in the spack-created configure/build directory for PETSc, and all the tests are failing.

wrtobin commented 5 years ago

This is on Jenga since that is my office machine. I haven't looked on any other machines. Can you point me to a machine in Picu's lab that you've noted is working? @jacobmerson

wrtobin commented 5 years ago

Tagging @cwsmith here because I also just emailed him about this issue.

jacobmerson commented 5 years ago

I had petsc working on backgammon and petsc; however, as of this week I was having trouble compiling on backgammon because limits.h was missing...

wrtobin commented 5 years ago

Oh you compiled your own PETSc? Have you successfully used the spack-installed PETSc to link and execute any application binaries without encountering a memory violation error?

jacobmerson commented 5 years ago

I used to use my own version, but the spack version worked on those machines.

wrtobin commented 5 years ago

Okay, I'll try executing the PETSc test suite on one of those.

wrtobin commented 5 years ago

The test run driven by spack still fails with the same issue, but running 'make test' by hand succeeds on backgammon.

wrtobin commented 5 years ago

Soooooooooooo I'm confused.

wrtobin commented 5 years ago

Calling 'make test' on backgammon succeeds for all tests.

With an identical spack config on jenga, 'make test' fails.
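One way to narrow down a difference like this is to compare which shared libraries the PETSc build resolves to on each machine. The sketch below assumes the output of `ldd` on the PETSc library has been saved as text from both hosts. The helper names and the sample library paths are hypothetical and only illustrate the comparison.

```python
def parse_ldd(text):
    """Map each library name in saved `ldd` output to the path it resolved to."""
    libs = {}
    for line in text.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # e.g. the dynamic loader line, which has no "=>"
        name, _, rest = line.partition("=>")
        # Drop the trailing load address such as "(0x00007f...)".
        path = rest.split("(")[0].strip()
        if not path:
            continue  # e.g. linux-vdso, which has no file on disk
        libs[name.strip()] = path  # path is "not found" for unresolved libs
    return libs


def diff_libs(a, b):
    """Return {name: (path_a, path_b)} for libraries that differ between hosts."""
    diffs = {}
    for name in sorted(set(a) | set(b)):
        pa, pb = a.get(name), b.get(name)
        if pa != pb:
            diffs[name] = (pa, pb)
    return diffs


# Hypothetical sample data; real input would come from `ldd` on each host.
BACKGAMMON_LDD = """
    linux-vdso.so.1 =>  (0x00007ffc0b5f2000)
    libparmetis.so => /opt/example/parmetis-a/lib/libparmetis.so (0x00007f10)
    libm.so.6 => /lib64/libm.so.6 (0x00007f20)
"""
JENGA_LDD = """
    linux-vdso.so.1 =>  (0x00007ffd1c3e9000)
    libparmetis.so => /opt/example/parmetis-b/lib/libparmetis.so (0x00007f30)
    libm.so.6 => /lib64/libm.so.6 (0x00007f40)
"""

if __name__ == "__main__":
    differences = diff_libs(parse_ldd(BACKGAMMON_LDD), parse_ldd(JENGA_LDD))
    for name, (path_a, path_b) in differences.items():
        print(f"{name}: backgammon={path_a} jenga={path_b}")
```

Libraries that resolve to different paths, or that show up as "not found" on one host only, would be the first things to look at.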

wrtobin commented 5 years ago

I've submitted a support ticket to dotcio (aka Bob) describing this issue in detail. I'm currently at a loss for how to proceed, and until this works I'm stuck on development of some things for m3dc1.

wrtobin commented 5 years ago

The default spack version of PETSc is now 3.10.2, and the installation of this version (on Jenga) appears to pass PETSc's testing suite during build.

@cwsmith Do we have a substantial number of people using the 3.9.3 version that you know of, and do you think switching our install to the 3.10.2 version and possibly removing the (seemingly unstable) 3.9.3 version might be the way to go?

cwsmith commented 5 years ago

@wrtobin AFAIK, the only petsc users were XGCm developers; feel free to replace the existing installs.

cwsmith commented 5 years ago

As I understand it, part of the issue was an incompatibility with the installed version of (par)metis. All the gcc 7.3.0 and mpich 3.1 spack packages were removed (for a different reason), and PETSc 3.11 was reinstalled prior to the other gcc 7.3.0 and mpich 3.3 spack packages.

Would you please rerun any of the problematic cases/tests?

jacobmerson commented 5 years ago

@wrtobin can you rerun the petsc test suite? I will rerun the original offending Biotissue code, but it might take me a couple days to get to it.