Closed hppritcha closed 2 weeks ago
@dalcinl here you go!
hmm, something is borked about configuring prrte for some of the jenkins tests
configure:5174: *** Configuring PRRTE
configure:63521: checking if PMIx version is 4.0.0 or greater
configure:63538: gcc -c -O3 -DNDEBUG -Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes -Wstrict-prototypes -Wcomment -Wshadow -Werror-implicit-function-declaration -fno-strict-aliasing -pedantic -Wall -Wformat-truncation=0 -finline-functions -mcx16 -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/include -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/include -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/ -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/ conftest.c >&5
conftest.c:526:1: warning: function declaration isn't a prototype [-Wstrict-prototypes]
main ()
^~~~
configure:63538: $? = 0
configure:63539: result: yes
configure:63614: ===== configuring 3rd-party/prrte =====
configure:63804: running /bin/sh ./configure --disable-option-checking '--prefix=/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/install' --enable-prte-ft --with-proxy-version-string=5.1.0a1 --with-proxy-package-name="Open MPI" --with-proxy-bugreport="https://www.open-mpi.org/community/help/" --disable-devel-check --enable-prte-prefix-by-default --disable-pmix-lib-checks --with-pmix-extra-libs="/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/src/libpmix.la" 'CPPFLAGS= -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/include -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/include -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/ -I/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/3rd-party/openpmix/' --cache-file=/dev/null --srcdir=.
configure:63824: ===== done with 3rd-party/prrte configure =====
configure:63847: error: PRRTE configuration failed. Cannot continue.
FWIW: on my PR, it kept complaining about not finding a valid PMIx build. Seemed like some issue with bringing down the PMIx submodule.
need to figure out what got borked in our prrte fork (we are careful about taking upstream commits but maybe not enough?) before advancing the sha @dalcinl
@hppritcha I don't believe that is the problem, though I could be wrong. When I tried to check OMPI main against head of upstream master branches, the problem I hit (which looked like the one you have here) was that Amazon kept failing to build the PR because PRRTE couldn't find a valid PMIx installation. Never was able to trace down a reason - looked/felt like Amazon simply couldn't download the PMIx submodule, but I'm not clear as to why that wouldn't have aborted the CI right then. Note that all the other CIs have no problem building it, so it is something unique about the Amazon Jenkins one.
Not sure of the reason - and I'm tied up for the next week. Just noting that it may not have anything to do with the PRRTE code.
okay now move to a suspicious (in terms of jenkins ci) sha
okay the problem is that the hwloc the jenkins CI is using is too old. The configure message isn't very clear though. Looks like updating the openpmix submodule may help with that.
Per discussion with OMPI rms, we raised the minimum hwloc version to 2.1
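For anyone who wants to check a build host against the new minimum by hand, here is a minimal sketch of a dotted-version comparison in POSIX shell. The helper name `version_ge` is made up for illustration; this is not how PMIx's configure performs the check, and it assumes purely numeric version fields.

```shell
# version_ge A B: succeed if dotted version A >= dotted version B.
# Hypothetical helper for illustration; handles numeric fields only.
version_ge() {
    a="$1"
    b="$2"
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa="${a%%.*}"   # leading field of A (empty once A is exhausted)
        fb="${b%%.*}"   # leading field of B
        if [ "${fa:-0}" -gt "${fb:-0}" ]; then return 0; fi
        if [ "${fa:-0}" -lt "${fb:-0}" ]; then return 1; fi
        # Drop the field just compared; missing fields count as 0.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0   # all fields equal
}
```

Assuming hwloc is discoverable through pkg-config on the host, something like `version_ge "$(pkg-config --modversion hwloc)" 2.1 || echo "hwloc too old"` would flag the Jenkins image before configure ever runs.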
Our configury isn't very smart about failing if PMIx fails to configure. It just trundles on:
configure: WARNING: PMIx requires HWLOC v2.1.0 or above.
configure: error: Please select a supported version and configure again
configure: ===== done with 3rd-party/openpmix configure =====
checking for pmix pkg-config name... pmix
checking if pmix pkg-config module exists... no
checking for pmix wrapper compiler... pmixcc
checking if pmix wrapper compiler works... no
configure: Searching for pmix in default search paths
checking for pmix cppflags...
checking for pmix ldflags...
checking for pmix libs... -lpmix
checking for pmix static libs... -lpmix
checking pmix.h usability... no
checking pmix.h presence... no
checking for pmix.h... no
configure: error: Could not find viable pmix build.
+ echo './configure --prefix="/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/install" --disable-silent-rules failed, ABORTING !'
./configure --prefix="/home/ec2-user/workspace/pen-mpi.pull_request-v2_PR-12901/install" --disable-silent-rules failed, ABORTING !
+ test -f config.log
+ echo 'config.log content :'
config.log content :
Yeah that really confused me - had me chasing my tail 😗
I notice that the way the CI scripts work, if configure fails at some point, the entire config.log is echoed rather than the script just stopping. This can make the actual configure failure tricky to find in some cases.
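One way the CI script could make the failure easier to spot is to print only the error and warning lines from the log instead of the whole file. A rough sketch follows. The function name `summarize_config_log` is hypothetical and this is not the actual Jenkins script; the pattern is based only on the message formats visible in this thread.

```shell
# summarize_config_log FILE: print only configure error/warning lines.
# Hypothetical helper; matches both "configure: error: ..." (console output)
# and "configure:63847: error: ..." (config.log, with line numbers).
summarize_config_log() {
    log="$1"
    if [ ! -f "$log" ]; then
        echo "no log found at $log"
        return 1
    fi
    echo "configure errors and warnings from $log:"
    # -n prefixes each match with its line number inside the log file.
    grep -n -E 'configure(:[0-9]+)?: (error|WARNING):' "$log" \
        || echo "(no error or warning lines found)"
}
```

The full config.log could still be archived as a build artifact, so nothing is lost by keeping it out of the console output.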
@hppritcha I think what's confusing here is that OMPI's configure somehow continues on after the configure in PMIx generates an error due to seeing an HWLOC version that is below the minimum required. I'm not sure how/why the AC_MSG_ERROR is failing to stop the entire process, yet somehow we continue and go on to the PRRTE configure code.
Looking at the autoconf documentation for that macro, I do see this caution:
The error-description should start with a lower-case letter, and “cannot” is preferred to “can't”.
which we violate on nearly all uses of that macro. It's the only AC_MSG_ macro with that caution - no idea why. Quick test shows that PMIx configure does correctly exit with a non-zero status when HWLOC is too old, so I'm not sure I understand the problem here. Might be worth someone exploring?
It's a problem with the way the jenkins CI build script handles errors. Like I said above, it just starts echoing all the logs rather than simply exiting.
If I run by hand the behavior is as one would expect. configure dies with appropriate error message.
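For comparison, a minimal sketch of a wrapper that stops on the first failure and passes configure's exit status through, which is roughly what happens when running by hand. The name `run_configure` is an assumption for illustration, not something taken from the existing CI scripts.

```shell
# run_configure CMD [ARGS...]: run a command and propagate its exit status.
# Hypothetical wrapper; prints a short message to stderr on failure.
run_configure() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "configure failed with exit status $status, aborting" >&2
    fi
    return "$status"
}
```

A build script would then call it as `run_configure ./configure --prefix="$PREFIX" --disable-silent-rules || exit $?`, with `PREFIX` standing in for the workspace install path.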
also advance pmix sha to 4aea550f6f55 to pick up PR https://github.com/openpmix/openpmix/pull/3414