stackhpc / ansible-role-openhpc

Ansible role for OpenHPC
Apache License 2.0

CI Workaround: Failed to parse bus message: Invalid argument #110

Closed: jovial closed this 3 years ago

jovial commented 3 years ago

For rationale, please see comment in change.

jovial commented 3 years ago

Clarification: It is the systemd in the containers that is too old to handle the capabilities present in the host kernel.

jovial commented 3 years ago

CI failure was:

  fatal: [testohpc-grp2-0]: FAILED! => {"changed": false, "msg": "Failure downloading http://repos.openhpc.community/OpenHPC/2/CentOS_8/x86_64/ohpc-release-2-1.el8.x86_64.rpm, Request failed: <urlopen error timed out>"}

Re-running jobs.

jovial commented 3 years ago

Hmm, hit pip download issues this time:

pip._vendor.urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

and a genuine failure on test12:

  TASK [assert] ******************************************************************
  fatal: [testohpc-login-0]: FAILED! => {
      "assertion": "(jobid + '|0|wrap|compute|2|testohpc-compute-[0-1]|COMPLETED') in sacct.stdout",
      "changed": false,
      "evaluated_to": false,
      "msg": "Didn't find expected output for 2 in sacct output: "
  }
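
For anyone reading the assertion above: it is a plain substring test on the sacct output, so the brackets in testohpc-compute-[0-1] are matched literally rather than as a pattern. The message ends right after "sacct output: ", which would be consistent with sacct having returned nothing at that point. A minimal Python sketch of the same check, assuming job ID 2 (taken from the message); the function name and the sample sacct line are hypothetical, for illustration only:

```python
def sacct_has_record(jobid: str, sacct_stdout: str) -> bool:
    """Mirror the assertion: look for the expected record as a literal substring."""
    # The bracketed node list is compared character for character,
    # not expanded or treated as a regular expression.
    expected = jobid + "|0|wrap|compute|2|testohpc-compute-[0-1]|COMPLETED"
    return expected in sacct_stdout


# Hypothetical inputs: an empty result, as the failure message suggests,
# and a made-up line in the format the assertion expects.
empty_output = ""
complete_output = "2|0|wrap|compute|2|testohpc-compute-[0-1]|COMPLETED\n"

print(sacct_has_record("2", empty_output))     # False
print(sacct_has_record("2", complete_output))  # True
```

With empty output the check fails whatever the job's eventual state, so a later run that does see the record would pass unchanged.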

sjpb commented 3 years ago

Weird test12 has passed now then?

jovial commented 3 years ago

Weird test12 has passed now then?

Yep, I guess we just got unlucky on that run!