status-im / nimbus-eth2

Nim implementation of the Ethereum Beacon Chain
https://nimbus.guide

weird macos glitch #6387

Closed tersec closed 3 months ago

tersec commented 3 months ago
[2024-06-24T12:17:54.155Z] Beacon state [Preset: mainnet] ....... (0.23s)
[2024-06-24T12:17:54.155Z] state diff tests [Preset: mainnet] . (0.74s)
[2024-06-24T12:17:54.155Z] Sync committee pool ....... (0.00s)
[2024-06-24T12:17:54.155Z] SyncManager test suite ........................ (1.87s)
[2024-06-24T12:17:54.155Z] Blinded block conversions .... (0.01s)
[2024-06-24T12:17:56.030Z] Validator change pool testing suite ...... (2.94s)
[2024-06-24T12:17:56.030Z] Validator pool ..bash: line 1: 67986 Trace/BPT trap: 5       NIMBUS_TEST_KEYMANAGER_BASE_PORT=$(( 9960 + EXECUTOR_NUMBER * 6 + 0 )) NIMBUS_TEST_SIGNING_NODE_BASE_PORT=$(( 9960 + EXECUTOR_NUMBER * 6 + 4 )) build/${TEST_BINARY} ${PARAMS}
[2024-06-24T12:17:56.030Z] 
[2024-06-24T12:17:56.030Z] all_tests --xml:build/all_tests.xml --console failed; Last 5000 lines from the log:

jakubgs commented 3 months ago

I have found out that the Clang versions on our macOS hosts are not consistent:

 > ansible ci-slave-macos -o -a 'sudo -iu jenkins /opt/homebrew/opt/llvm/bin/clang --version | head -n1'
maci7-01.ms-eu-dublin.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 18.1.4
maci7-02.ms-eu-dublin.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 18.1.4
maci7-03.ms-eu-dublin.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 17.0.6
macm1-01.ih-eu-mda1.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 17.0.6
macm2-01.ih-eu-mda1.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 18.1.4
macm2-01.ih-eu-mda1.ci.release | CHANGED | rc=0 | (stdout) Homebrew clang version 18.1.4
macm2-02.ih-eu-mda1.ci.devel | CHANGED | rc=0 | (stdout) Homebrew clang version 17.0.6

Even though we do pin it as llvm@18 in our Ansible config: https://github.com/status-im/infra-ci/blob/fc984e12/ansible/group_vars/ci-slave-macos.yml#L32

I have fixed it so it should be correct now. But maybe we should enforce some kind of check on the LLVM version?

Or simply use a Nix shell to pin it more effectively.
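
A version check of the kind suggested above could be sketched roughly as follows. This is only an illustration, not something taken from the repo or the CI config: the `REQUIRED_CLANG_MAJOR` variable and the `clang_major` / `check_clang_version` helpers are hypothetical names, and the parsing assumes the first line of `clang --version` looks like the `Homebrew clang version 18.1.4` output shown above.

```shell
#!/bin/sh
# Sketch: fail early when the compiler's major version is not the pinned one.
# REQUIRED_CLANG_MAJOR and both function names are hypothetical.
REQUIRED_CLANG_MAJOR="${REQUIRED_CLANG_MAJOR:-18}"

# Print the major version found in a line such as
# "Homebrew clang version 18.1.4"; prints nothing if the line does not match.
clang_major() {
  printf '%s\n' "$1" | sed -n 's/.*clang version \([0-9][0-9]*\)\..*/\1/p'
}

# Return 0 if the given version line has the required major version,
# otherwise print a message to stderr and return 1.
check_clang_version() {
  _major="$(clang_major "$1")"
  if [ "$_major" != "$REQUIRED_CLANG_MAJOR" ]; then
    echo "expected clang ${REQUIRED_CLANG_MAJOR}, got: ${1:-no output}" >&2
    return 1
  fi
  return 0
}
```

A CI step could then call it before building, for example `check_clang_version "$(/opt/homebrew/opt/llvm/bin/clang --version | head -n1)" || exit 1`, so a host that drifted back to an older Clang fails immediately instead of crashing mid-test.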

jakubgs commented 3 months ago

Appears to work with Clang 18:

image

tersec commented 3 months ago

> Appears to work with Clang 18:

oh if it works with clang 18 just make them all clang 18 and we can be done with it.

don't really want to spend time hunting more clang bugs if they're already fixed in current version

tersec commented 3 months ago

clang 18 does appear to address this, not pursuing any more