Closed jgfouca closed 2 years ago
Marking the triage for what's going on with lassen as completed, since I think we have a pretty good idea of the issue now.
Re: gcc 8 toolchain. I ran test-all on PR #1260 (which exhibited an ICE during AT testing on weaver), and was able to reproduce the ICE. I then merged the gcc8 branch into it, and successfully built.
@jgfouca do you want to pull the trigger on switching to gcc8 on weaver? PR #1282 brought in a fix in Homme that was needed to avoid a separate (and much less likely to strike us again) ICE in Homme, so now we could do a PR with just `machine_specs.py` and `weaver.cmake` changes. I can package the PR in 5 minutes.
@bartgol yes, please upgrade weaver to gcc8
All tasks complete. Testing is in much better shape (before mappy maintenance happened). Thanks everyone!
The dashboard has looked pretty bad for a while. I'd like to make a task list to get us back on track. Please see the list below for the problems, with checkboxes for potential solutions. I'd like volunteers for all checkboxes.
`valgrind test-launcher -e <exec>` to `test-launcher -e valgrind <exec>` (before, we were valgrind-ing test-launcher rather than our execs). So I think the current 90-100 min build times are likely to stay. If that's too much, we can probably add a 'DEV' test profile, where we run just 1 time step (SHORT runs 2 time steps), which might cut our long tests' runtime by 40-50%.
Error: Remote JSM server is not responding on host lassen710
10-25-2021 00:57:53:454 74949 main: Error initializing RM connection. Exiting.
There also appear to be issues with `taskset`: `failed to set pid 0's affinity: Invalid argument`