Open CalvinDSeamons opened 4 years ago
Update: 0013259 appears to be the test that is hanging.
I found out that the hang was caused by one of the test_run's configs keeping a lock file (there was a `config.lockfile` in its directory).
But other than that, all of the test_run statuses still end up in the `BUILD_CREATED` state:
2020-10-21T11:21:24.852975 RESULTS Parsing 6 result types.
2020-10-21T11:21:24.853997 RESULTS Performing 0 result evaluations.
2020-10-21T11:21:26.182685 COMPLETE The test completed with result: PASS
2020-10-21T15:52:09.327888 BUILD_CREATED Builder created.
2020-10-21T15:58:54.338062 BUILD_CREATED Builder created.
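For anyone chasing the leftover lock file, here is a rough sketch of how stale lock files could be located. It is not part of Pavilion: the function name `find_stale_lockfiles`, the one-hour threshold, and the assumption that lock files end in `.lockfile` (based only on the `config.lockfile` seen above) are all hypothetical.

```python
import time
from pathlib import Path


def find_stale_lockfiles(root, max_age_seconds=3600, pattern="*.lockfile", now=None):
    """Return (path, age_seconds) pairs for lock files older than max_age_seconds.

    Assumption: lock files match '*.lockfile', as with the config.lockfile
    mentioned in this issue. The threshold is arbitrary.
    """
    now = time.time() if now is None else now
    stale = []
    for path in Path(root).rglob(pattern):
        try:
            age = now - path.stat().st_mtime
        except FileNotFoundError:
            # The lock was released between listing and stat; skip it.
            continue
        if age > max_age_seconds:
            stale.append((path, age))
    # Oldest lock files first.
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

Pointing `root` at the working_dir should list any lock file that has sat untouched for longer than the threshold. Whether such a file is actually abandoned still has to be checked by hand.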
This was another test I thought I'd throw into the issue:
2020-10-21T11:21:08.484894 CREATED Created status file.
2020-10-21T11:21:08.485955 CREATED Test directory and status file created.
2020-10-21T11:21:08.490230 BUILD_CREATED Builder created.
2020-10-21T11:21:08.493727 CREATED Test directory setup complete.
2020-10-21T11:21:16.778993 BUILD_REUSED Test 171aceb2e5e39623 run 13242 reusing build.
2020-10-21T11:21:23.182369 SCHEDULED Test slurm has job ID 3814891.
2020-10-21T11:47:35.252408 PREPPING_RUN Converting run template into run script.
2020-10-21T11:47:35.255769 RUNNING Starting the run script.
2020-10-21T11:47:35.261001 RUNNING Currently running.
2020-10-21T11:47:35.282176 RUN_DONE Test run has completed.
2020-10-21T11:47:35.289546 RESULTS Parsing 6 result types.
2020-10-21T11:47:35.292427 RESULTS Performing 0 result evaluations.
2020-10-21T11:47:35.308790 COMPLETE The test completed with result: PASS
2020-10-21T12:02:03.342975 BUILD_CREATED Builder created.
2020-10-21T15:32:14.386945 BUILD_CREATED Builder created.
2020-10-21T15:33:05.268400 BUILD_CREATED Builder created.
2020-10-21T15:34:35.869780 BUILD_CREATED Builder created.
2020-10-21T15:36:40.955206 BUILD_CREATED Builder created.
2020-10-21T15:37:49.399664 BUILD_CREATED Builder created.
2020-10-21T15:39:05.718075 BUILD_CREATED Builder created.
2020-10-21T15:40:00.666830 BUILD_CREATED Builder created.
2020-10-21T15:45:19.124162 BUILD_CREATED Builder created.
2020-10-21T15:48:21.269450 BUILD_CREATED Builder created.
2020-10-21T15:48:53.542587 BUILD_CREATED Builder created.
2020-10-21T15:49:45.782932 BUILD_CREATED Builder created.
2020-10-21T15:50:25.110590 BUILD_CREATED Builder created.
2020-10-21T15:52:09.117636 BUILD_CREATED Builder created.
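To spot this pattern across many tests, something like the following sketch could flag status files that gain entries after `COMPLETE`. It assumes each status line is `timestamp STATE note`, as in the output above; the function names are hypothetical and this is not Pavilion's own status parser.

```python
from datetime import datetime

# A few lines copied from the status file above.
SAMPLE = """\
2020-10-21T11:47:35.292427 RESULTS Performing 0 result evaluations.
2020-10-21T11:47:35.308790 COMPLETE The test completed with result: PASS
2020-10-21T12:02:03.342975 BUILD_CREATED Builder created.
2020-10-21T15:32:14.386945 BUILD_CREATED Builder created.
"""


def parse_status_lines(lines):
    """Split each line into (timestamp, state, note); skip blank or short lines."""
    entries = []
    for line in lines:
        parts = line.strip().split(maxsplit=2)
        if len(parts) < 2:
            continue
        when = datetime.fromisoformat(parts[0])
        note = parts[2] if len(parts) > 2 else ""
        entries.append((when, parts[1], note))
    return entries


def entries_after_complete(entries):
    """Return every entry recorded after the first COMPLETE state, if any."""
    for index, (_, state, _) in enumerate(entries):
        if state == "COMPLETE":
            return entries[index + 1:]
    return []


if __name__ == "__main__":
    for when, state, note in entries_after_complete(parse_status_lines(SAMPLE.splitlines())):
        print(when.isoformat(), state, note)
```

Any test whose status file yields a non-empty list here has had its final state overwritten after it completed, which is the symptom described in this issue.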
I ran into a very strange bug while testing snow today. This happened on the Yellow front end, where there are approximately 11,000 more tests sitting in the working_dir than on the Turquoise. I mention it because that seems to be the only notable difference.
After launching my tests I ran `watch pav status`. Upon the loading of the status table (which only took 3-5 seconds) a few license-tests had already completed with `PASS` and the rest were all `SCHEDULED`; everything seemed fine. 20ish minutes later, after everything had finished, my `watch pav status` (which updates every 3 seconds) showed everything as fine, `PASS`. I quit out (don't ask me why) and ran just `pav status` so I could copy the contents into the ticket. The command hung, as did `pav result` or any permutation of `pav log build/run $series/$id`, etc. I even logged into snow from a different terminal session, loaded pavilion/2.0 and could not access the test run. Upon using the `cat` command I received the following status file from one of the tests that I had observed passing:
The `PASS` is what I observed inside `watch pav status`. When I exited `watch pav status` the test status changed to `BUILD_CREATED` and was unreachable from `pav status`.
I thought I'd make a note of it as @kjeverson could also not access anything through `pav status`. I was able to fix this by using `scancel -u $user; pav cancel --all; module unload` and reran my test. To whoever wants to investigate this further: `s377` still hangs when called and can be poked at in the yellow.