robur-coop / builder

Scheduling build jobs on regular intervals, collecting artifacts
ISC License

builder-server: ECONNRESET after job finished #23

Closed reynir closed 12 months ago

reynir commented 2 years ago

Judging by the timing of the log messages, the error may have happened during the artifact upload.

Nov 04 11:01:53 spurv builder-server[84472]: job d9b80f9c-3fab-4f0b-b701-136aed596630 scheduled name orb-debian-11, opam orb for 10.0.11.128:41884
Nov 04 11:10:06 spurv builder-server[84472]: job d9b80f9c-3fab-4f0b-b701-136aed596630 finished with exited 0
Nov 04 11:10:06 spurv builder-server[84472]: Fatal error: exception Unix.Unix_error(Unix.ECONNRESET, "read", "")

hannesm commented 2 years ago

My suspicion is that the server tried to schedule a new job on another worker and failed because that worker had disappeared in the meantime. https://github.com/roburio/builder/commit/b969d9f617e1e47713c452d4372922e30c0a8463 reorders the log messages and always prefixes them with the job UUID, which should make future investigation easier.
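
For illustration only, here is a minimal OCaml sketch of how a `read` on a worker connection could turn a reset into a value instead of a fatal exception. It is not builder's actual code: the `read_result` type and the `read_worker` function are hypothetical names, and the choice of which errors count as "peer gone" is an assumption. It only uses `Unix.read` and `Unix.Unix_error` from the standard `unix` library.

```ocaml
(* Hypothetical sketch, not taken from builder. Needs the unix library. *)

(* Outcome of one read attempt on a worker connection. *)
type read_result =
  | Data of string           (* at least one byte was read *)
  | Closed                   (* orderly shutdown: read returned 0 *)
  | Peer_gone of Unix.error  (* the worker vanished mid-connection *)

(* Read once from [fd]. Connection-level errors are returned as values,
   so an uncaught Unix.Unix_error cannot terminate the whole server.
   Any other Unix_error is still raised. *)
let read_worker (fd : Unix.file_descr) : read_result =
  let buf = Bytes.create 4096 in
  match Unix.read fd buf 0 (Bytes.length buf) with
  | 0 -> Closed
  | n -> Data (Bytes.sub_string buf 0 n)
  | exception
      Unix.Unix_error
        ((Unix.ECONNRESET | Unix.EPIPE | Unix.ECONNABORTED) as err, _, _) ->
    Peer_gone err
```

A caller could match on `Peer_gone err`, log `Unix.error_message err` together with the job UUID, and put the job back in the queue rather than letting the process exit.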

hannesm commented 12 months ago

Since we couldn't reproduce this or figure out the cause in the last 2 years, let's close the issue and re-open it with further log information when it happens again.