phetsims / perennial

Maintenance tools that won't change with different versions of chipper checked out
MIT License

Persist Queue after restarts #303

Closed mattpen closed 1 year ago

mattpen commented 1 year ago

Occasionally, the build-server process will crash and restart. When this happens, the current build queue is lost. This is very disruptive for maintenance releases. We should find a way to persist the queue so it can resume builds after recovering from a fatal error.

samreid commented 1 year ago

These commits have lint errors that are interfering with https://github.com/phetsims/beers-law-lab/issues/300. We recommend enabling precommit hooks to prevent pushing inadvertent lint errors that disrupt other work-in-progress.

mattpen commented 1 year ago

I added an improvement to the build server that persists the queue to storage. Any time a task is enqueued or dequeued, the build server updates the file perennial/.build-server-queue with that change. On startup, it reads that file and enqueues its entire contents.
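For readers who want a concrete picture of the approach, here is a minimal sketch of a queue that rewrites a file on every enqueue and dequeue and reloads it on startup. It is an illustration in Python, not the build server's actual code: the `PersistentQueue` class, the JSON file format, and the write-then-rename step are all assumptions made for this example.

```python
import json
import os
import tempfile


class PersistentQueue:
    """FIFO queue that saves its contents to a file after every change.

    Hypothetical illustration only; the JSON format is an assumption.
    """

    def __init__(self, path):
        self.path = path
        self.tasks = []
        # On startup, reload whatever was saved before the last exit or crash.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.tasks = json.load(f)

    def _save(self):
        # Write to a temporary file in the same directory, then rename it
        # over the queue file, so a crash mid-write cannot truncate the queue.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.tasks, f)
        os.replace(tmp_path, self.path)

    def enqueue(self, task):
        self.tasks.append(task)
        self._save()

    def dequeue(self):
        # Raises IndexError if the queue is empty.
        task = self.tasks.pop(0)
        self._save()
        return task

    def __len__(self):
        return len(self.tasks)
```

Constructing a second `PersistentQueue` on the same path stands in for a process restart: it starts with whatever tasks were still queued when the first instance last saved.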

This queue will not restart or retry builds that fail and do not complete, so those will need to be rerun manually. It seems like it would be pretty futile to just retry a failed build, since most of the time some developer intervention is required to correct a problem before a build will succeed.

I tested this change by deploying several builds of chains to ox-dev, then interrupting the build-server and restarting it. It seems to be working well in that use case, so I deployed it to production today.