robur-coop / builder

Scheduling build jobs on regular intervals, collecting artifacts
ISC License

Drop platform: advice to shutdown workers #39

Closed (reynir closed 9 months ago)

reynir commented 10 months ago

When dropping a platform it is best to first disable the relevant workers. Otherwise the workers will eventually recreate the platform.

Fixes #38

hannesm commented 9 months ago

So, what are the overall expectations?

But there's no persistence of dropped platforms (should there be some?). So, a restart of the builder-server and a connecting builder-worker will execute such jobs again -- is this fine? Do we want to persist the list of dropped platforms? (I think I'm in favour thereof.) Also, I'd like find_job to figure out that the platform is dropped (and close the worker connection).

reynir commented 9 months ago

It's not clear to me what a good way forward is. It is not nice to leave workers hanging. Closing the connection or terminating the worker would be nice, except that we usually put our workers in an infinite loop that involves installing a lot of packages, so that is not desirable.

Maybe we could persist dropped platforms and refuse to drop a platform that still has workers connected, thereby forcing a flow where the workers are shut down first. Then we still have the corner case where a worker is "rebooting" while the platform is dropped. Hmm.

hannesm commented 9 months ago

Oh, another path is server-only: persist the dropped platforms, and just don't ever give such a worker a job. Thus the worker will be there and alive, but not receiving anything. IMHO that would be fine.
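To make the server-only idea concrete, here is a minimal sketch, not the code from this PR. It assumes the server keeps a set of dropped platform names next to its job queue. The types `job` and `state`, the helpers `drop`, `to_string` and `of_string`, and the signature given to `find_job` are all hypothetical, as is the platform name in the usage note.

```ocaml
(* Hypothetical sketch: these types and signatures are assumptions,
   not taken from the builder sources. *)
module Platforms = Set.Make (String)

type job = { name : string; platform : string }

type state = {
  dropped : Platforms.t; (* platforms the operator has dropped *)
  queue : job list;      (* pending jobs, oldest first *)
}

(* Dropping a platform only records it in the set. *)
let drop platform st = { st with dropped = Platforms.add platform st.dropped }

(* Serialise the set, one platform per line, so it can be written to disk
   and survive a server restart. *)
let to_string st = String.concat "\n" (Platforms.elements st.dropped)

(* Parse the serialised form back into a set, ignoring empty lines. *)
let of_string s =
  String.split_on_char '\n' s
  |> List.filter (fun line -> line <> "")
  |> Platforms.of_list

(* A worker on a dropped platform never receives a job; it stays
   connected and idle. Otherwise return the oldest matching job. *)
let find_job st ~platform =
  if Platforms.mem platform st.dropped then None
  else List.find_opt (fun j -> j.platform = platform) st.queue
```

With this shape, `find_job (drop "freebsd-13" st) ~platform:"freebsd-13"` returns `None` regardless of what is queued, and restoring the set with `of_string` after a restart keeps it that way.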

reynir commented 9 months ago

I added a commit to persist dropped platforms. Then I realized we might want to be able to "undrop" a platform. I have not yet added the command in the client, and I need to bump a version number somewhere. I think the logic is not so easy to follow now.

hannesm commented 9 months ago

Which logic is not so easy to follow? The list of dropped platforms should also be inspectable (i.e. printed in the info output or such).
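As a rough illustration of what "inspectable" plus "undrop" could look like, assuming the dropped platforms are held in a string set: `pp_dropped` and `undrop` below are hypothetical helpers, and the output format is made up rather than the actual info output of the builder client.

```ocaml
(* Hypothetical sketch: helper names and output format are assumptions. *)
module Platforms = Set.Make (String)

(* Undropping simply removes the platform from the set again. *)
let undrop platform dropped = Platforms.remove platform dropped

(* Render the dropped set as one line suitable for an info-style listing. *)
let pp_dropped ppf dropped =
  if Platforms.is_empty dropped then
    Format.fprintf ppf "dropped platforms: (none)"
  else
    Format.fprintf ppf "dropped platforms: %s"
      (String.concat ", " (Platforms.elements dropped))
```

Printing the set as part of the info output would let an operator check whether an idle worker is idle because its platform was dropped.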

hannesm commented 9 months ago

Fine to merge when CI is happy.