ocurrent / solver-service

An OCluster service for solving opam dependencies
Apache License 2.0

Pipeline stress auto cancel #52

Closed moyodiallo closed 11 months ago

moyodiallo commented 1 year ago

This PR adds a stress test that reproduces the auto-cancelling bug, which is pushing us to revert auto-cancelling again on OCaml-CI: https://github.com/ocurrent/ocaml-ci/pull/768.

The bug is triggered when two or more consecutive commits are pushed to https://github.com/ocaml/opam-repository less than about 5 minutes apart. The second push triggers auto-cancelling in the CI, and all the old requests related to opam-repository are cancelled. However, the solver-service does not cancel a job itself immediately: it only notifies the cancellation at the level of the worker connected to the scheduler. This is the expected behaviour with the current design of OCluster (submission), and it results in a lot of jobs accumulating in the solver-service. The old jobs take time to finish while the new requests wait.
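To make the accumulation concrete, here is a minimal toy model of the queueing effect. It is only a sketch under stated assumptions, not the solver-service or OCluster code: a single solver drains a FIFO queue, every solve takes the same time, and the names `wait_for_new_request`, `old_jobs`, `solve_time` and `cancel_propagates` are hypothetical.

```python
from collections import deque


def wait_for_new_request(old_jobs, solve_time, cancel_propagates):
    """Return how long a new request waits behind already-cancelled jobs.

    Hypothetical model: one solver, FIFO queue, constant solve time.
    It does not reflect the real scheduling inside solver-service.
    """
    # Every old job was cancelled by the second push.
    queue = deque({"id": i, "cancelled": True} for i in range(old_jobs))
    # The request created by the second push sits behind them.
    queue.append({"id": "new", "cancelled": False})

    clock = 0
    while queue:
        job = queue.popleft()
        if job["id"] == "new":
            return clock
        if job["cancelled"] and cancel_propagates:
            # Cancellation reached the solver: drop the job at no cost.
            continue
        # Cancellation did not reach the solver: the stale job still runs.
        clock += solve_time
    return clock


if __name__ == "__main__":
    # 50 stale jobs, 10 time units each.
    print(wait_for_new_request(50, 10, cancel_propagates=False))  # 500
    print(wait_for_new_request(50, 10, cancel_propagates=True))   # 0
```

In this model the wait grows linearly with the number of stale jobs when the cancellation stops at the worker level, which matches the symptom described above.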

moyodiallo commented 1 year ago

This is a draft and has not been cleaned up yet.

moyodiallo commented 1 year ago

The PR https://github.com/ocurrent/solver-service/pull/41 is no longer needed. The same thing is done here.

tmcgilchrist commented 1 year ago

Could you add some instructions on how to run this into the README.md file and how it checks for errors? This looks like a good addition for a longer running stress test.