Closed moyodiallo closed 11 months ago
This is a draft and not clean.
The PR https://github.com/ocurrent/solver-service/pull/41 is no longer needed. The same thing is done here.
Could you add some instructions to the README.md file on how to run this and how it checks for errors? This looks like a good addition for a longer-running stress test.
This PR is a stress test that reproduces the Auto-cancelling bug, which is pushing us to revert Auto-cancelling again in OCaml-CI: https://github.com/ocurrent/ocaml-ci/pull/768. The bug is triggered when two or more consecutive commits are pushed to https://github.com/ocaml/opam-repository less than about 5 minutes apart. The second push causes auto-cancelling in the CI, and all the old requests related to opam-repository are cancelled. However, the solver-service does not cancel a job itself immediately (it only notifies a cancel at the level of the worker connected to the scheduler), which is supposed to happen with the current design of ocluster (submission). As a result, a lot of jobs accumulate in the solver-service: the old jobs take time to finish while the new requests wait.
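To make the accumulation concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not the solver-service or ocluster code: a single worker, a FIFO queue, a fixed solve time per job, and a hypothetical `drop_cancelled` switch that stands in for "the job is actually cancelled when the next push arrives". All names and numbers are made up for illustration.

```python
def wait_for_latest(push_times, jobs_per_push, solve_seconds, drop_cancelled):
    """Seconds the newest push waits before its first job starts.

    Hypothetical single-worker FIFO model (an assumption for illustration,
    not the real solver-service behaviour).
    """
    free_at = 0   # time at which the worker is next free
    pending = 0   # queued jobs from earlier pushes that have not started
    wait = 0
    for now in push_times:
        # Until this push arrives, the worker keeps starting queued jobs.
        while pending and free_at < now:
            free_at += solve_seconds
            pending -= 1
        if drop_cancelled:
            # Cancellation takes effect: drop queued jobs, abort the running one.
            pending = 0
            free_at = now
        # An idle worker cannot start the new jobs before they arrive.
        free_at = max(free_at, now)
        # The new jobs must wait behind every stale job still in the queue.
        wait = free_at + pending * solve_seconds - now
        pending += jobs_per_push
    return wait


# Three pushes one minute apart, 10 jobs per push, 30 s per solve.
stale_kept = wait_for_latest([0, 60, 120], 10, 30, drop_cancelled=False)
stale_dropped = wait_for_latest([0, 60, 120], 10, 30, drop_cancelled=True)
print(stale_kept, stale_dropped)
```

In this model the third push waits 480 seconds when the stale jobs keep running and 0 seconds when they are dropped. The wait grows with every additional push that arrives before the queue drains, which matches the backlog described above.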