ocurrent / solver-service

An OCluster service for solving opam dependencies
Apache License 2.0

Add a cache system #75

Open moyodiallo opened 9 months ago

moyodiallo commented 9 months ago

The response to a request is cached. The cached entry is invalidated whenever one of the packages involved in the new request changed between the old (cached) opam commit and the new one. Whenever the new response is the same as the cached one, we keep the old entry, so it stays associated with the oldest opam commit.

talex5 commented 9 months ago

This is too large for me to review (and seems to include a load of unnecessary reformatting).

Could you say something about how it works and why it's correct?

I would expect it to work like this:

I see all kinds of code here shelling out to git diff, git pull, git log, etc, which seems odd.

Note: Putting the cache on the worker means it might not work so well with a cluster, since it's less likely the worker handling a request also handled the previous one. An alternative would be to send all the hashes back to ocaml-ci and let it do the check in one place. However, that puts more load on the single CI instance, so it might not be a good trade-off.

moyodiallo commented 9 months ago

Could you say something about how it works and why it's correct?

  • The first time a request is solved, the result is cached under 2 keys: the hash of the request, and the hash of the request without the opam repository commits (the first uses a (url,commit) list and the second a (url,_) list).
  • When a request is sent to the solver-service, it is hashed and looked up in the cache. If it is found, the cached result is returned; if not, it goes to the next step.
  • If the request hash (first key) isn't in the cache, the request is hashed without the opam-repository commits (the second key) to find out whether there's an old solve for that request. If there is, it tries to invalidate the cache with the new opam-repository commits (when the commits of the request and the cache differ). If the cache is invalidated, it solves again. If the result is the same as the old one, the old one is kept to preserve the old opam-repository commits.
  • A cache entry is invalidated when the packages obtained from a diff between the new and old commits of an opam-repository are involved in the request; the transitive dependencies of that request are also taken into account.
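The lookup described in the steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the PR: `SolveCache`, `full_key`, `loose_key`, and the `solve` and `is_invalidated` callbacks are hypothetical names, and a request is assumed to be a JSON-serialisable value with its commits given as a list of (url, commit) pairs.

```python
import hashlib
import json


def full_key(request, commits):
    """First key: hash of the request together with its (url, commit) list."""
    payload = json.dumps([request, sorted(commits)], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def loose_key(request, commits):
    """Second key: same hash with the commits dropped, i.e. a (url, _) list."""
    urls = sorted(url for url, _ in commits)
    payload = json.dumps([request, urls], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


class SolveCache:
    def __init__(self, solve, is_invalidated):
        # Both callbacks are assumptions made for this sketch:
        #   solve(request, commits) -> response
        #   is_invalidated(request, old_commits, new_commits) -> bool
        self.solve = solve
        self.is_invalidated = is_invalidated
        self.by_full = {}   # full key  -> (commits, response)
        self.by_loose = {}  # loose key -> (commits, response)

    def get(self, request, commits):
        fk, lk = full_key(request, commits), loose_key(request, commits)
        if fk in self.by_full:
            # Exact hit on the first key.
            return self.by_full[fk]
        old = self.by_loose.get(lk)
        if old is not None:
            old_commits, old_response = old
            if not self.is_invalidated(request, old_commits, commits):
                # Nothing relevant changed: reuse the old solve.
                return old
            response = self.solve(request, commits)
            if response == old_response:
                # Same answer: keep the entry tied to the older commits.
                self.by_full[fk] = old
                return old
        else:
            response = self.solve(request, commits)
        # First solve, or a genuinely different result: record it under both keys.
        entry = (commits, response)
        self.by_full[fk] = entry
        self.by_loose[lk] = entry
        return entry
```

Each entry carries the commits it was solved against, so a caller can see that an unchanged response still refers to the older opam-repository commits.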

With that, we only re-solve a request if its packages are involved in the changes between the opam-repository commits:
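As a rough illustration of that check (a sketch under assumptions, not the PR's implementation): suppose the packages touched between the two commits are already known as a set of names, and the dependency information is available as a mapping from package name to dependency names. `touches_request`, `transitive_deps`, and the `deps` mapping are hypothetical; versions, optional dependencies, and filters are ignored here.

```python
def transitive_deps(roots, deps):
    """All packages reachable from `roots` through `deps` (roots included)."""
    seen, stack = set(), list(roots)
    while stack:
        pkg = stack.pop()
        if pkg not in seen:
            seen.add(pkg)
            stack.extend(deps.get(pkg, ()))
    return seen


def touches_request(request_packages, deps, changed_packages):
    """True if a package changed between the commits is relevant to the request."""
    relevant = transitive_deps(request_packages, deps)
    return not relevant.isdisjoint(changed_packages)
```

For example, with `deps = {"app": ["lwt"], "lwt": ["dune"]}`, a change to `dune` would trigger a re-solve of a request for `app`, while a change to an unrelated package would not.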

moyodiallo commented 9 months ago

Note: Putting the cache on the worker means it might not work so well with a cluster, since it's less likely the worker handling a request also handled the previous one. An alternative would be to send all the hashes back to ocaml-ci and let it do the check in one place. However, that puts more load on the single CI instance, so it might not be a good trade-off.

It could be interesting to have a node that distributes the requests among the solver-workers. Could we have that kind of config with the current structure (ocluster, ocluster-workers)?
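To make the idea concrete, one way such a node could choose a worker is to route on the commit-independent hash of the request, so that the same request tends to reach the worker that already holds its cache entry. This is only a sketch of the idea; it is not an existing OCluster feature, and `pick_worker` and the worker names are hypothetical.

```python
import hashlib


def pick_worker(loose_key, workers):
    """Map a commit-independent request key to one worker, deterministically."""
    digest = hashlib.sha256(loose_key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]
```

With a plain modulo, adding or removing a worker remaps most keys; a consistent-hashing scheme would reduce that, at the cost of more machinery.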

moyodiallo commented 9 months ago

solve-cache (set/get) only records the results coming from the CI; it doesn't solve.

I think this could solve the cases: