slight re-design/improvement

Current limitations

Requests that use different lists of opam repositories cannot be solved at the same time.
A job cannot be cancelled: it remains until the end.
There is no caching system, so the same request is resolved again and again.

Improvement

graph LR
    A[Opam commits\nupdate\nlock] --->|Request| B[Cache\nof\nResponse]
    B[Cache\nof\nResponse] --->|Response| A[Opam commits\nupdate\nlock]

    C[Verify\nto\nresolve] --->|Response| B[Cache\nof\nResponse]
    B[Cache\nof\nResponse] --->|Request| C[Verify\nto\nresolve]

    C[Verify\nto\nresolve] -->|Request| D[Distribute\nby\nplatform\nto\nworkers]
    C[Verify\nto\nresolve] --> GG[Get the names of\nthe opam urls packages]
    GG[Get the names of\nthe opam urls packages] --> D[Distribute\nby\nplatform\nto\nworkers]
    D[Distribute\nby\nplatform\nto\nworkers] -->|Response| C[Verify\nto\nresolve]

    D --> H[wker1]
    D --> E[wker2]
    D --> F[wker3]
    D -->.......
    D --> G[wker n]

stages

The first stage uses a lock to update all the opam URL commits in the request whenever a new commit hash is available.
The second stage tries to get a result from the cache. Every new response to a request is stored there.
The third stage is important for the efficiency of the solver-service. It verifies whether the changes in the opam URL commits affect the dependencies in the request. For example, if there is a new opam-repository hash and it brings changes that are unrelated to the dependencies of the request, resolving is unnecessary when a response is already in the cache. This also removes the step of looking for the oldest commit after a resolve, which exists to avoid invalidating the build cache in a CI pipeline. To be a little more concrete, this command will be used to find out whether the changes involve a package: git -C {clone-path of opam-repo} diff {old-hash} {new-hash} -- packages/{name}. To be more efficient, this stage can have its own cache as well.
The last stage gets all the opam URL packages. When the request is distributed to the workers, those packages are sent to the workers as well. The packages are sent by the commit hash of their opam URL in order to make caching easier. A mechanism could be added to manage the workers, so that when a job (request) is cancelled, its process is killed at that time.
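To make the second and third stages more concrete, here is a minimal sketch in Python. It is not the solver-service implementation, only an illustration of the idea under some assumptions: the cache is an in-memory dictionary, a request is a JSON-serialisable dictionary, and the package names relevant to a request are stored next to the cached response. The names ResponseCache, diff_command, packages_changed and run_git are hypothetical. The only git call used is git diff --name-only with a pathspec after --.

```python
import hashlib
import json
import subprocess
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A runner takes a git argv and returns its stdout (injectable for testing).
Runner = Callable[[List[str]], str]


def diff_command(clone_path: str, old_hash: str, new_hash: str,
                 package_names: Sequence[str]) -> List[str]:
    """Build the git command that lists changed files under packages/{name}."""
    paths = [f"packages/{name}" for name in package_names]
    return ["git", "-C", clone_path, "diff", "--name-only",
            old_hash, new_hash, "--", *paths]


def run_git(argv: List[str]) -> str:
    """Default runner: execute git and return what it printed."""
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout


def packages_changed(clone_path: str, old_hash: str, new_hash: str,
                     package_names: Sequence[str],
                     run: Runner = run_git) -> bool:
    """True if any of the given packages differ between the two commits."""
    if old_hash == new_hash or not package_names:
        # Nothing to compare; an empty pathspec would diff the whole repo.
        return False
    output = run(diff_command(clone_path, old_hash, new_hash, package_names))
    return bool(output.strip())


class ResponseCache:
    """Hypothetical in-memory cache: request -> (commit, packages, response)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, List[str], str]] = {}

    @staticmethod
    def key(request: dict) -> str:
        # Assumption: the request can be serialised to JSON deterministically.
        blob = json.dumps(request, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def store(self, request: dict, commit: str,
              package_names: Sequence[str], response: str) -> None:
        self._entries[self.key(request)] = (commit, list(package_names), response)

    def lookup(self, request: dict, clone_path: str, new_hash: str,
               run: Runner = run_git) -> Optional[str]:
        """Return the cached response unless a relevant package changed."""
        entry = self._entries.get(self.key(request))
        if entry is None:
            return None  # never solved: a resolve is needed
        old_hash, package_names, response = entry
        if packages_changed(clone_path, old_hash, new_hash, package_names, run):
            return None  # a dependency changed: a resolve is needed
        return response
```

A caller would try lookup first and only send the request to the workers when it returns None, then call store with the new response. Which package names have to be checked for a given request is left open here; the sketch simply assumes they are known when the response is stored.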

cancelling a job

When distributing work to the workers, it is possible to manage their state. A worker can be in one of the following states:

Waiting (waiting for work)
Cancelled (the job/request that gave the worker some work to do was cancelled, which kills the process behind the worker)
Running
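The three states can be sketched as a small state machine. This is only an illustration in Python, not the solver-service code: the Worker class, its method names and the reset step that returns a cancelled worker to Waiting are assumptions made for the example. The process behind the worker is a plain subprocess here.

```python
import enum
import subprocess
import sys
from typing import List, Optional


class State(enum.Enum):
    WAITING = "waiting"      # waiting for work
    RUNNING = "running"      # a process is working on a job
    CANCELLED = "cancelled"  # the job was cancelled and its process killed


class Worker:
    """Hypothetical wrapper around one internal-worker process."""

    def __init__(self) -> None:
        self.state = State.WAITING
        self.job_id: Optional[str] = None
        self._proc: Optional[subprocess.Popen] = None

    def assign(self, job_id: str, argv: List[str]) -> None:
        """Start a process for the job; only allowed while waiting."""
        if self.state is not State.WAITING:
            raise RuntimeError("worker is not waiting for work")
        self._proc = subprocess.Popen(argv)
        self.job_id = job_id
        self.state = State.RUNNING

    def cancel(self, job_id: str) -> bool:
        """Kill the process if this worker is running the cancelled job."""
        if self.state is not State.RUNNING or self.job_id != job_id:
            return False
        assert self._proc is not None
        self._proc.kill()
        self._proc.wait()  # reap the process so it does not linger
        self.state = State.CANCELLED
        return True

    def reset(self) -> None:
        """Assumption: a cancelled worker is put back in the waiting pool."""
        if self.state is State.CANCELLED:
            self._proc = None
            self.job_id = None
            self.state = State.WAITING


if __name__ == "__main__":
    worker = Worker()
    # A long-running placeholder job standing in for a solve.
    worker.assign("job-1", [sys.executable, "-c", "import time; time.sleep(60)"])
    worker.cancel("job-1")
    print(worker.state)
```

Checking the job identifier in cancel means that a late cancellation for an old job does not kill a process that has since been given different work.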

This is the plan:

[x] Solve the issue with cancelling a job (kill the internal-worker process as necessary).
- [x] Cancel a job when its switch is off. https://github.com/ocurrent/solver-service/pull/54
[ ] Support different lists of different types of opam-repository.
- [ ] Replace the epoch-lock with an opam-repository update lock.
- [ ] Use a map for the different lists of opam-repositories in the internal-workers.
- [ ] A Sqlite cache of opam-repositories could be used by the controller of the internal-workers.
[ ] Cache responses and avoid resolving again.
- [ ] Build or rebuild as needed.

EDITED: to add the plan.

ocurrent / solver-service

Discussion around the improvement of solver-service #53