biigle / maia

BIIGLE module for the Machine Learning Assisted Image Annotation method
GNU General Public License v3.0

Generate feature vectors without delay on busy queues #130

Closed mzur closed 11 months ago

mzur commented 11 months ago

Currently, the "generate feature vector" jobs are submitted independently of and after novelty/object detection. If only a single GPU queue is available, this could mean that another MAIA job squeezes in between and the feature vectors are only generated hours or days later.

Instead, the feature vectors should be generated immediately after the detection finishes, even if another job was submitted in the meantime. This would also speed up the process because all the images are still cached.

This can probably be implemented with Laravel's job chains.
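As a rough sketch of how such a chain could look with Laravel's `Bus::chain` (the job class names here are hypothetical, not the actual MAIA classes):

```php
use Illuminate\Support\Facades\Bus;

// Each job in the chain is only dispatched after the previous one
// completed successfully. Hypothetical job names for illustration.
Bus::chain([
    new NoveltyDetection($maiaJob),
    new GenerateFeatureVectors($maiaJob),
])->onQueue('gpu')->dispatch();
```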

mzur commented 11 months ago

Here is the plan: The current implementation of "request" and "response" jobs is outdated. Generating the training proposals or annotation candidates, which is currently done in the "response" job, can also be done in the "request" job (which could then simply be called the novelty detection or object detection job).

With a single "detect" job that creates the database models, a job chain can be implemented.

If the generate feature vector job is submitted in the chain on the same queue (connection), then it should run before any subsequent jobs on that queue (connection). The whole job chain probably has to be submitted to the same (GPU) queue, even though the "prepare" jobs don't need a GPU, but I can accept that. (Edit: The docs say that different queues are possible by calling onQueue() and onConnection() in the constructor.)
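For reference, setting the queue or connection per job inside a chain might look like this (a minimal sketch assuming a hypothetical `GenerateFeatureVectors` job class; the queue and connection names are placeholders):

```php
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class GenerateFeatureVectors implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(MaiaJob $job)
    {
        // A job in a chain can target its own queue/connection
        // by setting them in the constructor.
        $this->onQueue('gpu');
        $this->onConnection('gpu-connection');
    }
}
```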

This can probably replace the workaround with the MaiaJobContinued event that is currently used.

mzur commented 11 months ago

While job chaining was implemented in #136, it does not have the desired effect. Jobs in the chain are newly submitted after the previous job finished (I though all were submitted at the same time and "waiting"). This still leaves the opportunity for other jobs to "squeeze in" in the meantime. Reopening to find another solution.

mzur commented 11 months ago

Two ideas:

  1. Support different queues (gpu-high, gpu) and push the generate feature vector job to the gpu-high queue which is processed first. Pros: Jobs with different concerns are still separated. Cons: Needs correct configuration of the instance to work correctly, images might have to be cached again.

  2. Generate the feature vectors directly in the "detect" job. Pros: Cached images are reused, no extra configuration required. Cons: The detect jobs get really bloated.
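For idea 1, a single Laravel queue worker can process multiple queues in priority order by listing them comma-separated, so `gpu-high` would always be drained before `gpu` (queue names as proposed above):

```shell
# The worker processes all jobs on gpu-high before touching gpu.
php artisan queue:work --queue=gpu-high,gpu
```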

mzur commented 11 months ago

I think I prefer idea 1. The default queue names could be configured to work out of the box (i.e. the feature vector jobs use the same queue as the detect jobs, which may result in this issue). If instance admins want to tweak this, they can configure their queue workers and the queue for the feature vector jobs themselves.
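Such a default could be expressed in the module config along these lines (illustrative keys and env variable names, not the actual MAIA config file):

```php
// Illustrative config fragment: the feature vector queue falls back
// to the detect job queue, so the default works out of the box.
'job_queue' => env('MAIA_JOB_QUEUE', 'gpu'),
'feature_vector_queue' => env(
    'MAIA_FEATURE_VECTOR_QUEUE',
    env('MAIA_JOB_QUEUE', 'gpu')
),
```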