ai-cfia / nachet-backend

A Flask-based backend for Nachet that handles Azure endpoint and Azure storage API requests from the frontend.

Implement Inference Caching in the Backend Instead of Frontend #56

Open MaxenceGui opened 5 months ago

MaxenceGui commented 5 months ago

Issue description

The goal is to add a caching capability to the backend that stores the outcome of an inference request made on an image. If users try different models on the same image, each inference should be executed only once: switching back to a model that has already produced an inference on the chosen image should return the outcome saved in the cache.

Steps to Reproduce

  1. Send the same request to the /inf route twice (see the sketch after this list)
  2. The first time, the system returns the inference result from the pipeline
  3. The second time, the system returns the result stored in the cache

Expected Behavior

If the cache does not contain a result for the chosen image and pipeline, the system should invoke the necessary model(s) to produce the inference result. If, on the other hand, the cache holds a result matching both the image and the pipeline, the system should simply return that cached result.

Current Behavior

The model(s) are called on every inference request, since inference results are not cached.

Possible Solution

This issue references this comment:

I'm not exactly sure how it works on the frontend, but there is already caching functionality that keeps track of the results:

```js
scores: [],
classifications: [],
boxes: [],
annotated: false,
imageDims: [],
overlapping: [],
overlappingIndices: [],
```

As of now, a new inference overwrites what is in there. Maybe we could add another parameter so that each model keeps its own scores, classifications, boxes, and everything else related:

```js
model: {
  scores: [],
  ...
}
```

`function loadResultToCache`

For the backend

To implement the same idea in the backend, we would need to keep track of each image passed in an inference request, pair it with the pipeline used, and keep the result when it is returned. Then, if the same image is requested again, we send back the result from the cache.

```mermaid
---
title: Test
---
flowchart TD

image[Send image to inference request]
check(backend checks if the image already exists <br> in the cache and if the pipeline name is the same)
inference[image not in cache <br> or pipeline not called yet]
cache[image in cache <br> and pipeline already called]
inf_res[send back result from inference]
cache_res[send back result stored in cache]

image -- from frontend --> check
check --> inference & cache
inference --> inf_res
cache --> cache_res
```

Originally posted by @MaxenceGui in https://github.com/ai-cfia/nachet-frontend/issues/96#issuecomment-1933001080

Additional Context

Tasks

rngadam commented 5 months ago

There should be some discussion in the spec about the pipeline and the underlying model versions: if a model version in the pipeline is updated, it should invalidate previously cached results. There should perhaps also be a mechanism to invalidate the cache explicitly (from the frontend?).

https://martinfowler.com/bliki/TwoHardThings.html