ai-cfia / nachet-backend

A Flask-based backend for Nachet that handles Azure endpoint and Azure storage API requests from the frontend.

Implement Inference Caching in the Backend Instead of Frontend #56

Open MaxenceGui opened 5 months ago

MaxenceGui commented 5 months ago

Issue description

The goal is to add a caching capability to the backend that stores the outcome of an inference request made on an image. If users try different models on the same image, each inference should be executed only once: switching back to a model that has already produced an inference on the chosen image should return the outcome saved in the cache.

Steps to Reproduce

  1. Send the same request to the /inf route twice (see the sketch after this list)
  2. The first time, the system returns the inference result from the pipeline
  3. The second time, the system returns the result stored in the cache

Expected Behavior

If the cache does not contain a result for the chosen image and pipeline, the system should invoke the necessary model(s) to produce the inference result. If, on the other hand, the cache holds a result matching both the image and the pipeline, the system should simply return that cached result.

Current Behavior

The model(s) are called on every inference request, since inference results are not cached.

Possible Solution

This issue references this comment:

I'm not exactly sure how it works on the frontend, but there is already caching functionality that keeps track of the results:

```js
scores: [],
classifications: [],
boxes: [],
annotated: false,
imageDims: [],
overlapping: [],
overlappingIndices: [],
```

As of now, a new inference overwrites what is in there. Maybe we could add another parameter so that each model keeps its own scores, classifications, boxes, and everything else related:

```js
model: {
  scores: [],
  ...
}
```

`function loadResultToCache`

For the backend

To implement the same idea in the backend, we would need to keep track of each image passed in an inference request, pair it with the pipeline used, and keep the result when it is returned. Then, if the same image is requested again, we send back the result from the cache.

```mermaid
---
title: Test
---
flowchart TD

image[Send image to inference request]
check(backend checks if the image already exists <br> in the cache and if the pipeline name is the same)
inference[image not in cache <br> or pipeline not called yet]
cache[image in cache <br> and pipeline already called]
inf_res[send back result from inference]
cache_res[send back result stored in cache]

image -- from frontend --> check
check --> inference & cache
inference --> inf_res
cache --> cache_res
```

Originally posted by @MaxenceGui in https://github.com/ai-cfia/nachet-frontend/issues/96#issuecomment-1933001080

Additional Context

Tasks

rngadam commented 5 months ago

There should be some discussion in the spec about the pipeline and the underlying model versions: if a model version in the pipeline is updated, it should invalidate previously cached results. There should perhaps also be a mechanism to invalidate the cache explicitly (from the frontend?).

https://martinfowler.com/bliki/TwoHardThings.html