deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

The unzipped cache files of a model loaded with Criteria are not verified #1746

Closed fierceX closed 1 year ago

fierceX commented 2 years ago

Description

When a model is loaded with Criteria, its files are unzipped and cached under the ~/.djl.ai/cache/ directory. However, the name of the extracted folder is not verified, so after the model is modified, the previously cached model is still loaded.

frankfliu commented 2 years ago

The cached model is keyed on the model URL. We don't have a good way to verify whether the content behind the URL has changed. We might be able to validate the cache for some URLs, but there isn't a general way to check for content changes.

Here are a few proposals to handle this issue, but most likely the burden is still on the developer's side.

  1. Add a ContentChangeDetector interface that allows developers to implement their own way of notifying DJL that the model has changed.
  2. The developer implements a background thread that periodically checks the model URL and removes the cache folder if the model has changed.
  3. If the user wants to always download a new model when loading it, the user can add a random-value parameter to the URL, like "https://resource.djl.ai/resnet.zip?ran=XXXX".
  4. Use the "ETag" header to validate whether the content has changed, but this only works if the server supports it.
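
As a rough illustration of proposal 3, the sketch below appends a random query parameter to a model URL so that each load is treated as a different URL. The class name `CacheBustingUrl` and the method `withRandomParam` are hypothetical helpers, not part of DJL; only the `ran` parameter name comes from the example above. It does not handle URLs that contain a `#` fragment.

```java
import java.util.UUID;

// Hypothetical helper, not part of DJL: builds a URL that differs on every call.
public class CacheBustingUrl {

    /** Appends a random "ran" query parameter, using "&" if the URL already has a query. */
    public static String withRandomParam(String modelUrl) {
        String separator = modelUrl.contains("?") ? "&" : "?";
        return modelUrl + separator + "ran=" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Prints the base URL followed by "?ran=" and a random UUID.
        System.out.println(withRandomParam("https://resource.djl.ai/resnet.zip"));
    }
}
```

Because the cache is keyed on the URL, each distinct value would presumably leave its own cached copy behind, so this approach trades disk space for freshness.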

frankfliu commented 1 year ago

We implemented option 4: using the "ETag" header to detect whether the model has changed.
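
For readers unfamiliar with the technique, here is a minimal sketch of an ETag-based staleness check using only the JDK's `java.net.http` client (Java 11+). It is not DJL's actual implementation; the class `EtagCheck` and its methods are hypothetical names, and the choice to treat a missing ETag as "unknown, keep the cache" is an assumption made for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical sketch, not DJL code: compares a stored ETag with the server's current one.
public class EtagCheck {

    /** Sends a HEAD request and returns the ETag header value, or null if the server sends none. */
    public static String fetchEtag(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.headers().firstValue("ETag").orElse(null);
    }

    /** The cache is considered stale only when both tags are known and they differ. */
    public static boolean isStale(String cachedEtag, String currentEtag) {
        if (cachedEtag == null || currentEtag == null) {
            return false; // assumption: without an ETag we cannot tell, so keep the cache
        }
        return !cachedEtag.equals(currentEtag);
    }

    public static void main(String[] args) {
        // Offline demonstration with made-up tag values.
        System.out.println(isStale("\"v1\"", "\"v2\""));
        System.out.println(isStale("\"v1\"", "\"v1\""));
    }
}
```

A caller would store the ETag alongside the cached files at download time, call `fetchEtag` before reusing the cache, and download again when `isStale` returns true. As noted in the proposal, this only helps when the server actually returns an ETag.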