Baseflow / flutter_cache_manager

Generic cache manager for flutter
https://baseflow.com
MIT License
739 stars 426 forks source link

Fix race condition of expired cache files #342

Closed xoox closed 2 years ago

xoox commented 2 years ago

:sparkles: What kind of change does this PR introduce? (Bug fix, feature, docs update...)

A bug of downloading-accessing race condition was fixed.

Here CacheManager.getSingleFile() was refactored to clarify two situations.

:arrow_heading_down: What is the current behavior?

While cache expired, the original implementation returns older cache information without waiting for possible new-downloaded file or modification check. In situations where ETag has changed, updated file will be downloaded using a new file name and old cache file be deleted (according to WebHelper._manageResponse() and WebHelper._setDataFromHeaders()).

If old cache file is returned immediately as in the original implementation, in cases where the old cache file is still consuming, it'll be very likely deleted by web helper routine. Thus downloading-accessing race condition arise.

:new: What is the new behavior (if this is a feature change)?

:boom: Does this PR introduce a breaking change?

No.

:bug: Recommendations for testing

It's a little tricky to reproduce this race condition bug. Using a very short valid time image, try to repeatedly get the image and update the picture widget. Substitute the image with another one on server (Same URL but changed ETag then), the repeating routine in the App will fail with exception(can't read the file). The reason is that the newly downloaded file has different file name than the old one (which is deleted by cache manager).

:memo: Links to relevant issues/docs

:thinking: Checklist before submitting

renefloor commented 2 years ago

Good catch @xoox. However I'm not fully happy with the solution. I prefer the fact that a file is returned as soon as possible. When using the stream version you will first get the old one and then get the new one. In this case it is also a problem that the file is being deleted as the library user might still be doing something with that file.

I think we should not call the _removeOldFile in the web_helper, but we'll have to delete the file at some other point in the future. Maybe during the initialisation of the cache manager? In that case we are pretty sure the file is not in use.

xoox commented 2 years ago

The solution proposed by @renefloor is a good candidate.

In our project, only getSingleFile() is used to download files for native methods. The downloading and consuming processes were independent from each other. We applied a hot fix in our local fork and hope this PR might make some better idea sparked. So please feel free to add more commits towards better approach.

My first-coming long-term solution is not calling the _removeOldFile in the web_helper. The old file should be deleted according to stalePeriod by scanning the storage area (or with the help of a cache history table). While there are too many files in the storage, this approach might not good enough.

Besides the file scanning burden to delete stale files, this delete-old-file-far-latter approach has another drawback should be considered very carefully. A couple of CDN (perhaps most of) can't handle the If-None-Match header properly. These CDN nodes always respond with 200 rather than 304 code while the cached file actually not changed (identified by ETag). The improper 200 response will make the client download the same file but resulting in a new file name in its local storage, which will make the local cache size growing dramatically (if the old file is not deleted timely). For some aggressive configuration with stalePeriod set to one year or more, the pain is even bitterer.

renefloor commented 2 years ago

@xoox I've been thinking a bit about this, but I'd like to use your solution first before creating a better fix because that will take more time. Not giving a deleted file is more important than giving it as soon as possible.

Do you mind me merging the latest develop into your branch (and pushing it) so the new workflow checks can run?

xoox commented 2 years ago

@renefloor Please feel free to merge or rebase my branch to have a quick fix for this race condition. We're expecting solutions from upstream maintainers instead of our own temporary fix.

A better valuable solution is welcome. Thanks for your efforts.

xoox commented 2 years ago

I've just rebased it onto current develop branch. Feel free to go ahead please.

codecov[bot] commented 2 years ago

Codecov Report

Merging #342 (2e9604a) into develop (30de944) will decrease coverage by 0.03%. The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #342      +/-   ##
===========================================
- Coverage    75.25%   75.21%   -0.04%     
===========================================
  Files           21       21              
  Lines          683      682       -1     
===========================================
- Hits           514      513       -1     
  Misses         169      169              
Impacted Files Coverage Δ
flutter_cache_manager/lib/src/cache_manager.dart 88.31% <100.00%> (-0.15%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 30de944...2e9604a. Read the comment docs.