scanner-research / scanner

Efficient video analysis at scale
https://scanner-research.github.io/
Apache License 2.0

Scanner database files get incorrectly cached when public on GCS #219

Open willcrichton opened 6 years ago

willcrichton commented 6 years ago

If a bucket has public ACLs by default, or if anything else causes files like db_metadata.bin to become publicly readable, then Google Cloud Storage automatically caches them. By default, publicly readable objects are served with Cache-Control: public, max-age=3600. See https://cloud.google.com/storage/docs/consistency#cache-control

This causes incorrect behavior: writes to the metadata files aren't visible to readers until the cache entry expires. For example, on ingest the video metadata is written successfully, but subsequent attempts to read that metadata can fail for up to an hour.

The best solution is to expose cache control as part of Storehouse, and have Scanner disable caching for all metadata files.
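
As a rough illustration of the proposal (not Storehouse's or Scanner's actual API), here is a minimal Python sketch that decides which objects should opt out of caching and patches their Cache-Control metadata with the google-cloud-storage client. The names `cache_control_for` and `disable_caching` are hypothetical, and so is the `_metadata.bin` suffix rule: only db_metadata.bin is named above.

```python
import posixpath

# Assumption: metadata files are recognized by name. Only db_metadata.bin is
# named in this issue; the suffix rule is a hypothetical generalization.
METADATA_SUFFIX = "_metadata.bin"

# "no-store" tells GCS and intermediate caches not to cache the object.
NO_CACHE = "no-store"


def cache_control_for(object_name):
    """Return the Cache-Control value for an object, or None to keep the default."""
    base = posixpath.basename(object_name)
    if base.endswith(METADATA_SUFFIX):
        return NO_CACHE
    return None


def disable_caching(bucket_name, object_name):
    """Patch an existing GCS object so that public reads are not cached.

    Requires the google-cloud-storage package and valid credentials.
    Returns True if the object's metadata was patched, False otherwise.
    """
    # Imported lazily so the helper above works without the package installed.
    from google.cloud import storage

    value = cache_control_for(object_name)
    if value is None:
        return False
    blob = storage.Client().bucket(bucket_name).blob(object_name)
    blob.cache_control = value
    blob.patch()  # sends only the changed metadata field
    return True
```

In this sketch the policy lives in one small function, so the same rule could be applied at write time instead of as a later patch. That is closer to what exposing cache control through Storehouse would allow.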