🚀 The feature
Add the ability to automatically invalidate a cached sub-graph when the remote files change after being cached locally.
Motivation, pitch
Say I have multiple files stored in remote object storage. These files are fed into a datapipe using FSSpecFileLister and cached locally using .on_disk_cache. I want to invalidate the cache and re-compute the datapipe when one or more remote files change, probably detected by comparing their hashes.
Alternatives
No response
Additional context
This feature request originated from this conversation on the PyTorch forum.
Note that I believe extra_check_fn in .on_disk_cache can currently be used to re-compute the hash and flag any difference, but that will not automatically delete or re-download the files.
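To make the workaround concrete, here is a minimal sketch of a hash-based check. It uses only the standard library. The helper names `file_sha256` and `make_hash_check`, and the idea of an `expected` mapping from local path to known-good digest, are my own assumptions for illustration and not part of torchdata. I am also assuming that extra_check_fn is called with the local file path and should return True when the cached file is still valid.

```python
import hashlib
from typing import Callable, Dict


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a local file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_hash_check(expected: Dict[str, str]) -> Callable[[str], bool]:
    """Build a check function from a {local_path: sha256_hex} mapping.

    `expected` is hypothetical: it would have to come from somewhere
    outside the datapipe, such as a manifest published with the remote files.
    """

    def check(path: str) -> bool:
        want = expected.get(path)
        # Unknown paths are treated as invalid so they get re-fetched.
        return want is not None and file_sha256(path) == want

    return check
```

The returned function could then be passed as `extra_check_fn=make_hash_check(expected)` when calling .on_disk_cache, if my reading of that parameter is right. Even so, this only flags a mismatch. It does not delete the stale local file or trigger a re-download, which is the gap this feature request is about.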