flux-framework / dyad

DYAD: DYnamic and Asynchronous Data Streamliner
GNU Lesser General Public License v3.0

remove a replicated file on close at consumer #68

Open JaeseungYeom opened 7 months ago

JaeseungYeom commented 7 months ago

As the simplest form of cache management, we need to remove a file used on the consumer side at the earliest opportunity. If it is certain that a file is used by only a single consumer at a time, we can safely remove it when the consumer closes the file that it opened in read-only mode. If a file may be read by multiple consumers, we cannot simply remove it on close. In that case, we can rely on a lock for shared reading and remove the file only when no other consumer holds a lock on it. Another option might be to maintain a reference counter and update it based on inotify events. We can expose an environment variable to enable a specific behavior.
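
The lock-based option could look roughly like the sketch below. It is not DYAD code: `consumer_open`, `consumer_close`, and the environment variable `EXAMPLE_REMOVE_ON_CLOSE` are hypothetical names chosen for illustration, and it assumes that every consumer takes a shared `flock()` lock when it opens the file and that the file system honors `flock()` across the consumers involved. On close, the consumer tries a non-blocking upgrade to an exclusive lock, which succeeds only if no other reader still holds a lock, and removes the file in that case.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical variable name, used only for this sketch. */
#define REMOVE_ON_CLOSE_ENV "EXAMPLE_REMOVE_ON_CLOSE"

/* Open a replicated file read-only and register as a reader by taking a
 * shared advisory lock. Returns the file descriptor, or -1 on error. */
int consumer_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_SH) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Close the file. If removal is enabled through the environment variable
 * and no other reader holds a lock, remove the file first.
 * Returns 1 if the file was removed, 0 if it was kept, -1 on close error. */
int consumer_close(int fd, const char *path)
{
    int removed = 0;
    const char *mode = getenv(REMOVE_ON_CLOSE_ENV);

    if (mode != NULL && strcmp(mode, "1") == 0) {
        /* The non-blocking upgrade to an exclusive lock fails while any
         * other open file description still holds a shared lock. */
        if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
            if (unlink(path) == 0)
                removed = 1;
        }
    }

    /* Closing the descriptor also releases whatever lock is still held. */
    if (close(fd) != 0)
        return -1;
    return removed;
}
```

With two readers of the same file, the first `consumer_close` keeps the file and the last one removes it. A consumer that tries to open the file after the removal would have to fetch it again, so this sketch only fits cases where late re-reads are rare or acceptable.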

This targets the simplest workflow scenario, in which a single producer-consumer pair shares files exclusively and each file is, for the most part, read only once. Other scenarios would need more advanced cache management schemes.