Open fedorov opened 1 year ago
While working on download progress reporting in idc-index, I remembered this issue. An elegant way to avoid re-downloading content that is already present may be to use the sync feature of s5cmd: https://github.com/peak/s5cmd?tab=readme-ov-file#sync. What do you think? We could implement it directly in idc-index, so downloading (or syncing, as I should call it in the future) might be faster.
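For illustration, a minimal sketch of how the s5cmd `sync` subcommand could be invoked from Python. The helper names `build_sync_command` and `sync_prefix` are hypothetical and not part of idc-index, and the `--no-sign-request` flag assumes the source bucket allows anonymous reads.

```python
import subprocess
from pathlib import Path


def build_sync_command(source_prefix, dest_dir, s5cmd_path="s5cmd"):
    """Build the argument list for `s5cmd sync` (hypothetical helper)."""
    # s5cmd expects a wildcard to sync the contents of a prefix.
    source = source_prefix.rstrip("/") + "/*"
    # A trailing slash marks the destination as a directory.
    dest = str(dest_dir).rstrip("/") + "/"
    # --no-sign-request is an assumption: it only works for public buckets.
    return [s5cmd_path, "--no-sign-request", "sync", source, dest]


def sync_prefix(source_prefix, dest_dir, s5cmd_path="s5cmd"):
    """Run the sync; s5cmd decides which objects still need to be copied."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_sync_command(source_prefix, dest_dir, s5cmd_path)
    return subprocess.run(cmd, check=True)
```

Calling `sync_prefix("s3://example-bucket/some-series", "downloads")` (the bucket name is a placeholder) would leave it to s5cmd to decide which objects to copy. How well its comparison of local and remote files fits this use case would still need to be checked.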
Content already downloaded will be downloaded again if requested by the user.
This is a bit tricky, since we cannot rely on UIDs alone to confirm that the binary content in the DICOM DB is the same as in IDC. Maybe we should keep track of what was already downloaded in a separate cache, and keep the hash? We also need to check whether the hash is returned by the API.
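A rough sketch of what such a cache could look like, assuming a JSON manifest that maps local file paths to MD5 digests. All names here (`load_manifest`, `needs_download`, `record_download`) are hypothetical, and `expected_hash` stands in for a hash that may or may not be available from the API.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_manifest(manifest_path):
    """Load the path -> hash manifest, or an empty one if none exists yet."""
    manifest_path = Path(manifest_path)
    if not manifest_path.exists():
        return {}
    return json.loads(manifest_path.read_text())


def needs_download(local_path, expected_hash, manifest):
    """Decide whether a file must be (re)downloaded.

    expected_hash is the remote hash if one is known (an assumption),
    or None to only verify the file against what was recorded locally.
    """
    local_path = Path(local_path)
    recorded = manifest.get(str(local_path))
    if recorded is None or not local_path.exists():
        return True
    if expected_hash is not None and recorded != expected_hash:
        return True  # remote content differs from what was downloaded
    # Re-hash to catch files replaced or modified since the download.
    return file_md5(local_path) != recorded


def record_download(local_path, manifest, manifest_path):
    """Store the hash of a freshly downloaded file and persist the manifest."""
    manifest[str(Path(local_path))] = file_md5(local_path)
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
```

Re-hashing on every check is slow for large collections, so a real implementation might compare size and modification time first and only hash when those differ.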
There is also a situation where the user might have downloaded via the extension but then deleted the content from the DICOM DB, or (highly unlikely, but not impossible) deleted it from the DICOM DB and then imported into the DB an instance that has the same UIDs but is of a different version and has a different hash...