Deadwood-ai / file-storage-api

FastAPI backend for the Deadwood-AI file storage server
GNU General Public License v3.0

File exists strategy #2

Closed mmaelicke closed 5 months ago

mmaelicke commented 5 months ago

There is a TODO in the code:

https://github.com/Deadwood-ai/file-storage-api/blob/aebffb62570f421b68ad2085366d5f876bd0ae28/storage/app.py#L63

This affects the case where a user uploads a file that is already present on the file storage server. Right now, file names from the user's file system are preserved, and each file is saved under the same name on the storage server.

I think these kinds of conflicts can arise, e.g., when drones or drone software follow a common file-naming pattern: we might end up with a my-drone-image-1.tif and a my-drone-image-2.tif per user, or even per flight.

There are different options here:

  1. Each user uploads into their own folder, so name conflicts are per-user and can be handled by overwriting or by raising an exception.
  2. We assign each file a new (random) filename, and the upload filename is preserved as metadata (upload name?).
  3. We assign each file a new (random) filename and do not care about the original filename (weird).
  4. We prefix each upload with a date string.
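
To make option 1 concrete, here is a minimal sketch of a per-user folder with a selectable conflict strategy. It is not the code behind the TODO in storage/app.py; the function name `resolve_upload_path`, the `on_conflict` parameter, and the folder layout `root/user_id/filename` are all assumptions made for illustration.

```python
from pathlib import Path


def resolve_upload_path(
    root: Path, user_id: str, filename: str, on_conflict: str = "error"
) -> Path:
    """Return the target path for an upload inside the user's own folder.

    Hypothetical helper: names and layout are assumptions, not the project's API.
    """
    if on_conflict not in ("error", "overwrite"):
        raise ValueError(f"unknown conflict strategy: {on_conflict!r}")

    # Keep only the last path component and reject empty or relative names,
    # so the upload name cannot point outside the user's folder.
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        raise ValueError(f"invalid filename: {filename!r}")

    target = root / user_id / safe_name

    # Name conflicts are now per-user: either refuse or allow the overwrite.
    if target.exists() and on_conflict == "error":
        raise FileExistsError(f"{target} already exists")

    target.parent.mkdir(parents=True, exist_ok=True)
    return target
```

With this layout, two users can both upload my-drone-image-1.tif without colliding, but a second flight by the same user still hits the conflict branch.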

JesJehle commented 5 months ago

I would propose a filename that combines name + uuid to identify the image on the file server, and storing the name and the hash of the image, together with all other metadata, in the database. This way

What do you think?
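
A rough sketch of how the name + uuid proposal could look, using only the Python standard library. Everything here is an assumption for illustration: the `StoredFile` record stands in for the database row, `store_upload` is a hypothetical helper, and both the `<stem>_<uuid><suffix>` naming layout and the choice of SHA-256 as the hash are just one possible reading of the proposal.

```python
import hashlib
import uuid
from dataclasses import dataclass
from pathlib import Path


@dataclass
class StoredFile:
    """Metadata that would go into the database (hypothetical schema)."""
    file_id: str        # UUID that identifies the file on the server
    original_name: str  # name of the file on the user's file system
    stored_name: str    # name actually used on the storage server
    sha256: str         # content hash, e.g. to detect duplicate uploads


def store_upload(storage_dir: Path, original_name: str, content: bytes) -> StoredFile:
    """Save an upload under a unique name and return its metadata record."""
    file_id = str(uuid.uuid4())

    # Combine the original stem with the UUID and keep the extension last,
    # e.g. my-drone-image-1_<uuid>.tif (assumed layout).
    base = Path(original_name).name
    stored_name = f"{Path(base).stem}_{file_id}{Path(base).suffix}"

    storage_dir.mkdir(parents=True, exist_ok=True)
    (storage_dir / stored_name).write_bytes(content)

    return StoredFile(
        file_id=file_id,
        original_name=original_name,
        stored_name=stored_name,
        sha256=hashlib.sha256(content).hexdigest(),
    )
```

Uploading the same my-drone-image-1.tif twice then yields two distinct stored names, while the identical hash in the metadata still reveals that the content is the same.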

mmaelicke commented 5 months ago

Even better. UUIDs are also great for ensuring computational provenance in later workflows.