Deadwood-ai / deadwood-api

Main FastAPI application for the deadwood backend
GNU General Public License v3.0

More descriptive filename for visualization and download #56

Open JesJehle opened 2 months ago

JesJehle commented 2 months ago

At the moment, neither the file_name with its huge uuid nor the file_aliases, which reveal private data, is ideal as a name to display in the application or to give the file at download time.

Teja made a good suggestion to use a combination of dataset_id + admin_level_1 + admin_level3 + datestring.
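For illustration, the suggested scheme could be sketched roughly as follows. The thread does not fix a separator, a date format, or how place names are normalised, so the underscore separator, the ISO date string, and the `slugify` helper are all assumptions, and `build_display_name` is a hypothetical name rather than existing API code.

```python
import re
import unicodedata
from datetime import date


def slugify(value: str) -> str:
    """Reduce a place name to lowercase ASCII letters, digits and hyphens.

    Assumed normalisation rule; it keeps underscores free for use as the
    separator between name components.
    """
    ascii_value = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^A-Za-z0-9]+", "-", ascii_value).strip("-").lower()


def build_display_name(
    dataset_id: int, admin_level_1: str, admin_level_3: str, acquired: date
) -> str:
    """Join dataset_id + admin_level_1 + admin_level_3 + datestring."""
    parts = [
        str(dataset_id),
        slugify(admin_level_1),
        slugify(admin_level_3),
        acquired.isoformat(),  # assumed datestring format: YYYY-MM-DD
    ]
    return "_".join(parts)


name = build_display_name(
    42, "Baden-Württemberg", "Freiburg im Breisgau", date(2023, 7, 14)
)
print(name)
```

Because the dataset_id leads the name, the result stays unique even when two datasets share the same admin levels and date.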

I think it is cleaner to implement this in the backend rather than just in the frontend. Is there any reason not to update the file_name to the given solution? If not, I would implement this in the metadata_route and update the entries in the database. @cmosig @mmaelicke What do you think?

mmaelicke commented 2 months ago

I guess this is fine, but maybe we should add that information instead of replacing the file_name. If the file_name is no longer in the database, we lose this information entirely.

We could therefore keep the file_name but only expose it via the API to authenticated users. One place where we might want to keep showing it is when users scan their uploaded files: displaying the file_name as uploaded might make it easier to identify a specific file.
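A minimal sketch of that idea, kept framework-free on purpose: the original file_name stays on the record, and the serialisation step drops it for anonymous callers. `DatasetRecord`, `display_name`, and `to_api_payload` are hypothetical names, not the actual models or routes of this API.

```python
from dataclasses import asdict, dataclass


@dataclass
class DatasetRecord:
    dataset_id: int
    file_name: str     # original upload name, kept in the database
    display_name: str  # generated, privacy-safe name (hypothetical field)


def to_api_payload(record: DatasetRecord, authenticated: bool) -> dict:
    """Return the public view of a record; file_name only if authenticated."""
    payload = asdict(record)
    if not authenticated:
        # Anonymous callers never see the name the file was uploaded under.
        payload.pop("file_name")
    return payload


record = DatasetRecord(42, "my_private_flight_plan.tif", "42_region_town_2023-07-14")
print(to_api_payload(record, authenticated=False))
print(to_api_payload(record, authenticated=True))
```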

cmosig commented 2 months ago

Yes, please keep the filename to be able to backtrack data to the original database.

cmosig commented 2 months ago

Sounds good otherwise.

You may want to think about how filenames of multiple label sets per orthophoto are handled. Maybe add the label_id if there is one.

JesJehle commented 2 months ago

The file_name only corresponds to the ortho. The labels are something different. By using the dataset_id in the name, the name should be unique.

cmosig commented 2 months ago

What do you do about the labels then?

There needs to be a deterministic way to get from the orthophoto filename to the label filename locally, without having to look up a mapping in Supabase.

JesJehle commented 2 months ago

Good question. We could implement the same logic in the labels table and create a labels_name based on dataset_id + admin_level_1 + admin_level3 + label_id. But I am not entirely happy with this...

cmosig commented 2 months ago

Seems fine to me. It's good if the prefix is the same between orthophoto and label file. Then pattern matching is easy.
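The prefix matching could look roughly like this. It is a sketch under assumptions the thread does not settle: components are joined with underscores, the last component (datestring or label_id) contains no underscore, and labels use a `.gpkg` extension purely for the example. `shared_prefix` and `label_files_for` are hypothetical helpers.

```python
from pathlib import PurePath


def shared_prefix(ortho_filename: str) -> str:
    """Strip the extension and the trailing datestring from an ortho name."""
    stem = PurePath(ortho_filename).stem
    prefix, _sep, _datestring = stem.rpartition("_")
    return prefix


def label_files_for(ortho_filename: str, candidates: list[str]) -> list[str]:
    """Pick the files that share the ortho's prefix, excluding the ortho itself."""
    prefix = shared_prefix(ortho_filename) + "_"
    return sorted(
        name
        for name in candidates
        if name.startswith(prefix) and name != ortho_filename
    )


files = [
    "42_baden-wurttemberg_freiburg_2023-07-14.tif",
    "42_baden-wurttemberg_freiburg_7.gpkg",
    "42_baden-wurttemberg_freiburg_9.gpkg",
    "43_baden-wurttemberg_freiburg_3.gpkg",
]
print(label_files_for(files[0], files))
```

Appending the separator to the prefix before matching keeps dataset 4 from accidentally matching files of dataset 42.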

We also need to think about supplying users with a metadata table/csv. This is relevant if more than one dataset is downloaded.
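One possible shape for such a table, sketched with the standard library only. The column set and the `download_name` column are assumptions for illustration; the thread does not specify which metadata fields should be shipped.

```python
import csv
import io

# Hypothetical column set; the real metadata fields are not decided here.
FIELDS = ["dataset_id", "download_name", "admin_level_1", "admin_level_3", "datestring"]


def metadata_csv(rows: list[dict]) -> str:
    """Serialise one row per downloaded dataset into CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not part of FIELDS,
    # so internal columns cannot leak into the export by accident.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "dataset_id": 42,
        "download_name": "42_region_town_2023-07-14.tif",
        "admin_level_1": "Region",
        "admin_level_3": "Town",
        "datestring": "2023-07-14",
        "file_name": "internal-uuid.tif",  # not in FIELDS, so it is dropped
    },
]
print(metadata_csv(rows))
```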

JesJehle commented 2 days ago

Since we want to expose the files via the file server (orthos, labels, and cogs are already exposed), the files need to be stored under the actual correct file name: dataset_id + admin_level_1 + admin_level3 + datestring.tif, and the same for the labels. So we would need to rename all files anyway; hence I am getting comfortable with the idea, to completely reimport all data. This would also enable us to rethink parts of the architecture. I would make a separate issue for this.

cmosig commented 2 days ago

> hence I am getting comfortable with the idea, to completely reimport all data

Would be fine for me. Some metadata is also missing anyway.

JesJehle commented 2 days ago

Yes, and we still need to find a better configuration for the cogs to make the background transparent in all images.