GeoNode / geonode

GeoNode is an open source platform that facilitates the creation, sharing, and collaborative use of geospatial data.
https://geonode.org/

Download whole asset #12412

Closed mattiagiupponi closed 1 month ago

mattiagiupponi commented 1 month ago

Is your feature request related to a problem? Please describe. In some cases, the asset download returns only the main file, not all the files the asset is composed of.

Describe the solution you'd like The download link should provide the whole file structure of the asset.

Describe alternatives you've considered Two possibilities come to mind:

| Description | Pros | Cons |
| --- | --- | --- |
| The download endpoint zips the whole folder the first time and then serves the stored archive whenever it is requested | The file is always available and we don't waste time zipping | We double the space occupied by the assets, which could lead to running out of disk space unless we create a routine that deletes the zip archive after some time (can be done as a periodic Celery task) |
| We zip the whole asset structure every time | We save disk space, since we can delete the file after it is served | We waste computing time, since we have to zip the file every time |

Additional context I suggest that in both cases we use Django's StreamingHttpResponse together with FileWrapper. Here is a good example of how to implement it: https://stackoverflow.com/a/8601118/7597536
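A minimal sketch of the suggestion above, assuming the archive is written to disk first and then streamed in chunks. The helper names `zip_asset_folder` and `build_download_response` are hypothetical and not part of GeoNode; `FileWrapper` is taken from the standard library's `wsgiref.util`.

```python
import zipfile
from pathlib import Path
from wsgiref.util import FileWrapper


def zip_asset_folder(asset_dir, zip_path):
    """Write every file under asset_dir into zip_path, keeping relative paths.

    zip_path is assumed to live outside asset_dir, otherwise the archive
    would try to include itself.
    """
    asset_dir = Path(asset_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(asset_dir.rglob("*")):
            if path.is_file():
                archive.write(path, arcname=path.relative_to(asset_dir).as_posix())
    return Path(zip_path)


def build_download_response(zip_path, chunk_size=8192):
    """Stream an existing zip in chunks instead of loading it into memory."""
    # Imported lazily so the zipping helper stays usable without Django.
    from django.http import StreamingHttpResponse

    zip_path = Path(zip_path)
    response = StreamingHttpResponse(
        FileWrapper(open(zip_path, "rb"), chunk_size),
        content_type="application/zip",
    )
    response["Content-Length"] = str(zip_path.stat().st_size)
    response["Content-Disposition"] = f'attachment; filename="{zip_path.name}"'
    return response
```

Only the serving step is streamed here; building the archive still costs disk space and CPU time, which is the trade-off listed in the table.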

cc @etj @giohappy

giohappy commented 1 month ago

@mattiagiupponi I would consider configuring the Nginx X-Accel-Redirect. This will offload the download to Nginx. You can find many guides online, also for Django.
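As a rough illustration of the X-Accel-Redirect idea: the Django view returns an empty response whose header names an internal path, and Nginx serves the file itself. The sketch assumes an Nginx `location` marked `internal` is mapped to the zip folder; the `/protected-zips/` prefix and both function names are hypothetical.

```python
from urllib.parse import quote


def accel_redirect_headers(zip_name, internal_prefix="/protected-zips/"):
    """Headers asking Nginx to serve zip_name from an internal location.

    internal_prefix is a made-up example and must match the Nginx config.
    """
    return {
        "X-Accel-Redirect": internal_prefix + quote(zip_name),
        "Content-Type": "application/zip",
        "Content-Disposition": f'attachment; filename="{zip_name}"',
    }


def accel_download_response(zip_name):
    """Build an empty Django response that hands the transfer over to Nginx."""
    # Imported lazily so the header helper can be used without Django.
    from django.http import HttpResponse

    response = HttpResponse()
    for name, value in accel_redirect_headers(zip_name).items():
        response[name] = value
    return response
```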

mattiagiupponi commented 1 month ago

Anyway, I guess we still need to zip the asset before downloading it, no? Unless Nginx is able to point to a folder instead of a single file.

giohappy commented 1 month ago

Yes, sure. Let's split the concerns and focus only on the zipping part for the moment. I would go for the first solution. The worst case is doubling the disk space, but disk space is cheaper than CPU, and we can still set up a retention mechanism (as you say), which could also live outside GeoNode.
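The retention mechanism could be as small as the following sketch, which deletes cached archives older than a given age. The function name `purge_old_zips` is hypothetical; it could be called from a periodic Celery task or from a cron job outside GeoNode.

```python
import time
from pathlib import Path


def purge_old_zips(cache_dir, max_age_seconds, now=None):
    """Delete *.zip files in cache_dir older than max_age_seconds.

    Returns the sorted names of the removed files. `now` can be injected
    for testing; it defaults to the current time.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(cache_dir).glob("*.zip"):
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```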

etj commented 1 month ago

How/where would you store and manage the new zip file? Would the new file be considered as a whole new Asset?

giohappy commented 1 month ago

IMHO we will need to define a folder for the cached zip files, which is not publicly reachable.
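One possible shape for such a cache folder, as a sketch only: the archive is built on the first request and reused afterwards. Naming the zip after the asset folder and the function name `get_or_create_zip` are assumptions made for illustration, and invalidating the cache when the asset changes is not handled.

```python
import zipfile
from pathlib import Path


def get_or_create_zip(asset_dir, cache_dir):
    """Return the cached zip for asset_dir, building it on first use.

    cache_dir is assumed to be a private folder outside any public root.
    """
    asset_dir, cache_dir = Path(asset_dir), Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    zip_path = cache_dir / f"{asset_dir.name}.zip"
    if zip_path.exists():
        return zip_path

    # Build under a temporary name so a half-written archive is never served.
    tmp_path = cache_dir / (zip_path.name + ".part")
    with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(asset_dir.rglob("*")):
            if path.is_file():
                archive.write(path, arcname=path.relative_to(asset_dir).as_posix())
    tmp_path.replace(zip_path)
    return zip_path
```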

giohappy commented 1 month ago

I think a solution like zipstream-ng, and its ZipStream class in particular, is required to avoid filling up memory. This comment from StackOverflow confirms my concern about the way you generate the zip.
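To illustrate the memory concern, here is a standard-library sketch of generating a zip as a stream of chunks, so neither the archive nor any single file is held in memory as a whole. This is not the zipstream-ng API; it only shows the underlying idea, and the names `_ChunkSink` and `iter_zip_chunks` are hypothetical.

```python
import zipfile
from pathlib import Path


class _ChunkSink:
    """Write-only, non-seekable file object that collects written chunks."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        chunks, self._chunks = self._chunks, []
        return chunks


def iter_zip_chunks(asset_dir, read_size=64 * 1024):
    """Yield the bytes of a zip of asset_dir piece by piece."""
    asset_dir = Path(asset_dir)
    sink = _ChunkSink()
    with zipfile.ZipFile(sink, "w") as archive:
        for path in sorted(asset_dir.rglob("*")):
            if not path.is_file():
                continue
            arcname = path.relative_to(asset_dir).as_posix()
            info = zipfile.ZipInfo.from_file(path, arcname)
            info.compress_type = zipfile.ZIP_DEFLATED
            with path.open("rb") as src, archive.open(info, "w") as dst:
                while chunk := src.read(read_size):
                    dst.write(chunk)
                    # Hand over whatever zipfile has produced so far.
                    yield from sink.drain()
            yield from sink.drain()
    # The central directory is written when the archive is closed.
    yield from sink.drain()
```

A generator like this can be passed directly to Django's `StreamingHttpResponse`, at the price of not knowing the `Content-Length` in advance.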