ove / ove-asset-manager

Asset Management for data and static resources
MIT License

Make it possible to download an entire project from the project list page #158

Open · senakafdo opened 5 years ago

senakafdo commented 5 years ago

This is a useful feature if someone wants to replicate a project locally and serve it with a simple web server such as nginx/node/python. It might also be useful to package a standard README.md or README.txt file which includes a brief write-up on how to set up these local hosting options and a list of common edits they may need to make to the files contained within the project.
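
As a minimal sketch of the "serve it with python" option, a downloaded project can be hosted with nothing but the Python standard library; the directory path and port below are placeholders, not anything the AM prescribes:

```python
# Minimal sketch: serve a downloaded project directory over HTTP.
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

PROJECT_DIR = "./my-project"  # hypothetical: wherever the project was unpacked

handler = functools.partial(SimpleHTTPRequestHandler, directory=PROJECT_DIR)
with ThreadingHTTPServer(("127.0.0.1", 8080), handler) as server:
    print("Serving", PROJECT_DIR, "at http://127.0.0.1:8080/")
    server.serve_forever()
```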

jamesscottbrown commented 4 years ago

This feature would be nice, but I don't think it is necessary: the projects are just buckets in an S3 store, and so someone can download them from the store directly using their preferred tool, e.g. rclone or s3cmd at the command line, or Cyberduck/DragonDisk/CrossFTP if they want a GUI (or, as we're currently using MinIO, you can rsync from the underlying file system).
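
For reference, the same bulk download can also be scripted; here is a sketch in Python with boto3, where the endpoint, credentials, and bucket name are placeholders rather than real deployment values:

```python
import os
import boto3

# Placeholder endpoint and credentials; substitute real deployment values.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example:9000",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

BUCKET, DEST = "my-project", "./my-project"

# Walk the bucket page by page and mirror every object to the local directory.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):  # skip directory-marker objects
            continue
        target = os.path.join(DEST, key)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        s3.download_file(BUCKET, key, target)
```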

There is also the issue that a project or asset may be very large (e.g., the set of image tiles for the In2White image), so trying to zip up everything may not be trivial.

senakafdo commented 4 years ago

the projects are just buckets in an S3 store, and so someone can download them from the store directly using their preferred tool

Can a "user" (as opposed to the maintainers of the infrastructure) get direct access to S3?

There is also the issue that a project or asset may be very large

If a project was uploaded using the AM, there should be a way of downloading it. But we can also set some limits and inform the user to discuss with the administrators if a project exceeds a certain threshold. This issue is about the 99% of the assets stored on the AM, not the 1% exceptional case.

jamesscottbrown commented 4 years ago

Can a "user" (as opposed to the maintainers of the infrastructure) get direct access to S3?

If the bucket is public (which only a few currently are), then anyone on the Imperial network can access them.

This issue is about the 99% of the assets stored on the AM, not the 1% exceptional case

Yes, but it is dangerous to add a button that will crash or hang the AM if it is pressed when an 'exceptional' project is selected. The feature needs to be implemented such that this can never happen (even if that means displaying an error message rather than proceeding with the download), which complicates the implementation.

oserban commented 4 years ago

I agree that it's dangerous to add this action for very large projects, which is why it has been postponed to a future release.

senakafdo commented 4 years ago

Yes, but it is dangerous to add a button that will crash or hang the AM if it is pressed when an 'exceptional' project is selected

Well, you should not let that button be pressed. Practically speaking, the AM needs to determine the risk and block the attempt; in other words, it needs to know what file size/count it can handle and prevent the attempt before it starts.
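
A hypothetical sketch of that guard, written as a Flask-style endpoint purely for illustration (the AM's actual web framework may differ, and the thresholds and helper stubs are made-up names, not AM internals):

```python
from flask import Flask, abort, send_file

app = Flask(__name__)
MAX_FILES = 1000          # assumed limit on object count
MAX_BYTES = 500 * 2**20   # assumed limit on total size (500 MiB)

def project_stats(bucket):
    """Placeholder: return (file_count, total_bytes) from the S3 listing."""
    raise NotImplementedError  # see the listing sketch later in this thread

def build_zip(bucket):
    """Placeholder: write the project archive to a temp file, return its path."""
    raise NotImplementedError

@app.route("/projects/<bucket>/download")
def download_project(bucket):
    count, size = project_stats(bucket)
    if count > MAX_FILES or size > MAX_BYTES:
        # Refuse up front with an explicit error instead of hanging mid-zip.
        abort(413, "Project exceeds the download limit; contact an administrator.")
    return send_file(build_zip(bucket), as_attachment=True)
```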

I agree that this may require more than the usual effort and I'm fine with adding it later, but I must note that this is a real usability limitation.

For example, I have been testing a number of X-Type visualisations which are not in the standard X-Type gallery project. Every time I do this, I need to fetch the content from the AM and modify it, or else maintain a permanent local replica. So, this model helps someone upload a project, but if they plan to improve it, they will have to maintain a local copy.

oserban commented 4 years ago

The problem is that you cannot reliably check the size of a bucket in advance. Moreover, downloading an entire bucket is not part of the standard S3 API, so in order to implement this we would need to build and store the archive on the file system for each session. It's a bit more complex than it should be for such a simple feature, unfortunately.
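
Roughly what that per-session step could look like, sketched with boto3 against a placeholder endpoint; `archive_bucket` and everything in it is illustrative, not AM internals:

```python
import tempfile
import zipfile
import boto3

s3 = boto3.client("s3", endpoint_url="http://minio.example:9000")  # hypothetical

def archive_bucket(bucket):
    """Write every object in `bucket` to a temporary zip and return its path."""
    tmp = tempfile.NamedTemporaryFile(suffix=".zip", delete=False)
    with zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as zf:
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket):
            for obj in page.get("Contents", []):
                body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
                # writestr buffers each object in memory: fine for typical
                # assets, costly for very large files.
                zf.writestr(obj["Key"], body.read())
    tmp.close()
    return tmp.name  # caller must delete this file once the download completes
```

This is the per-session file-system cost mentioned above: each download request leaves an archive on disk that has to be cleaned up afterwards.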

senakafdo commented 4 years ago

No, I wouldn't go that far. Right now we should have a way of preventing users from downloading contents that were uploaded as part of a large zip file. You can fetch the size of a file using the S3 API, so the problem could be simplified if properly implemented.

oserban commented 4 years ago

In order to prevent the download of very large buckets, you need to reliably check the size of that bucket. At the moment, people use the file API you mentioned to do this recursively, but for large file trees (such as DZI), this would be quite slow. Plus, there is no way of doing this on the storage side, so you'd have to implement it on the web service side.

senakafdo commented 4 years ago

but for large file trees (such as DZI), this would be quite slow

Yes, this is what I implied by "we should have a way of preventing users from downloading contents that were uploaded as part of a large zip file". So, first we get the file count, and if that does not exceed a maximum, we then check the file sizes and build the zip file as you would do now.
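
A sketch of that two-stage check, assuming boto3: since list_objects_v2 returns each object's size alongside its key, one paginated pass can enforce both limits and abort early on large trees such as DZI tile sets (the thresholds are illustrative):

```python
import boto3

s3 = boto3.client("s3", endpoint_url="http://minio.example:9000")  # hypothetical

MAX_FILES = 1000          # assumed count threshold
MAX_BYTES = 500 * 2**20   # assumed size threshold (500 MiB)

def downloadable(bucket):
    """Return True only if the bucket is within both limits."""
    count = total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            count += 1
            total += obj["Size"]
            if count > MAX_FILES or total > MAX_BYTES:
                return False  # abort without walking the rest of the tree
    return True
```

The early return bounds the cost of the check itself, which also speaks to the concern above about walking large file trees.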

What I'm asking for is not a solution that would work for all use cases: at least 99% of the projects that end users would want to download have relatively few files, all of reasonable sizes. Perhaps 99% is an exaggeration, but this is good even if it meets 80% of the requirement. Right now we have no easy way of downloading a project, and that impacts usability.

But anyway we can prioritise this accordingly.