nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

Don't proxy S3 links #14675

Open alexandernst opened 5 years ago

alexandernst commented 5 years ago

Is your feature request related to a problem? Please describe.

There is a use case for allowing files to be downloaded directly from S3 instead of being proxied by the instance that is running Nextcloud. The first and biggest pro is cost reduction: the cost of transfer from S3 to the internet is much lower than from EC2 to the internet. The second pro is speed. S3 will serve the file as fast as possible, while your EC2 instance will serve the file at a speed that can vary with CPU usage, network usage, instance type, etc.

Describe the solution you'd like

It should be possible to let users access S3 files directly, instead of making Nextcloud download them from S3 and then serve them to the user. Also, for security reasons, links can be signed with a 1-minute expiration. This way files can be served only to logged-in users.

Describe alternatives you've considered

N/A

Additional context

This feature request is based on the premise that Nextcloud is configured to use S3 and that the administrator has turned off end-to-end encryption.

ghost commented 2 years ago

MedVideos is willing to make a generous donation in Bitcoin or Monero for volunteers to develop this feature. If developers are interested, just answer here in this issue. Thanks.

eortegaz commented 1 year ago

Any news on this??

joshtrichards commented 1 year ago

Well, no one has proposed an actual implementation approach yet to my knowledge. The closest (in word form - not code) was @julienfastre's proposal in duplicate issue #17793, but I'm not sure where that went.

There will likely be a lot of side effects and gotchas that will have to be considered and covered, but even rough proofs of concept or experiment results would help things along.

As this is a community project, all are encouraged to propose possible approaches, bring up concerns about specific use cases (e.g. see https://github.com/nextcloud/server/issues/17793#issuecomment-633719011), or to do some experimentation (and report back!).

Ideally that's what this thread becomes a collection of - if there is truly enough interest in this functionality.

joshtrichards commented 8 months ago

Breadcrumbs for the future: #9178, which created the directdownload endpoint, gave some consideration to this. The idea was to let the underlying Storage bubble up a direct link that would then be offered up:

We first ask the storage for a link (S3 for example could provide this directly). Otherwise we generate a 60 char token.

More work will be required.
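The fallback quoted above can be sketched roughly as follows. This is only an illustration of the idea, not code from #9178: the class names, the `get_direct_download` method, and the URLs are all hypothetical, and the "signed" URL is a placeholder string rather than a real S3 signature.

```python
import secrets
import string
from typing import Optional


class LocalStorage:
    """Hypothetical storage that cannot offer a direct link."""

    def get_direct_download(self, path: str) -> Optional[str]:
        return None


class S3LikeStorage:
    """Hypothetical storage that can hand out its own link."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def get_direct_download(self, path: str) -> Optional[str]:
        # Placeholder only: a real backend would return a signed, expiring URL.
        return f"{self.base_url}/{path}?signature=PLACEHOLDER"


def make_token(length: int = 60) -> str:
    """Generate a random alphanumeric token (60 chars, as in the quote)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def direct_link(storage, path: str, token_base_url: str) -> str:
    """Ask the storage for a link first; otherwise fall back to a token URL."""
    url = storage.get_direct_download(path)
    if url is not None:
        return url
    return f"{token_base_url.rstrip('/')}/{make_token()}"
```

With a storage that returns `None`, the caller still gets a token URL served by the instance itself, so clients would not need to know which case applies.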

ohthehugemanatee commented 2 months ago

The only prior art implementation I'm aware of is the Drupal s3fs module. (Drupal is also Symfony-based.)

Internally Drupal uses multiple file providers, e.g. public:// references the public filestore, usually a publicly accessible directory on the webserver. tmp:// is for temporary files, private:// is not directly web accessible, etc.

When you register a new file provider, your implementing class provides methods for write, chown, copy, etc. There's a method for getExternalUri, which returns the URI to be handed to the browser (or embedded in the page HTML as the case may be). For private filesystems this allows Drupal to proxy the file through a streamwrapper. For public there is often some URI rewriting that takes place, e.g. if the files directory is outside of the webroot. For s3fs, this is where it generates an externally accessible (and time-bound IIRC) URI for the file on S3.
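To make the shape of that concrete, here is a minimal sketch of the per-scheme provider pattern in Python. It is not Drupal's actual API: every class, function, and path below is a hypothetical stand-in, and the S3 provider takes a signer callable so the signing details stay out of the picture.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class FileProvider(ABC):
    """Hypothetical provider: one implementation per URI scheme."""

    @abstractmethod
    def get_external_uri(self, path: str) -> str:
        """Return the URI handed to the browser for this file."""


class PublicProvider(FileProvider):
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def get_external_uri(self, path: str) -> str:
        # Plain rewrite onto a web-accessible directory.
        return f"{self.base_url}/{path}"


class PrivateProvider(FileProvider):
    def get_external_uri(self, path: str) -> str:
        # Route through the application, which checks access and proxies.
        return f"/system/files/{path}"


class S3Provider(FileProvider):
    def __init__(self, signer: Callable[[str], str]) -> None:
        self.signer = signer  # assumed to return a time-bound URL for a key

    def get_external_uri(self, path: str) -> str:
        return self.signer(path)


PROVIDERS: Dict[str, FileProvider] = {}


def register(scheme: str, provider: FileProvider) -> None:
    PROVIDERS[scheme] = provider


def external_uri(uri: str) -> str:
    """Dispatch 'scheme://path' to the provider registered for the scheme."""
    scheme, sep, path = uri.partition("://")
    if not sep or scheme not in PROVIDERS:
        raise ValueError(f"no provider registered for {uri!r}")
    return PROVIDERS[scheme].get_external_uri(path)
```

The point of the pattern is that callers only ever ask for an external URI; whether the answer is a proxied path or a direct bucket link is the provider's decision.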

You do have to be careful with CORS, and you have to get your S3 access privileges right, but it works great.

Unfortunately I don't know the analogous system in Nextcloud. As I understand it, when S3 is the primary storage, Nextcloud uses its own storage schema on S3, which doesn't map 1:1 between files and blobs. (Why?) That means that even if there is an equivalent getExternalUri method (I imagine there must be for encryption to work), there's no single blob to download. Nextcloud MUST be the middleperson there. The same applies if Nextcloud filesystem encryption is used.

What is needed is a primary object store implementation that stores files in S3 as blobs with a 1:1 relationship to files. Then the getExternalUri method can use the S3 getSignedUri method to generate a limited-access direct S3 URI with a custom filename. Honestly not a terribly complicated bit of code once you have the 1:1 file storage.
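For a sense of what the signing step involves, below is a rough standard-library sketch of an AWS Signature Version 4 presigned GET URL with a short expiry and a custom download filename (via the `response-content-disposition` query parameter). It assumes one S3 object per file, a virtual-hosted-style AWS endpoint, and placeholder credentials; the function name `presign_get` is made up for this example, and a real implementation would more likely call the SDK's presigning helper than sign by hand.

```python
import datetime
import hashlib
import hmac
from typing import Optional
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(bucket: str, key: str, access_key: str, secret_key: str,
                region: str, expires: int = 60,
                filename: Optional[str] = None,
                now: Optional[datetime.datetime] = None) -> str:
    """Build a SigV4 query-string-signed GET URL valid for `expires` seconds."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"  # assumed endpoint style
    scope = f"{datestamp}/{region}/s3/aws4_request"

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    if filename:
        # Asks S3 to send this header back, so the browser saves the file
        # under a readable name. Assumes the filename has no quote characters.
        params["response-content-disposition"] = (
            f'attachment; filename="{filename}"'
        )

    # Canonical query string: URI-encoded pairs sorted by parameter name.
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(params.items())
    )
    canonical_uri = "/" + quote(key, safe="/")
    canonical_request = "\n".join(
        ["GET", canonical_uri, query, f"host:{host}\n", "host",
         "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    k = _hmac(("AWS4" + secret_key).encode("utf-8"), datestamp)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    return f"https://{host}{canonical_uri}?{query}&X-Amz-Signature={signature}"
```

The instance would only generate such a URL after checking the user's session and then redirect the browser to it, so the file bytes never pass through the Nextcloud host.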