bradleyg / django-s3direct

Directly upload files to S3 compatible services with Django.
MIT License
652 stars 234 forks

Server side resource needed for s3direct? #201

Open BoPeng opened 4 years ago

BoPeng commented 4 years ago

We run a website with a heavy user-upload component and have received reports of failed uploads, which we assume are due to resource constraints during periods of heavy uploading.

Right now users upload to our server, and we process the files and upload them to S3. We are thinking of letting the files go directly to S3 instead, after which we would download them from S3, process them, and upload the results, if this would give users a more reliable upload experience.

My questions are: what resources are needed on the server side if users upload files directly to S3? How much RAM would be needed if users upload large (a few GB) files? How do the uploads scale to hundreds of simultaneous uploads? My hope is that the process would rely mostly on the user (JS) side so that there is minimal burden on the server side.

Many thanks in advance.

Andrew-Chen-Wang commented 4 years ago

Edit 2: It seems Boto3 supports pre-signed POST requests. Not sure about PUT, or how permissions would allow for it, but here are the docs: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-presigned-urls.html I'm also trying to upload several photos using an ArrayField with URLFields. I believe the S3 key that is returned when a request succeeds should contain some extra information like file name and location, correct?

@BoPeng is there any solution? I have an image-based app, and I've been scouring the internet for a solution to direct upload to S3 for users instead of using Django's PIL.verify() methodology. It seems like this is how we can do direct upload: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingHTTPPOST.html

Looking at django-s3direct, it doesn't seem to have the capability to offer a temporary signature for direct user upload. My task is REST API heavy, so this helped me more: https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html

I'm currently looking into Django storages and Django s3 storages for solutions. Let me know of anything or if Boto3 has a nice feature that we could use!

Edit: no luck in Django-storages: https://github.com/jschneier/django-storages/issues/700

And Django s3 storage is very similar to django-storages according to their docs. So it looks like we'd have to use boto3 directly or, even worse, DIY...

There is an implementation with Node: https://softwareontheroad.com/aws-s3-secure-direct-upload/ and one for Rails: https://blog.bigbinary.com/2018/09/04/uploading-files-directly-to-s3-using-pre-signed-post-request.html, both of which use a pre-signed POST request. I'm not quite sure, though, how you could add certain permissions based on server-side code... at least, that is what the diagram and docs seem to show.
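For reference, here is a minimal sketch of what the server-side half of a pre-signed POST could look like, using only the Python standard library. It builds a POST policy and signs it with AWS Signature Version 4, which is roughly the job that Boto3's `generate_presigned_post` does for you. The function name `presign_post`, its parameters, the 5 GiB size cap, and the virtual-hosted bucket URL are all assumptions made for illustration, not part of django-s3direct. The point it illustrates: the server only signs a small JSON policy, the policy conditions are where server-side restrictions (key prefix, size limit, expiry) would go, and the file bytes themselves travel from the browser to S3.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 signing-key derivation."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_post(bucket, key_prefix, access_key, secret_key, region,
                 max_bytes=5 * 1024**3, expires_in=3600, now=None):
    """Return a URL and form fields for a browser-to-S3 POST upload.

    Hypothetical helper: names, defaults, and URL style are assumptions.
    """
    now = now or datetime.now(timezone.utc)
    date_stamp = now.strftime("%Y%m%d")
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    credential = f"{access_key}/{date_stamp}/{region}/s3/aws4_request"
    expiration = (now + timedelta(seconds=expires_in)).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Server-side restrictions live here: S3 rejects a POST that violates them.
    policy = {
        "expiration": expiration,
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],      # restrict where files land
            ["content-length-range", 0, max_bytes],   # assumed size cap
            {"x-amz-algorithm": "AWS4-HMAC-SHA256"},
            {"x-amz-credential": credential},
            {"x-amz-date": amz_date},
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")

    # SigV4 signing key: HMAC chain over date, region, service, terminator.
    signing_key = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    signing_key = _hmac_sha256(signing_key, region)
    signing_key = _hmac_sha256(signing_key, "s3")
    signing_key = _hmac_sha256(signing_key, "aws4_request")
    signature = hmac.new(signing_key, policy_b64.encode("ascii"),
                         hashlib.sha256).hexdigest()

    return {
        "url": f"https://{bucket}.s3.{region}.amazonaws.com/",
        "fields": {
            "key": key_prefix + "${filename}",  # S3 substitutes the uploaded file name
            "policy": policy_b64,
            "x-amz-algorithm": "AWS4-HMAC-SHA256",
            "x-amz-credential": credential,
            "x-amz-date": amz_date,
            "x-amz-signature": signature,
        },
    }
```

A Django view could return this dictionary as JSON, and the browser would then submit a multipart form to `url` with the returned `fields` followed by the file. In this sketch the server does a handful of HMAC computations per upload, regardless of file size.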

BoPeng commented 4 years ago

I have been checking out uppy/aws-s3, which at least provides a frontend widget. I will have to figure out the Django part if it works all right.