what-digital / divio

A project for tracking divio.com deployment issues

"divio project push media" fails for big media folders (several gbs of data) #16

Open vnosov93 opened 4 years ago

vnosov93 commented 4 years ago

Conversation with Divio Started on June 30, 2020 at 08:36 AM Stockholm time CEST (GMT+0200)

--- June 30, 2020 ---

08:36 AM | Vladimir Nosov: Hello, I have a problem uploading media files. I need to upload 16 GB of media files to the server. I first tried it with the files in my local environment, running divio project push media test, and got a MemoryError. After that I tried uploading via the AWS CLI (aws s3 cp), adding --acl public-read as described at https://docs.divio.com/en/latest/how-to/interact-storage/#storage-acls-access-control-lists, but I still cannot access my media files: I get a 403 error in the browser. How can I upload the media files?

08:36 AM | Operator: Divio typically replies in a few hours.

09:44 AM | Mebrahtu from Divio: Hi Vladimir, Let me check this with my colleague and get back to you. Regards, Mebrahtu

10:02 AM | Daniele from Divio: After you uploaded the files using the cp command of the AWS S3 CLI, were you able to verify that the files were present? If so, please use aws s3api put-object-acl to set the ACL on a particular object and test that in the browser. Then you can look at applying the ACL to all files.

06:36 PM | Vladimir Nosov: Yes, the files were present there. I ran aws s3api put-object-acl --bucket bucket_name --key file_path --acl public-read. Next I ran get-object-acl and got: { "Owner": { "ID": "..." }, "Grants": [ { "Grantee": { "ID": "...", "Type": "CanonicalUser" }, "Permission": "FULL_CONTROL" }, { "Grantee": { "Type": "Group", "URI": "http://acs.amazonaws.com/groups/global/AllUsers" }, "Permission": "READ" } ] } Then I checked in the browser and got:

<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>973A1E0DF99A6C5B</RequestId><HostId>8YBoSGxUTIfKQ5PoDhIxlTrFt9Gq5QKkq1AOaeL5zAJGbjjLsMa0JuwYR5/0eIWcFQ+R4rs7xHQ=</HostId></Error>

--- July 1, 2020 ---

10:15 AM | Daniele from Divio: The ACL looks correct to me. The public-read ACL applies FULL_CONTROL for the owner and READ for AllUsers. Are you 100% certain that you are hitting the correct URL? Perhaps you could share more details about the object, including URLs.

--- July 2, 2020 ---

04:37 AM | Vladimir Nosov: I used this command: aws s3api put-object-acl --bucket test.test-media.ch --key uploads/podcasts/2020/05/10/test.test.720x420_q90_crop_upscale.jpg --acl public-read --profile test-stage. I'm trying to fetch the image from https://testpy3-test-d0b577161e554-8190916.divio-media.org/uploads/podcasts/2020/05/10/test.jpg.720x420_q90_crop_upscale.jpg

11:49 AM | Michal from Divio: Hi Vladimir, In the message above, it looks like you are uploading the media files to the test.test-media.ch bucket, which, as far as I can tell, is entirely under your control, i.e. not created through our infrastructure, but then you are trying to fetch the files from the bucket testpy3-test-d0b577161e554-8190916.divio-media.org, which is the default one created for the environment by us. Is there any reason why the bucket on divio-media.org should contain the file you are trying to fetch from it? Am I missing something here? Best regards, Michal


Exported from Divio on July 3, 2020 at 02:04 PM Stockholm time CEST (GMT+0200)
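
Michal's observation in the export above (the ACL was set on one bucket while the browser request went to another) can be checked mechanically. The following is only a rough sketch: it assumes the media host name is the bucket name, as in Michal's message, and the function name served_from_bucket is made up for illustration.

```python
from urllib.parse import urlparse


def served_from_bucket(bucket: str, url: str) -> bool:
    """Return True if the URL's host matches the bucket name.

    Assumption: the media domain is identical to the bucket name,
    as described in the support reply. Other setups (for example a
    CDN in front of the bucket) would need a different check.
    """
    host = urlparse(url).hostname or ""
    return host.lower() == bucket.lower()
```

With the values from the conversation, served_from_bucket("test.test-media.ch", "https://testpy3-test-d0b577161e554-8190916.divio-media.org/uploads/podcasts/2020/05/10/test.jpg.720x420_q90_crop_upscale.jpg") is False, which is the mismatch Michal points out.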

viktor-yunenko commented 4 years ago

@vnosov93, will you be able to post your solution here?

sgordeychuk commented 3 years ago

I had the same issue many times on stage and live instances of two projects after uploading media (about 10 GB in my case) with the AWS CLI tool and --acl public-read. After the upload, I found 403 errors for many images in random places. Eventually I found that the ACL is sometimes not set after the upload (this can be checked with the aws s3api get-object-acl command), so it could be an AWS CLI issue or something else.
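
To make the get-object-acl check less manual, something like the following could inspect the JSON that the command prints. This is a minimal sketch, not a tested tool: the function name is_public_read is hypothetical, and it only looks for the AllUsers group grant that appears in the ACL output quoted earlier in this issue.

```python
import json

# Group URI that S3 uses for anonymous (public) access, as seen in the
# get-object-acl output quoted in the conversation above.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def is_public_read(acl_json: str) -> bool:
    """Return True if the ACL document grants read access to AllUsers."""
    acl = json.loads(acl_json)
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if (
            grantee.get("Type") == "Group"
            and grantee.get("URI") == ALL_USERS_URI
            # FULL_CONTROL includes READ, so accept both.
            and grant.get("Permission") in ("READ", "FULL_CONTROL")
        ):
            return True
    return False
```

The input would be the output of aws s3api get-object-acl --bucket bucket_name --key file_path for a single object. An object that returns False here is a candidate for the put-object-acl fix.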

I solved it by accessing every image (calling almost any method of the Image object) from a custom management command to build a list of "access denied" images, and then manually setting the ACL for each of them with the command mentioned above: aws s3api put-object-acl --bucket bucket_name --key file_path --acl public-read. We could probably develop a script that loops over all images, checks for a 403 error, and calls the aws s3api command to fix the ACL. I'll try to implement it the next time I run into this issue.
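
A rough sketch of such a script, under several assumptions: the list of object keys comes from somewhere else (for example a management command or a bucket listing), the media is served under a single base URL, and the aws CLI is installed and configured. All function names here are made up for illustration; only the aws s3api put-object-acl invocation is taken from this thread.

```python
import subprocess
import urllib.error
import urllib.parse
import urllib.request


def http_status(url):
    """Return the HTTP status code of a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; keep the code.
        return error.code


def find_denied(base_url, keys, status_of=http_status):
    """Return the keys whose public URL answers with 403."""
    denied = []
    for key in keys:
        url = base_url.rstrip("/") + "/" + urllib.parse.quote(key)
        if status_of(url) == 403:
            denied.append(key)
    return denied


def put_acl_command(bucket, key, profile=None):
    """Build the put-object-acl command used earlier in this issue."""
    command = ["aws", "s3api", "put-object-acl", "--bucket", bucket,
               "--key", key, "--acl", "public-read"]
    if profile:
        command += ["--profile", profile]
    return command


def fix_denied(bucket, base_url, keys, profile=None, dry_run=True):
    """Set public-read on every key that currently returns 403."""
    denied = find_denied(base_url, keys)
    for key in denied:
        command = put_acl_command(bucket, key, profile)
        if dry_run:
            print(" ".join(command))  # only show what would be run
        else:
            subprocess.run(command, check=True)
    return denied
```

Running fix_denied with dry_run=True first prints the commands without changing anything, which seems safer on a live bucket. Note that a 403 can have other causes than a missing ACL (for example a request to the wrong bucket, as in the conversation above), so the list of denied keys is worth a quick look before applying the fix.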

cc @macolo @victor-yunenko

macolo commented 3 years ago

@vnosov93 is it possible that this is a duplicate of #36? If yes, can you mark this as a duplicate and close this ticket? If no, can you change the issue title to clearly distinguish this from #36?