Closed Heineb closed 2 years ago
Hello @Heineb, could you share the variables you used to create the Terraform stack? Maybe an existing VPC or S3 bucket doesn't have the required permissions.
Hi Doug - Thanks for the swift response and a great repo.
I'm using your script and have only changed the login passwords and hardcoded the RDS password. The rest is left untouched, meaning new VPCs and a new bucket have been created. I've validated the bucket and IAM permissions, and they are set as in the script.
When I check CloudWatch I get these error messages: [2022-05-18 07:54:30 +0000] [13] [CRITICAL] WORKER TIMEOUT (pid:28)
I tried to increase the timeout by setting [GUNICORN_CMD_ARGS]="--timeout 120", but no luck.
Bucket policy: { "Version": "2012-10-17", "Statement": [ { "Sid": "Statement1", "Principal": {}, "Effect": "Allow", "Action": [], "Resource": [] } ] }
IAM policy: { "Statement": [ { "Action": [ "s3:ListBucket", "s3:HeadBucket" ], "Effect": "Allow", "Resource": [ "arn:aws:s3:::mlflow-dev-20220518065838693500000001" ] }, { "Action": [ "s3:ListBucketMultipartUploads", "s3:GetBucketTagging", "s3:GetObjectVersionTagging", "s3:ReplicateTags", "s3:PutObjectVersionTagging", "s3:ListMultipartUploadParts", "s3:PutObject", "s3:GetObject", "s3:GetObjectAcl", "s3:GetObject", "s3:AbortMultipartUpload", "s3:PutBucketTagging", "s3:GetObjectVersionAcl", "s3:GetObjectTagging", "s3:PutObjectTagging", "s3:GetObjectVersion" ], "Effect": "Allow", "Resource": [ "arn:aws:s3:::mlflow-dev-20220518065838693500000001/*" ] } ], "Version": "2012-10-17" }
The only thing is that I get an error in the policy like this: Ln 6, Col 16 Invalid Action: The action s3:HeadBucket does not exist. Did you mean s3:ListBucket? The API called HeadBucket authorizes against the IAM action s3:ListBucket.
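For readers hitting the same validation warning: since `s3:HeadBucket` is not a valid IAM action and `s3:ListBucket` already covers the HeadBucket API call, the invalid entry can simply be dropped. Below is a minimal Python sketch, not part of the repo's Terraform stack, that removes invalid and duplicate actions from a statement. The helper name `clean_statement` is hypothetical; the statement is copied from the IAM policy quoted above.

```python
import json


def clean_statement(statement, invalid_actions=frozenset({"s3:HeadBucket"})):
    """Return a copy of an IAM statement without invalid or duplicate actions.

    `clean_statement` is a hypothetical helper for illustration only.
    """
    seen = set()
    actions = []
    for action in statement["Action"]:
        # Skip actions IAM rejects and actions already kept (order preserved).
        if action in invalid_actions or action in seen:
            continue
        seen.add(action)
        actions.append(action)
    return {**statement, "Action": actions}


# Bucket-level statement copied from the policy in this thread.
bucket_statement = {
    "Action": ["s3:ListBucket", "s3:HeadBucket"],
    "Effect": "Allow",
    "Resource": ["arn:aws:s3:::mlflow-dev-20220518065838693500000001"],
}

cleaned = clean_statement(bucket_statement)
print(json.dumps(cleaned, indent=2))
```

The same helper would also collapse the repeated `s3:GetObject` entry in the object-level statement.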
Hi @Heineb thank you! :)
Let's try removing all these actions and attaching an S3 full-access policy (AmazonS3FullAccess) to the IAM role.
I also recommend checking AWS App Runner > Application logs, which contain more detailed logs.
Hi Doug,
Thanks for your support - I tried the IAM policy simulator and got error messages referring to the account's SCP permissions. So I'm conferring with our cloud team, which handles account-level settings.
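The same simulator check can be run from code. The sketch below is an illustration under assumptions, not something from this repo: it uses boto3's `simulate_principal_policy` and reports, per action, the decision and whether an organization SCP allows it. The helper names and the role ARN passed in are hypothetical, and `simulate` needs boto3 plus credentials allowed to call the simulator.

```python
def summarize_simulation(response):
    """Map each simulated action to (decision, allowed_by_scp).

    `allowed_by_scp` is None when the response carries no
    OrganizationsDecisionDetail for that action.
    """
    summary = {}
    for result in response.get("EvaluationResults", []):
        org = result.get("OrganizationsDecisionDetail", {})
        summary[result["EvalActionName"]] = (
            result["EvalDecision"],
            org.get("AllowedByOrganizations"),
        )
    return summary


def simulate(role_arn, actions, resource_arns):
    """Run the IAM policy simulator for a role (hypothetical wrapper)."""
    import boto3  # imported lazily so summarize_simulation works without boto3

    iam = boto3.client("iam")
    response = iam.simulate_principal_policy(
        PolicySourceArn=role_arn,
        ActionNames=actions,
        ResourceArns=resource_arns,
    )
    return summarize_simulation(response)


# Hand-written response in the simulator's shape, for illustration only;
# this is not output captured from the account discussed in this thread.
example_response = {
    "EvaluationResults": [
        {
            "EvalActionName": "s3:ListBucket",
            "EvalDecision": "implicitDeny",
            "OrganizationsDecisionDetail": {"AllowedByOrganizations": False},
        }
    ]
}
example_summary = summarize_simulation(example_response)
```

An action whose second value is `False` is being blocked at the organization level, regardless of what the role's own policy allows.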
@Heineb awesome buddy! Good luck and enjoy your MLflow server. :)
Hi @DougTrajano ,
Sorry for reopening this issue, but it appears it wasn't an SCP issue. A bit more information: the frontend returns this error message: "status: 503, text: 'upstream connect error or disconnect/reset before headers. reset reason: connection termination'". Have you experienced this before?
Hey buddy! Don't worry about reopening this issue.
Actually, I figured out the root cause and fixed it in the latest commits.
Essentially, the VPC needs a VPC endpoint configured for Amazon S3 so that S3 is reachable from inside the VPC.
Please try again with the new stack on the main branch and let me know if you have any issues.
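For anyone who can't update to the new stack right away, here is a rough Python sketch of adding an S3 VPC endpoint with boto3. It is only an illustration: the actual fix lives in the repo's Terraform code, and the choice of a Gateway endpoint, the helper names, and the IDs you pass in are all assumptions here.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build arguments for EC2 create_vpc_endpoint (hypothetical helper).

    Assumes a Gateway-type endpoint attached to the given route tables.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(vpc_id, region, route_table_ids)
    return ec2.create_vpc_endpoint(**params)


# Placeholder IDs for illustration; replace with the values from your VPC.
params = s3_gateway_endpoint_params("vpc-EXAMPLE", "eu-west-1", ["rtb-EXAMPLE"])
```

The route tables should be the ones used by the subnets the service connects through; otherwise traffic to S3 will not go via the endpoint.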
Hi - Thanks for the swift response. Great news. We'll try the new stack and let you know 👍
When listing artifacts I receive this error message:
Loading Artifacts Failed. Unable to list artifacts stored under {artifactUri} for the current run. Please contact your tracking server administrator to notify them of this error, which can happen when the tracking server lacks permission to list artifacts under the current run's root artifact directory.