flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.39k stars 576 forks

[BUG] Flytekit is using `no-sign-request` when using `pyflyte-fast-execute` #2771

Open stephen37 opened 2 years ago

stephen37 commented 2 years ago

Describe the bug

When using fast registration, task initialisation copies the code from S3 into the container. Flytekit uses pyflyte-fast-execute for this step. The problem is that, by default, we add the --no-sign-request argument when calling the AWS CLI for S3, which makes it impossible to download content from private S3 buckets.

Expected behavior

Requests should not be anonymous by default: we shouldn't add the --no-sign-request argument unless it is needed.

Additional context to reproduce

  1. Run pyflyte-fast-execute with the needed arguments
    
    ❯ pyflyte-fast-execute --additional-distribution s3://<private-bucket-name>/ea/<project-name>/<domain>/path_to.tar.gz --dest-dir /src

{"asctime": "2022-08-16 14:21:55,941", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', '--no-sign-request', 's3', 'cp', 's3:///ea///path_to.tar.gz', '/src']':\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\n'\n"}
{"asctime": "2022-08-16 14:21:55,941", "name": "flytekit", "levelname": "ERROR", "message": "Exception when trying to execute ['aws', 's3', 'cp', '3:///ea///path_to.tar.gz', '/src'], reason: Called process exited with error code: 1. Stderr dump:\n\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\n'"}

2. It can also be reproduced by using the AWS CLI commands directly.

When using `--no-sign-request`:
```shell
❯ aws --no-sign-request s3 cp s3://<private-bucket-name>/ea/<project-name>/<domain>/path_to.tar.gz .

fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
```

When not using `--no-sign-request`:
```shell
❯ aws s3 cp s3://<private-bucket-name>/ea/<project-name>/<domain>/path_to.tar.gz .

download: s3://<private-bucket-name>/ea/<project-name>/<domain>/path_to.tar.gz to ./fast14de81dc9a6b0f459a1ab49e1d871e01.tar.gz
```


stephen37 commented 2 years ago

I commented out the line https://github.com/flyteorg/flytekit/blob/2ca5607d6b313fa9d6aeb88a807febcefcea7cdb/flytekit/extras/persistence/s3_awscli.py#L50 on my personal fork and I confirm that it works now.

I don't see why we would drop credentials on purpose, but I might be missing something. Let me know how you would solve the problem 😄

pingsutw commented 2 years ago

At line 43, flytekit executes the command aws s3 cp .. without --no-sign-request. If that fails, flytekit retries with --no-sign-request added. We do this because some users want to access public buckets, which requires --no-sign-request in the command.

The first call always runs the command with credentials. However, it seems like you always got errors at line 43. Did you have any other error messages?
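The two-step behaviour described above can be sketched as follows. This is a minimal illustration, not flytekit's actual code: the names `s3_cp_attempts`, `run_cli` and `copy_with_fallback` are hypothetical, and the runner is injectable so the logic can be exercised without calling the AWS CLI.

```python
import subprocess
from typing import Callable, List, Sequence


def s3_cp_attempts(src: str, dest: str) -> List[List[str]]:
    """Commands to try in order: signed request first, anonymous second."""
    base = ["s3", "cp", src, dest]
    return [["aws", *base], ["aws", "--no-sign-request", *base]]


def run_cli(cmd: Sequence[str]) -> int:
    """Hypothetical default runner: execute the command, return its exit code."""
    return subprocess.run(list(cmd), capture_output=True).returncode


def copy_with_fallback(
    src: str, dest: str, runner: Callable[[Sequence[str]], int] = run_cli
) -> List[str]:
    """Try each command until one exits with 0; return the one that worked."""
    for cmd in s3_cp_attempts(src, dest):
        if runner(cmd) == 0:
            return cmd
    raise RuntimeError(f"all attempts to copy {src} failed")
```

With this shape, a private bucket succeeds on the first (signed) command, and the anonymous command is only reached when the signed one has already failed for some reason.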

stephen37 commented 2 years ago

Hey, thanks for your reply. You're indeed correct, it seems like the first time we use it we get a permission error so I guess that's why we call it with --no-sign-request

Some logs

{"asctime": "2022-08-22 10:36:42,250", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb\"download failed: s3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz to ./fast144a961300c38a213960b1abe4153f24.tar.gz [Errno 13] Permission denied: '/app/fast144a961300c38a213960b1abe4153f24.tar.gz.5bBc3186'\\n\"\n"}
{"asctime": "2022-08-22 10:36:42,761", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', '--no-sign-request', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'\n"}
{"asctime": "2022-08-22 10:36:42,762", "name": "flytekit", "levelname": "ERROR", "message": "Exception when trying to execute ['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app'], reason: Called process exited with error code: 1.  Stderr dump:\n\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'"}
{"asctime": "2022-08-22 10:36:48,427", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb\"download failed: s3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz to ./fast144a961300c38a213960b1abe4153f24.tar.gz [Errno 13] Permission denied: '/app/fast144a961300c38a213960b1abe4153f24.tar.gz.F834DF7F'\\n\"\n"}
{"asctime": "2022-08-22 10:36:48,935", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', '--no-sign-request', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'\n"}
{"asctime": "2022-08-22 10:36:48,935", "name": "flytekit", "levelname": "ERROR", "message": "Exception when trying to execute ['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app'], reason: Called process exited with error code: 1.  Stderr dump:\n\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'"}
{"asctime": "2022-08-22 10:36:54,617", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb\"download failed: s3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz to ./fast144a961300c38a213960b1abe4153f24.tar.gz [Errno 13] Permission denied: '/app/fast144a961300c38a213960b1abe4153f24.tar.gz.2B34FFEE'\\n\"\n"}
{"asctime": "2022-08-22 10:36:55,123", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', '--no-sign-request', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'\n"}
{"asctime": "2022-08-22 10:36:55,123", "name": "flytekit", "levelname": "ERROR", "message": "Exception when trying to execute ['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app'], reason: Called process exited with error code: 1.  Stderr dump:\n\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'"}
{"asctime": "2022-08-22 10:37:00,787", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb\"download failed: s3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz to ./fast144a961300c38a213960b1abe4153f24.tar.gz [Errno 13] Permission denied: '/app/fast144a961300c38a213960b1abe4153f24.tar.gz.d609eFb8'\\n\"\n"}
{"asctime": "2022-08-22 10:37:01,388", "name": "flytekit", "levelname": "ERROR", "message": "Error from command '['aws', '--no-sign-request', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app']':\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'\n"}
{"asctime": "2022-08-22 10:37:01,388", "name": "flytekit", "levelname": "ERROR", "message": "Exception when trying to execute ['aws', 's3', 'cp', 's3://s3-bucket-name/gx/flyte-neural-search/development/VBMTJHOOFFMXSXOBNMFJA2PPAU======/fast144a961300c38a213960b1abe4153f24.tar.gz', '/app'], reason: Called process exited with error code: 1.  Stderr dump:\n\nb'fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden\\n'"}
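
In each retry above, the first error is `[Errno 13] Permission denied` on a path under /app, which reads like a local write failure rather than an S3 authorization failure. That is an interpretation of the logs, not something confirmed in the thread. Under that assumption, a quick pre-flight check on the destination directory could look like this (the helper name `destination_is_writable` is hypothetical):

```python
import os


def destination_is_writable(dest_dir: str) -> bool:
    """True if dest_dir exists and the current user may create files in it."""
    # Creating a file needs write and search (execute) permission on the directory.
    return os.path.isdir(dest_dir) and os.access(dest_dir, os.W_OK | os.X_OK)
```

Calling `destination_is_writable("/app")` inside the task container, as the user the task runs as, would show whether the signed download can write its temporary file there.
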
wild-endeavor commented 2 years ago

Yeah @stephen37 we couldn't figure out a way around this. Some of the things we pull from s3 are public, and some are private. There's no fallback option in the aws cli, at least that we could find at time of writing. So we just have to make two calls.

We can try to clean up the logging so that it's less of a red herring.
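
One way the logging could be made less misleading, sketched under assumptions: keep the stderr of the signed attempt and surface that in the final exception, while logging the anonymous retry only at debug level. The function name `copy_reporting_first_error`, the logger name and the `(exit_code, stderr)` runner signature are all hypothetical, and preferring the signed error is a design assumption that may not suit buckets that are meant to be read anonymously.

```python
import logging
from typing import Callable, Sequence, Tuple

logger = logging.getLogger("s3_fallback_sketch")  # hypothetical logger name

# A runner takes a command and returns (exit_code, stderr_text).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def copy_reporting_first_error(src: str, dest: str, runner: Runner) -> None:
    """Signed attempt first, anonymous second; raise with the signed error."""
    signed = ["aws", "s3", "cp", src, dest]
    anonymous = ["aws", "--no-sign-request", "s3", "cp", src, dest]

    code, signed_err = runner(signed)
    if code == 0:
        return
    logger.debug("signed copy failed, retrying anonymously: %s", signed_err)

    code, anon_err = runner(anonymous)
    if code == 0:
        return
    logger.debug("anonymous copy also failed: %s", anon_err)
    # Report the signed attempt's error, since it reflects the caller's credentials.
    raise RuntimeError(f"could not copy {src} to {dest}: {signed_err}")
```

Applied to the logs in this thread, the raised error would carry the `[Errno 13] Permission denied` message instead of the 403 from the anonymous retry.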
