ex-aws / ex_aws

A flexible, easy to use set of clients AWS APIs for Elixir
https://hex.pm/packages/ex_aws
MIT License

Got `ExpiredToken` error when `ExAws.S3.list_objects("my-bucket") |> ExAws.stream!` runs for a long time #824

Closed (dsdshcym closed this issue 1 year ago)

dsdshcym commented 2 years ago

Environment

Current behavior

Currently, ExAws.stream! builds a config struct once and passes it to the lazy function (https://github.com/ex-aws/ex_aws/blob/87ec5641e53983fc627918a336d1a2c310489b70/lib/ex_aws.ex#L105). When combined with list_objects, this config (and the token inside it) is reused to fetch every page, even though the pages are fetched lazily.

If we pass the list_objects stream to a long-running Task.async_stream, the token might expire before a later page is fetched:

ExAws.S3.list_objects("my-bucket")
|> ExAws.stream!()
|> Task.async_stream(fn _object -> Process.sleep(:timer.hours(24)) end, timeout: :infinity)
|> Stream.run()

And the error would be:

** (ExAws.Error) ExAws Request Error!

{:error, {:http_error, 400, %{body: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>ExpiredToken</Code><Message>The provided token has expired.</Message><Token-0>...</Token-0><RequestId>...</RequestId><HostId>...</HostId></Error>", headers: [...], status_code: 400}}}

    (ex_aws 2.2.3) lib/ex_aws.ex:87: ExAws.request!/2
    (ex_aws_s3 2.3.0) lib/ex_aws/s3/lazy.ex:7: anonymous fn/4 in ExAws.S3.Lazy.stream_objects!/3
    (ex_aws_s3 2.3.0) lib/ex_aws/s3/lazy.ex:18: anonymous fn/2 in ExAws.S3.Lazy.stream_objects!/3
    (elixir 1.12.3) lib/stream.ex:1531: Stream.do_resource/5
    (elixir 1.12.3) lib/stream.ex:1719: Enumerable.Stream.do_each/4
    (elixir 1.12.3) lib/task/supervised.ex:336: Task.Supervised.stream_reduce/7
    (elixir 1.12.3) lib/stream.ex:649: Stream.run/1

Expected behavior

No ExpiredToken error should be raised, no matter how long the stream takes to process.

I'd propose we fetch a new token every time we request a new page from S3.

dsdshcym commented 2 years ago

The same issue affects S3.Download as well: when downloading a large file from S3, if the download starts just before the token expires, requests for later chunks may raise an ExpiredToken error (since the token was issued when the first chunk was requested).

bernardd commented 2 years ago

> I'd propose we fetch a new token every time we request a new page from S3.

That's probably overkill (and not especially efficient). We know what time the token expires, so it should be sufficient to check it before each use and refresh it if it's going to expire within the next [some short period of time].
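For illustration, here is a minimal sketch of that kind of check. The module name, the function, and the 300-second window are hypothetical and not part of ex_aws; it only assumes the expiry time is available as a DateTime.

```elixir
defmodule TokenFreshness do
  # Hypothetical helper, not part of ex_aws: decides whether cached
  # credentials should be refreshed before the next request.
  @refresh_window_seconds 300

  # Returns true when the token expires within `window` seconds of `now`
  # (or has already expired).
  def needs_refresh?(%DateTime{} = expires_at, %DateTime{} = now, window \\ @refresh_window_seconds) do
    DateTime.diff(expires_at, now, :second) <= window
  end
end

now = ~U[2022-01-01 12:00:00Z]

# Expires in 2 minutes, inside the 5-minute window: refresh.
soon = TokenFreshness.needs_refresh?(~U[2022-01-01 12:02:00Z], now)

# Expires in 6 hours: keep using the cached token.
later = TokenFreshness.needs_refresh?(~U[2022-01-01 18:00:00Z], now)

IO.inspect({soon, later})
```

The check only helps if it runs before every request, which is the point the rest of this thread turns on.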

I'd welcome a PR to that effect.

dsdshcym commented 2 years ago

Hi @bernardd, Thank you for your response!

> check it before each use and refresh it if it's going to expire within the next

Actually, this is already the case now. That's the whole purpose of the AuthCache module: https://github.com/ex-aws/ex_aws/blob/main/lib/ex_aws/config/auth_cache.ex

But ExAws.stream! and S3.Download take a token at the start and reuse it for the whole stream. So if the stream runs longer than the token's lifetime, the token captured at the start expires.

So a better solution might be to call AuthCache every time we need to make a request inside ExAws.stream! or S3.Download, instead of reusing the same token all the time.
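A rough sketch of the idea, not the actual ex_aws code: everything here (PagedStream, fetch_config, fetch_page, the fake pages) is hypothetical and stands in for the real request machinery. The only point is that the config lookup happens inside the per-page function of Stream.resource rather than once up front.

```elixir
defmodule PagedStream do
  # Hypothetical sketch, not the ex_aws implementation.
  # `fetch_config` is a zero-arity function returning fresh request config
  # (in ex_aws this role would be played by a lookup going through AuthCache).
  # `fetch_page` takes (config, marker) and returns {items, next_marker_or_nil}.
  def stream(fetch_config, fetch_page) do
    Stream.resource(
      fn -> {:next, nil} end,
      fn
        :done ->
          {:halt, :done}

        {:next, marker} ->
          # Resolve the config for this page only, so a refreshed token is picked up.
          config = fetch_config.()
          {items, next_marker} = fetch_page.(config, marker)
          {items, if(next_marker, do: {:next, next_marker}, else: :done)}
      end,
      fn _state -> :ok end
    )
  end
end

# Fake credential source: hands out a new token on every call.
{:ok, counter} = Agent.start_link(fn -> 0 end)

fetch_config = fn ->
  n = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
  %{token: "token-#{n}"}
end

# Fake two-page listing keyed by marker.
pages = %{nil => {["a", "b"], "p2"}, "p2" => {["c"], nil}}

fetch_page = fn config, marker ->
  {keys, next_marker} = Map.fetch!(pages, marker)
  {Enum.map(keys, fn key -> {key, config.token} end), next_marker}
end

result = PagedStream.stream(fetch_config, fetch_page) |> Enum.to_list()
IO.inspect(result)
# [{"a", "token-1"}, {"b", "token-1"}, {"c", "token-2"}]
```

In the real library the lookup would presumably return the same cached token until it nears expiry, so the extra call per page should be cheap.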

I don't have enough time to submit a PR right now, so I opened this issue in the hope that someone can help.

bernardd commented 2 years ago

Right, yeah, that sounds like the right solution. Probably like you I'm buried under a pile of work at the moment so it isn't likely I'll get to it myself in the short term.

dsdshcym commented 2 years ago

No worries, I'll see what I can do when I get some time (Or someone else may pick it up 😆)

cjbottaro commented 2 years ago

Any update on this? We're running into the exact same issue: our creds expire after 6 hours, but our ExAws.stream! run takes longer than that.