apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Python][CI] Some nightly jobs are failing due to ACCESS_DENIED to S3 bucket #33017

Closed asfimport closed 2 years ago

asfimport commented 2 years ago

The following nightly failures:

Reporter: Raúl Cumplido / @raulcd Assignee: Jacob Wujciak / @assignUser

Note: This issue was originally created as ARROW-17791. Please see the migration documentation for further details.

asfimport commented 2 years ago

Antoine Pitrou / @pitrou: I've seen that too. Can you restart some of those jobs to see if it's sporadic?

asfimport commented 2 years ago

Raúl Cumplido / @raulcd: I've re-run the test-conda-python-3.10 job and it failed on the retry too:

https://github.com/ursacomputing/crossbow/actions/runs/3094438413

asfimport commented 2 years ago

Joris Van den Bossche / @jorisvandenbossche: It has been failing for multiple days, so it doesn't seem sporadic.

asfimport commented 2 years ago

Joris Van den Bossche / @jorisvandenbossche: For the test-conda-python-3.8-pandas-latest build:

asfimport commented 2 years ago

Joris Van den Bossche / @jorisvandenbossche: I was checking further differences between both runs based on the logs, and the failing build has those additional env variables set:


env:
  ...
  AWS_SECRET_ACCESS_KEY: ***
  AWS_ACCESS_KEY_ID: ***
  SCCACHE_BUCKET: ***
  SCCACHE_S3_KEY_PREFIX: sccache

and those are not present in the working build. So probably something to do with the sccache change? (https://github.com/apache/arrow/pull/13556, cc @assignUser)
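The comparison described above can be sketched in Python. This is only an illustration, not tooling from the Arrow or crossbow repositories: the helper names `env_only_in_failing` and `standard_aws_credentials` are hypothetical, and the environments are passed in as plain dicts rather than parsed from real CI logs. The point it makes is that `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard names that AWS SDK clients read by default, so presumably the S3 tests picked up the sccache credentials instead of running without any.

```python
import os

# Standard AWS credential variables that appear in the failing build's log.
# AWS SDK clients typically read these names by default.
AWS_CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def env_only_in_failing(working_env, failing_env):
    """Return the variable names set in the failing build but not in the working one."""
    return sorted(set(failing_env) - set(working_env))


def standard_aws_credentials(environ=None):
    """Return the standard AWS credential variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return [name for name in AWS_CREDENTIAL_VARS if environ.get(name)]


if __name__ == "__main__":
    # Placeholder values; the real ones are masked as *** in the logs.
    working = {"PATH": "/usr/bin"}
    failing = {
        "PATH": "/usr/bin",
        "AWS_SECRET_ACCESS_KEY": "placeholder",
        "AWS_ACCESS_KEY_ID": "placeholder",
        "SCCACHE_BUCKET": "placeholder",
        "SCCACHE_S3_KEY_PREFIX": "sccache",
    }
    print(env_only_in_failing(working, failing))
    print(standard_aws_credentials(failing))
```

Calling `standard_aws_credentials()` with no argument inspects the current process environment, which could be useful as a quick check at the start of a test job.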

asfimport commented 2 years ago

Antoine Pitrou / @pitrou: Ahah, looks like we'll need to use non-standard env var names for sccache (such as SCCACHE_S3_ACCESS_KEY, SCCACHE_S3_SECRET_KEY)

asfimport commented 2 years ago

Jacob Wujciak / @assignUser: Renaming the env vars is not an option because sccache would then not detect them. But we found the issue: the sccache user needs explicit permission to access any bucket. We have now added this and will add any other buckets that need to be accessed in jobs that use sccache.

Successful run here: https://github.com/ursacomputing/crossbow/actions/runs/3094438413/jobs/5047216106
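
The thread does not show the permissions that were actually added. As a rough sketch of what "explicit permission to access a bucket" can look like, the following builds an IAM policy document granting read-only access to a list of buckets. The function name `read_only_bucket_policy`, the bucket name `example-test-data`, and the choice of `s3:ListBucket` and `s3:GetObject` are all assumptions made for illustration; the real policy attached to the sccache user may differ.

```python
import json


def read_only_bucket_policy(bucket_names):
    """Build an IAM policy document allowing listing and reading of the given buckets.

    Hypothetical helper: the actions chosen here are an assumption, not the
    policy actually used for the sccache user.
    """
    statements = []
    for name in bucket_names:
        # Listing is granted on the bucket itself.
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{name}",
        })
        # Reading is granted on the objects inside the bucket.
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{name}/*",
        })
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # "example-test-data" is a placeholder bucket name.
    print(json.dumps(read_only_bucket_policy(["example-test-data"]), indent=2))
```

Each additional bucket that the jobs need to reach would be added to the list, which matches the approach described in the comment of extending permissions bucket by bucket.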