aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0
3.85k stars · 681 forks

Insight into error `awswrangler.exceptions.QueryFailed: Iceberg cannot access the requested resource` #2823

Closed · plbremer closed 1 month ago

plbremer commented 1 month ago

Hi. I received the following error when trying to use `wr.athena.to_iceberg`. I was wondering if anyone had any insight into what the root cause might be - or at least how to get more information about the error?

I suspect there may be a permissions issue - I am trying to use IAM Access Analyzer to see what might be needed. That said, the error wording suggests that Iceberg is the thing that lacks permissions - not me?
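Before reaching for Access Analyzer, one cheap way to narrow down an S3 permissions problem is to probe the buckets directly with the same session. A minimal sketch (profile and bucket names taken from the snippet below; `check_bucket_access` is a hypothetical helper, not part of awswrangler):

```python
def check_bucket_access(s3_client, bucket: str) -> bool:
    """Probe a bucket with HeadBucket; any exception means we cannot reach it."""
    try:
        s3_client.head_bucket(Bucket=bucket)
        return True
    except Exception as err:  # botocore.exceptions.ClientError in practice
        print(f"{bucket}: {err}")
        return False

if __name__ == "__main__":
    import boto3  # assumes the same profile as in the snippet below
    session = boto3.Session(profile_name="COMPANY-lab-data-wrangler",
                            region_name="us-west-2")
    s3 = session.client("s3")
    for bucket in ("COMPANY-lab-data-lake",
                   "COMPANY-lab-a2p-commons-metabase-output"):
        print(bucket, "reachable:", check_bucket_access(s3, bucket))
```

Note this only confirms the caller's own S3 reachability; it says nothing about what the Athena/Iceberg service role can see.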

Any insight would be appreciated.

This:

import boto3
import pandas as pd
import awswrangler as wr

lab_data_wrangler_session = boto3.Session(profile_name='COMPANY-lab-data-wrangler', region_name='us-west-2')
lab_data_wrangler_s3 = lab_data_wrangler_session.client('s3')

wr.athena.to_iceberg(
    # df=total_panda,
    df=pd.DataFrame({'col': [1, 2, 3]}),
    database='COMPANY_lab_data_lake_staging',
    table='junk_table',
    table_location='s3://COMPANY-lab-data-lake/temp/iceberg/',
    temp_path='s3://COMPANY-lab-data-lake/temp/iceberg-temp/',
    boto3_session=lab_data_wrangler_session,
    s3_output='s3://COMPANY-lab-a2p-commons-metabase-output/NAMEs-directory/'
)

gives:

Traceback (most recent call last):
  File "/Users/NAME/coding_projects/COMPANY/a2p-commons/src/python/commons/tools/gsheet_to_lake.py", line 165, in <module>
    dataframe_to_iceberg(
  File "/Users/NAME/coding_projects/COMPANY/a2p-commons/src/python/commons/tools/gsheet_to_lake.py", line 69, in dataframe_to_iceberg
    wr.athena.to_iceberg(
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/_config.py", line 715, in wrapper
    return function(**args)
           ^^^^^^^^^^^^^^^^
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/_utils.py", line 178, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/athena/_write_iceberg.py", line 391, in to_iceberg
    _create_iceberg_table(
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/athena/_write_iceberg.py", line 82, in _create_iceberg_table
    wait_query(query_execution_id=query_execution_id, boto3_session=boto3_session)
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/_config.py", line 715, in wrapper
    return function(**args)
           ^^^^^^^^^^^^^^^^
  File "/Users/NAME/Library/Caches/pypoetry/virtualenvs/COMPANY-data-commons-mOl3hlmm-py3.11/lib/python3.11/site-packages/awswrangler/athena/_executions.py", line 237, in wait_query
    raise exceptions.QueryFailed(response["Status"].get("StateChangeReason"))
awswrangler.exceptions.QueryFailed: Iceberg cannot access the requested resource
kukushking commented 1 month ago

Hi @plbremer does this reproduce consistently? Does your role have S3 permissions to both the COMPANY-lab-data-lake and COMPANY-lab-a2p-commons-metabase-output buckets, as well as KMS and Glue permissions? Have you tried running it as Admin?

The error comes from the Athena service and unfortunately lacks the information that would help to debug it. You might be able to find more error details in CloudTrail; otherwise I suggest raising an AWS support request and including the query ID, which would help the service team with RCA.
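For reference, the failed query's own execution record sometimes carries more detail than the one-line message awswrangler re-raises. A hedged sketch of pulling it with boto3's Athena client (`summarize_athena_error` is a hypothetical helper; substitute the real QueryExecutionId of the failed `CREATE TABLE` statement):

```python
def summarize_athena_error(execution: dict) -> str:
    """Flatten the Status/AthenaError fields of a GetQueryExecution response."""
    status = execution["QueryExecution"]["Status"]
    err = status.get("AthenaError", {})
    return (
        f"state={status['State']} "
        f"reason={status.get('StateChangeReason', 'n/a')} "
        f"category={err.get('ErrorCategory', 'n/a')} "  # 1=system, 2=user, 3=other
        f"type={err.get('ErrorType', 'n/a')}"
    )

if __name__ == "__main__":
    import boto3
    athena = boto3.Session(profile_name="COMPANY-lab-data-wrangler",
                           region_name="us-west-2").client("athena")
    # Substitute the QueryExecutionId of the failed statement here.
    resp = athena.get_query_execution(QueryExecutionId="<query-id>")
    print(summarize_athena_error(resp))
```

An `ErrorCategory` of 2 (user error) would point back at permissions or configuration rather than an Athena-side fault.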

plbremer commented 1 month ago

@kukushking thanks for taking the time to respond

For all those curious, I am not sure exactly which permissions did the trick, but I can say that adding

resolved this issue.
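The actual policy was not captured in the thread. As a rough starting point only, a policy along these lines covers the S3, Glue, and Athena permissions kukushking listed (bucket names are from the snippet above; the action set is an assumption, not the poster's actual fix, and overly broad `Resource: "*"` entries should be scoped down):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject",
                 "s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::COMPANY-lab-data-lake",
        "arn:aws:s3:::COMPANY-lab-data-lake/*",
        "arn:aws:s3:::COMPANY-lab-a2p-commons-metabase-output",
        "arn:aws:s3:::COMPANY-lab-a2p-commons-metabase-output/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["glue:GetDatabase", "glue:GetTable", "glue:CreateTable",
                 "glue:UpdateTable", "glue:GetPartitions"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["athena:StartQueryExecution", "athena:GetQueryExecution",
                 "athena:GetQueryResults"],
      "Resource": "*"
    }
  ]
}
```

If KMS encryption is enabled on either bucket, the corresponding `kms:Decrypt`/`kms:GenerateDataKey` grants would also be needed, per kukushking's comment.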