aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

does_table_exist to work with lake formation permissions if table not exists #677

Closed tuannguyen0901 closed 3 years ago

tuannguyen0901 commented 3 years ago

Is your idea related to a problem? Please describe.

Currently does_table_exist() fails with a permission-denied error when the table does not exist: AccessDeniedException: An error occurred (AccessDeniedException) when calling the GetTable operation: Insufficient Lake Formation permission(s) on TABLE_WANT_TO_CHECK_BUT_NOT_EXIST

Even though the table does not exist in the database yet, Lake Formation still insists on checking permissions before checking whether the table exists?!

Describe the solution you'd like

I'm not sure what the appropriate solution is, since I'm not a Lake Formation expert, but I wish does_table_exist() worked in this very common scenario.


jaidisido commented 3 years ago

I believe your issue is caused by missing permissions on the database in Lake Formation for your role.

import awswrangler as wr
wr.catalog.does_table_exist(database="aws_data_wrangler_lakeformation", table="my_non_existent_table")

When I run the above:

  1. With an IAM role that has the All, Alter, Create table, Describe, and Drop permissions on the aws_data_wrangler_lakeformation database, it works just fine.
  2. As soon as I remove those permissions from the role, I receive the same error that you mention.

Wrangler does not manage IAM roles or Lake Formation permissions, so it's up to the user to set them up correctly.
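
For anyone landing here, a minimal sketch of granting database-level Lake Formation permissions to a role with boto3 might look like the following. It is not part of Wrangler. The helper names (build_grant_request, grant_database_permissions), the default permission list, and the example role ARN are hypothetical. The permission strings are my mapping of the console names listed above to the Lake Formation API values, and I have not checked which subset is the minimum needed for does_table_exist() to return False on a missing table.

```python
from typing import Any, Dict, List, Optional

# Assumption: API names for the database permissions listed in the comment
# above (Alter, Create table, Describe, Drop). The minimum required set for
# does_table_exist() is not confirmed here.
DEFAULT_DB_PERMISSIONS = ["ALTER", "CREATE_TABLE", "DESCRIBE", "DROP"]


def build_grant_request(
    role_arn: str,
    database: str,
    permissions: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the Lake Formation grant_permissions call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Database": {"Name": database}},
        "Permissions": list(permissions or DEFAULT_DB_PERMISSIONS),
    }


def grant_database_permissions(
    lakeformation_client: Any,
    role_arn: str,
    database: str,
    permissions: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Grant the permissions through the given client and return the request sent."""
    request = build_grant_request(role_arn, database, permissions)
    lakeformation_client.grant_permissions(**request)
    return request


# Example usage (requires AWS credentials held by a Lake Formation administrator;
# the role ARN below is a placeholder):
#
#   import boto3
#   grant_database_permissions(
#       boto3.client("lakeformation"),
#       "arn:aws:iam::123456789012:role/my-role",
#       "aws_data_wrangler_lakeformation",
#   )
```

The client is passed in rather than created inside the helper so the request can be inspected or tested without touching AWS.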