Closed jayceslesar closed 5 months ago
Hi @jayceslesar ,
Thank you for reaching out.
Current user is not authenticated with IAM credentials
I do not see the AWS temporary IAM credentials being passed to your call to connect()
, directly nor via AWS profile. Passing user
and password
will result in native database authentication, despite setting IAM=True
, which does not satisfy the IAM requirement. Please try using AWS profile or passing the IAM credentials directly and let me know if the issue persists. Tutorials can be found here for IAM connections.
Thanks for the fast response, it seems that reinforces/confirms To access the awsdatacatalog database from a cluster, you must use the authentication method Temporary credentials using your IAM identity. For more information about this authentication method, see [Connecting to an Amazon Redshift database](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2-using.html#query-editor-v2-connecting). Your query editor v2 administrator might need to configure the Account settings for the account to display this authentication method on the connection window.
? If that is the case, then redshift should probably raise a better error + documentation around doing this type of stuff should be cleaned up and is out of scope for this project.
thanks for the feedback, Jayce. There should be a Provide feedback
at the bottom of each AWS docs page where you can provide feedback for the AWS documentation. Closing this issue out, but please let me know if there's anything else I can help with.
@Brooke-white I see the same error when specifying temporary credentials as done in the tutorial you linked to.
import redshift_connector
conn = redshift_connector.connect(
iam=True,
database='dev',
db_user='awsuser',
password='',
user='',
cluster_identifier='<redacted>',
profile='<redacted>',
auto_create=True
)
cursor = conn.cursor()
cursor.execute("SHOW SCHEMAS FROM DATABASE awsdatacatalog;")
Results in
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/gshand/code/adhoc/redshift_connector/.venv/lib/python3.11/site-packages/redshift_connector/cursor.py", line 241, in execute
self._c.execute(self, operation, args)
File "/Users/gshand/code/adhoc/redshift_connector/.venv/lib/python3.11/site-packages/redshift_connector/core.py", line 1819, in execute
self.handle_messages(cursor)
File "/Users/gshand/code/adhoc/redshift_connector/.venv/lib/python3.11/site-packages/redshift_connector/core.py", line 2005, in handle_messages
raise self.error
redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': '42501', 'M': 'Error retrieving columns from the auto-mounted catalog: Current user is not authenticated with IAM credentials.', 'F': '../src/redcat/redcat_utils.cpp', 'L': '409', 'R': 'validateGlueAutoMountConfigParams'}
I can also verify that the automounted awsdatacatalog
db on the target cluster is able to be accessed via temporary IAM credentials in the Redshift Query Editor V2.
If you check https://docs.aws.amazon.com/redshift/latest/dg/r_STL_CONNECTION_LOG.html what is shown for the entry pertaining to your connection attempt in the authmethod
column?
It is showing IAM AssumeUser
, the same as the working credentials from query editor V2.
So, it seems like this must be something on AWS side. I will open a support ticket with them, and report back if there's anything that might be relevant to this package.
Thanks @Brooke-white, appreciate you looking into this.
Thanks, Grayden. Closing this out but please re-open if needed.
Hi @Brooke-white,
I've finally got some useful feedback from AWS support.
For provisioned redshift clusters , you will now need to use get-cluster-credentials-with-iam API. This is the API which need to be used while accessing awsdatacatalog from JDBC.
[+] GetClusterCredentialsWithIAM - https://docs.aws.amazon.com/redshift/latest/APIReference/API_GetClusterCredentialsWithIAM.html
You will need to modify the code to call above API and use the credentials returned using above API call for connection, accordingly for provisioned and serverless redshift types.
It does look like this API call is supported by redshift_connector, https://github.com/aws/amazon-redshift-python-driver/blob/master/redshift_connector/iam_helper.py#L39-L63. However, it seems the way we're invoking the connect method, it's defaulting to the IAM_V1
method.
One other inconsistency I'm seeing is that with the IAM_V2
method (GetClusterCredentialsWithIAM), the database user is set automatically by redshift.
The database user is mapped 1:1 to the source AWS Identity and Access Management (IAM) identity.
So the db_user
parameter should be None
. However, if I try to leave db_user=None
, I receive the following error:
redshift_connector.error.InterfaceError: Connection parameter db_user must be specified when using IAM authentication
Curious to get your thoughts on this!
One more update, I've confirmed that the problem I describe above is the root cause of this issue.
The following snippet calls the GetClusterCredentialsWithIAM
api directly, and bypasses the redshift_connector IAM auth. It's able to complete the query that was failing originally.
from redshift_connector import connect
import boto3
import json
redshift = boto3.client("redshift")
resp = redshift.get_cluster_credentials_with_iam(
DbName='dev',
ClusterIdentifier='<redacted>,
DurationSeconds=1000,
)
db_user = resp['DbUser']
db_password = resp['DbPassword']
conn = connect(
database="dev",
user=db_user,
password=db_password,
host="<redacted>"
)
cursor = conn.cursor()
cursor.execute("SHOW SCHEMAS FROM DATABASE awsdatacatalog;")
result = cursor.fetchall()
print(result)
I think as a fix, I would suggest making the following change:
iam=True
, but dbuser=None
, use the IAM_V2 method (GetClusterCredentialsWithIAM)I can open up a draft MR for this change if helpful.
Hi @graydenshand , to use GetClusterCredentialsWithIAM
please add the connection parameter group_federation=True
. It is required for use of GetClusterCredentialsWithIAM
. The IAM_V2 API does not allow the user to provide a user as the IAM_V1 API did. https://docs.aws.amazon.com/redshift/latest/APIReference/API_GetClusterCredentialsWithIAM.html
Ah, thanks @Brooke-white! Confirmed that adding group_federation=True
fixes the issue.
I would never have guessed to try that based on the description of that parameter:
group_federation (bool): Use the IdP Groups in the Redshift. Default value False
What do you think of my suggestion of defaulting to GetClusterCredentialsWithIAM if iam=True and dbuser is None
?
Thanks for that feedback, @graydenshand . I will make a note to improve our docs to give more clear instructions on how users can enable use of the IAM_V2 API.
What do you think of my suggestion of defaulting to GetClusterCredentialsWithIAM if iam=True and dbuser is None?
I believe we default to the V1 API for backwards compatibility reasons, but I will verify with the team and report back.
Apologies for the late response. I've confirmed with the team that this behavior is in place due to backwards compatibility reasons. As of now, we do not have plans to change the implementation.
Driver version
2.0.913
Redshift version
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.54052
Client Operating System
MacOs
Python version
3.10.6
Table schema
N/A
Problem description
awsdatacatalog
with an IAM role (but cannot)Python Driver trace logs
Reproduction code
I created a role in redshift, granted it access to awsdatacatalog, and am running the following:
gives: