boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

start_data_quality_ruleset_evaluation_run in Glue provide runID #3733

Closed shivam-patil-DAX closed 1 year ago

shivam-patil-DAX commented 1 year ago

Describe the issue

start_data_quality_ruleset_evaluation_run in Glue returns only a RunId. Because there is no JobName field in start_data_quality_ruleset_evaluation_run, no job appears to get executed and all we get back is the RunId. Even the start_job API needs a JobName, so how do we proceed further with the start_data_quality_ruleset_evaluation_run API?

Links

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/start_data_quality_ruleset_evaluation_run.html

RyanFitzSimmonsAK commented 1 year ago

Hi @shivam-patil-DAX, thanks for reaching out. I'm a bit confused by what you mean when you say the job doesn't get executed. If you want to look at the results of start_data_quality_ruleset_evaluation_run, using get_data_quality_ruleset_evaluation_run requires the RunId, and returns the details you might be looking for. Let me know if you have any follow-up questions. Thanks!

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/get_data_quality_ruleset_evaluation_run.html#get-data-quality-ruleset-evaluation-run
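
For readers following along, the flow described above (start the run, keep the RunId, then look the run up with it) can be sketched roughly as below. This is a minimal sketch rather than an official recipe: the helper names `start_dq_run`, `wait_for_dq_run`, and `summarize_run` are hypothetical, as are the polling interval and poll limit, and the set of terminal statuses is an assumption based on the Glue API reference. The `glue` argument is expected to be a client such as `boto3.client("glue")`.

```python
import time

# Assumed terminal values of the run's "Status" field (per the Glue API docs).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def start_dq_run(glue, database, table, role_arn, ruleset_names):
    """Start an evaluation run against a Glue table and return its RunId."""
    response = glue.start_data_quality_ruleset_evaluation_run(
        DataSource={"GlueTable": {"DatabaseName": database, "TableName": table}},
        Role=role_arn,
        RulesetNames=ruleset_names,
    )
    return response["RunId"]


def wait_for_dq_run(glue, run_id, poll_seconds=10, max_polls=60):
    """Poll the run by RunId until it reaches a terminal status."""
    for _ in range(max_polls):
        run = glue.get_data_quality_ruleset_evaluation_run(RunId=run_id)
        if run["Status"] in TERMINAL_STATUSES:
            return run
        time.sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")


def summarize_run(run):
    """Return a one-line summary: result ids on success, ErrorString otherwise."""
    if run["Status"] == "SUCCEEDED":
        return f"SUCCEEDED, result ids: {run.get('ResultIds', [])}"
    return f"{run['Status']}: {run.get('ErrorString', 'no ErrorString returned')}"
```

No JobName is involved at any point: the RunId returned by the start call is the only handle needed to check whether the evaluation ran and, if it failed, why.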

shivam-patil-DAX commented 1 year ago

@RyanFitzSimmonsAK The thing is, when I run get_data_quality_ruleset_evaluation_run, I get this error in the response: 'ErrorString': 'LAUNCH ERROR | Error downloading from S3 for bucket: aws-glue-ml-data-quality-assets-us-east-1, key: jars/aws-glue-ml-data-quality-etl.jar.Access Denied (Service: Amazon S3; Status Code: 403; Please refer logs for details.'. However, I never created the bucket aws-glue-ml-data-quality-assets-us-east-1 anywhere in this process, nor was it generated beforehand.

RyanFitzSimmonsAK commented 1 year ago

Thanks for following up. Could you provide debug logs of this behavior? You can get debug logs by adding boto3.set_stream_logger('') to the top of your script, and redacting any sensitive information. Thanks!

shivam-patil-DAX commented 1 year ago

`2023-06-03 08:39:38,520 botocore.hooks [DEBUG] Event before-parameter-build.glue.GetDataQualityRulesetEvaluationRun: calling handler <function generate_idempotent_uuid at 0x0000018109D54700>
2023-06-03 08:39:38,520 botocore.regions [DEBUG] Calling endpoint provider with parameters: {'Region': 'us-east-1', 'UseDualStack': False, 'UseFIPS': False}
2023-06-03 08:39:38,521 botocore.regions [DEBUG] Endpoint provider result: https://glue.us-east-1.amazonaws.com
2023-06-03 08:39:38,521 botocore.hooks [DEBUG] Event before-call.glue.GetDataQualityRulesetEvaluationRun: calling handler <function add_recursion_detection_header at 0x0000018109D543A0>
2023-06-03 08:39:38,521 botocore.hooks [DEBUG] Event before-call.glue.GetDataQualityRulesetEvaluationRun: calling handler <function inject_api_version_header_if_needed at 0x0000018109D57F70>
2023-06-03 08:39:38,522 botocore.endpoint [DEBUG] Making request for OperationModel(name=GetDataQualityRulesetEvaluationRun) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'AWSGlue.GetDataQualityRulesetEvaluationRun', 'Content-Type': 'application/x-amz-json-1.1', 'User-Agent': 'Boto3/1.26.142 Python/3.9.13 Windows/10 Botocore/1.29.142'}, 'body': b'{"RunId": "dqrun-8176dbb17ee64a400edea237bcf7066u02d77860"}', 'url': 'https://glue.us-east-1.amazonaws.com/', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x000001811B145A60>, 'has_streaming_input': False, 'auth_type': None}}
2023-06-03 08:39:38,522 botocore.hooks [DEBUG] Event request-created.glue.GetDataQualityRulesetEvaluationRun: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x000001811B1458B0>>
2023-06-03 08:39:38,523 botocore.hooks [DEBUG] Event choose-signer.glue.GetDataQualityRulesetEvaluationRun: calling handler <function set_operation_specific_signer at 0x0000018109D545E0>
2023-06-03 08:39:38,524 botocore.auth [DEBUG] Calculating signature using v4 auth.
2023-06-03 08:39:38,524 botocore.auth [DEBUG] CanonicalRequest:

{
  "AdditionalDataSources": {},
  "AdditionalRunOptions": {
    "CloudWatchMetricsEnabled": true,
    "PublishCloudwatchMetrics": true
  },
  "DataSource": {
    "GlueTable": {
      "__type": "",
      "DatabaseName": "",
      "TableName": ""
    }
  },
  "ErrorString": "LAUNCH ERROR | Error downloading from S3 for bucket: aws-glue-ml-data-quality-assets-us-east-1, key: jars/aws-glue-ml-data-quality-etl.jar.Access Denied (Service: Amazon S3; Status Code: 403; Please refer logs for details.",
  "ExecutionTime": 15,
  "LastModifiedOn": 1.685444336136E9,
  "NumberOfWorkers": 2,
  "Role": "",
  "RulesetNames": ["test_hgf"],
  "RunId": "",
  "StartedOn": 1.685444304739E9,
  "Status": "FAILED"
}`

This is the output of the logger.

shivam-patil-DAX commented 1 year ago

Hey @RyanFitzSimmonsAK, do you have any idea about this error?

RyanFitzSimmonsAK commented 1 year ago

Hi @shivam-patil-DAX, the Access Denied exception indicates it might be an issue with permissions. Could you check that the role you are using has all the necessary permissions in both Glue and S3?
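
One possible way to run the permission check suggested above is the IAM policy simulator. The sketch below is a first-pass check under stated assumptions, not a definitive diagnosis: the helper name `can_role_read_object` is hypothetical, the caller needs permission to use `iam:SimulatePrincipalPolicy`, and the simulation only evaluates the policies attached to the role, so it says nothing about restrictions on the AWS-owned bucket itself. The bucket and key are the ones quoted in the ErrorString earlier in the thread.

```python
def can_role_read_object(iam, role_arn, bucket, key):
    """Simulate s3:GetObject for the role; True only if every result is 'allowed'."""
    resource_arn = f"arn:aws:s3:::{bucket}/{key}"
    response = iam.simulate_principal_policy(
        PolicySourceArn=role_arn,
        ActionNames=["s3:GetObject"],
        ResourceArns=[resource_arn],
    )
    decisions = [r["EvalDecision"] for r in response["EvaluationResults"]]
    # Other possible decisions are 'implicitDeny' and 'explicitDeny'.
    return bool(decisions) and all(d == "allowed" for d in decisions)


# Example call (the role ARN is a placeholder, not taken from the thread):
# import boto3
# iam = boto3.client("iam")
# can_role_read_object(
#     iam,
#     "arn:aws:iam::123456789012:role/my-glue-role",
#     "aws-glue-ml-data-quality-assets-us-east-1",
#     "jars/aws-glue-ml-data-quality-etl.jar",
# )
```

If this returns False, the role passed as `Role` to start_data_quality_ruleset_evaluation_run probably lacks an S3 read permission covering the Glue assets bucket, which would be consistent with the 403 in the ErrorString.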

github-actions[bot] commented 1 year ago

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.