iann0036 / AWSConsoleRecorder

Records actions made in the AWS Management Console and outputs the equivalent CLI/SDK commands and CloudFormation/Terraform templates.
MIT License

Create Glue Crawler is missed #35

Open alytle opened 5 years ago

alytle commented 5 years ago

Describe the bug: When creating a Glue Crawler from the console, the call to create the Crawler itself is missed.

Related Mapping: glue.CreateCrawler

Related Language: n/a

To Reproduce

  1. Go to https://console.aws.amazon.com/glue/home?region=us-east-1#catalog:tab=crawlers
  2. Click Add Crawler
  3. Fill out required information
  4. Click Add Crawler

Expected behavior: The Glue Crawler creation call would appear in the resulting code. Currently the secondary items created by the same wizard (Glue Connection, Glue Database) are captured successfully, but not the Crawler itself.

Screenshots: n/a

Additional context: n/a

iann0036 commented 5 years ago

Hi Andrew,

Thanks for raising this. I tried to reproduce it, but the crawler resource was captured successfully every time. Could you try to reproduce it again and check both the main.html and bg.js console logs to see if there are any obvious issues there?

Cheers, Ian.

alytle commented 5 years ago

OK, I tried again today. I was able to narrow down the problem slightly. When I create a Glue Crawler with an S3 input source, it seems to work, but if I create one which has a JDBC datastore, it doesn't get captured.

Here are the bg.js logs from the successful S3 crawler creation:

{"actionResponses":[{"action":"com.amazonaws.console.glue.shared.UserPreferenceRequestContext.getUserPreference"}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getSecurityConfigurations","data":{"securityConfigurations":[],"nextToken":""}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AmazonS3Context.listBuckets","data":<redacted>]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getConnections","data":{"connectionList":[{"name":"test","description":"","connectionType":"JDBC","connectionProperties":{"JDBC_ENFORCE_SSL":"false","JDBC_CONNECTION_URL":"jdbc:postgres://something:5432/databasename","USERNAME":"username"},"physicalConnectionRequirements":{"subnetId":"subnet-<redacted>","securityGroupIdList":["sg-<redacted>"],"availabilityZone":"us-east-1b"},"creationTime":1554130760555,"lastUpdatedTime":1554130760555}]}}]}  bg.js:5398:13
{"actionResponses":[{"action":"com.amazonaws.console.glue.shared.IAMRequestContext.listRoles","data":[{"roleName":"AWSGlueServiceRole-Glue","roleId":"<redacted>","arn":"arn:aws:iam::<redacted>:role/service-role/AWSGlueServiceRole-Glue"}]}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getCrawler","error":{"message":"{\"service\":\"AWSGlue\",\"statusCode\":400,\"errorCode\":\"EntityNotFoundException\",\"requestId\":\"e7f01c7f-548f-11e9-964b-05ff9084c187\",\"errorMessage\":\"Crawler entry with name s3-crawler does not exist\",\"type\":\"AwsServiceError\"}","code":400}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AmazonS3Context.listBuckets","data":[<redacted>]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getDatabases","data":{"databaseList":[{"name":"default","description":"Default Hive database","locationUri":"hdfs://ip-172-20-79-123.ec2.internal:8020/user/hive/warehouse","createTime":1528479163000},{"name":"sampledb","description":"Sample database","parameters":{"CreatedBy":"Athena","EXTERNAL":"TRUE"},"createTime":1528479057000}]}}]}  bg.js:5398:13

Calling notify  bg.js:2022:5

Type error for parameter options (Property "buttons" is unsupported by Firefox) for notifications.create.  bg.js:2023

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.createCrawler","data":{}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.listCrawlers","data":{"crawlerNames":["s3-crawler"]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getTaggedResources","data":{"paginationToken":"","resourceTagMappingList":[]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.batchGetCrawlers","data":{"crawlers":[{"name":"s3-crawler","role":"service-role/AWSGlueServiceRole-Glue","targets":{"s3Targets":[{"path":"s3://andlytle-test/","exclusions":[]}],"jdbcTargets":[],"dynamoDBTargets":[]},"databaseName":"default","classifiers":[],"schemaChangePolicy":{"updateBehavior":"UPDATE_IN_DATABASE","deleteBehavior":"DEPRECATE_IN_DATABASE"},"state":"READY","crawlElapsedTime":0,"creationTime":1554131284000,"lastUpdated":1554131284000,"version":1}],"crawlersNotFound":[]}}]}  bg.js:5398:13
{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getCrawlerMetrics","data":{"crawlerMetricsList":[{"crawlerName":"s3-crawler","timeLeftSeconds":0.0,"stillEstimating":false,"lastRuntimeSeconds":0.0,"medianRuntimeSeconds":0.0,"tablesCreated":0,"tablesUpdated":0,"tablesDeleted":0}]}}]}  bg.js:5398:13

And here are the logs from the unsuccessful JDBC creation:

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getCrawler","error":{"message":"{\"service\":\"AWSGlue\",\"statusCode\":400,\"errorCode\":\"EntityNotFoundException\",\"requestId\":\"8b011b01-5490-11e9-bb13-8bbe5f181416\",\"errorMessage\":\"Crawler entry with name jdbc-crawler does not exist\",\"type\":\"AwsServiceError\"}","code":400}}]}  bg.js:5398:13
{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AmazonS3Context.listBuckets","data":[<redacted>]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getDatabases","data":{"databaseList":[{"name":"default","description":"Default Hive database","locationUri":"hdfs://ip-172-20-79-123.ec2.internal:8020/user/hive/warehouse","createTime":1528479163000},{"name":"sampledb","description":"Sample database","parameters":{"CreatedBy":"Athena","EXTERNAL":"TRUE"},"createTime":1528479057000}]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.createCrawler","data":{}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.listCrawlers","data":{"crawlerNames":["jdbc-crawler","s3-crawler"]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getTaggedResources","data":{"paginationToken":"","resourceTagMappingList":[]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.batchGetCrawlers","data":{"crawlers":[{"name":"s3-crawler","role":"service-role/AWSGlueServiceRole-Glue","targets":{"s3Targets":[{"path":"s3://andlytle-test/","exclusions":[]}],"jdbcTargets":[],"dynamoDBTargets":[]},"databaseName":"default","classifiers":[],"schemaChangePolicy":{"updateBehavior":"UPDATE_IN_DATABASE","deleteBehavior":"DEPRECATE_IN_DATABASE"},"state":"READY","crawlElapsedTime":0,"creationTime":1554131284000,"lastUpdated":1554131284000,"version":1},{"name":"jdbc-crawler","role":"service-role/AWSGlueServiceRole-Glue","targets":{"s3Targets":[],"jdbcTargets":[{"connectionName":"test","path":"%","exclusions":[]}],"dynamoDBTargets":[]},"databaseName":"default","classifiers":[],"schemaChangePolicy":{"updateBehavior":"UPDATE_IN_DATABASE","deleteBehavior":"DEPRECATE_IN_DATABASE"},"state":"READY","crawlElapsedTime":0,"creationTime":1554131558000,"lastUpdated":1554131558000,"version":1}],"crawlersNotFound":[]}}]}  bg.js:5398:13

{"actionResponses":[{"action":"com.amazonaws.console.glue.awssdk.shared.context.AWSGlueContext.getCrawlerMetrics","data":{"crawlerMetricsList":[{"crawlerName":"jdbc-crawler","timeLeftSeconds":0.0,"stillEstimating":false,"lastRuntimeSeconds":0.0,"medianRuntimeSeconds":0.0,"tablesCreated":0,"tablesUpdated":0,"tablesDeleted":0},{"crawlerName":"s3-crawler","timeLeftSeconds":0.0,"stillEstimating":false,"lastRuntimeSeconds":0.0,"medianRuntimeSeconds":0.0,"tablesCreated":0,"tablesUpdated":0,"tablesDeleted":0}]}}]}  bg.js:5398:13

Here's what I get as total output for my two crawlers:

# pip install boto3

import boto3

glue_client = boto3.client('glue', region_name='us-east-1')

response = glue_client.get_security_configurations()
response = glue_client.get_connections()

s3_client = boto3.client('s3', region_name='us-east-1')

response = s3_client.list_buckets()
response = s3_client.list_buckets()
response = glue_client.get_databases()
response = glue_client.create_crawler(
    Name='s3-crawler',
    Role='arn:aws:iam::<redacted>:role/service-role/AWSGlueServiceRole-Glue',
    DatabaseName='default',
    Classifiers=[],
    Schedule='',
    Configuration='{"Version":1}',
    TablePrefix='',
    SchemaChangePolicy={
        'UpdateBehavior': 'UPDATE_IN_DATABASE'
    },
    Targets={
        'S3Targets': [
            {
                'Path': 's3://andlytle-test/',
                'Exclusions': []
            }
        ],
        'JdbcTargets': [],
        'DynamoDBTargets': []
    }
)
response = glue_client.get_classifiers()
response = glue_client.get_classifiers()
response = glue_client.get_security_configurations()
response = glue_client.get_connections()
response = s3_client.list_buckets()
response = s3_client.list_buckets()
response = glue_client.get_databases()
response = glue_client.get_classifiers()

iann0036 commented 5 years ago

Hey Andrew,

Thanks for clarifying the difference between the S3 and JDBC Crawlers. It helped me trace what I believe was the issue to a decodeURIComponent call within the initial data processing. That call broke whenever the payload contained a % symbol, which is likely present in your Include Path for a JDBC connection.
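
For anyone hitting something similar, here is a minimal sketch of the failure mode. It is not the extension's actual code or the actual fix. The payload strings are illustrative fragments modelled on the targets shown in the batchGetCrawlers response above, and safeDecode is a hypothetical helper. The point is only that decodeURIComponent throws a URIError when a % is not followed by two hex digits, so a JDBC include path of % is enough to break the decode step:

```javascript
// Illustrative payload fragments (assumed shape, based on the logged targets).
const s3Payload = '{"s3Targets":[{"path":"s3://andlytle-test/"}]}';
const jdbcPayload = '{"jdbcTargets":[{"connectionName":"test","path":"%"}]}';

// Returns true if decodeURIComponent accepts the string, false on URIError.
function decodes(raw) {
  try {
    decodeURIComponent(raw);
    return true;
  } catch (err) {
    if (err instanceof URIError) {
      return false;
    }
    throw err;
  }
}

// Hypothetical defensive wrapper: fall back to the raw text when the
// string is not valid percent-encoding instead of aborting processing.
function safeDecode(raw) {
  try {
    return decodeURIComponent(raw);
  } catch (err) {
    if (err instanceof URIError) {
      return raw;
    }
    throw err;
  }
}

console.log(decodes(s3Payload));   // true
console.log(decodes(jdbcPayload)); // false
console.log(JSON.parse(safeDecode(jdbcPayload)).jdbcTargets[0].path); // %
```

Falling back to the raw string is just one possible way to handle this; whether decoding is needed at all depends on how the request body is actually encoded.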

I've released a new version, 0.3.24, which should fix the issue.