aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

CfnCrawler: Missing parameter LakeFormationConfiguration in CDK Construct but present in aws cli #29246

Open lorenzo-necto opened 4 months ago

lorenzo-necto commented 4 months ago

Describe the issue

Hello!

The AWS docs explain how to create a Crawler with Lake Formation credentials via the Console or the AWS CLI, but there is no way to do it via CDK yet.

The AWS CLI command is aws glue create-crawler --cli-input-json '{ ... "LakeFormationConfiguration" : {} ...}'

Here are the reference docs.

Can this param be added to CDK CfnCrawler?

ToDo

Add 'LakeFormationConfiguration' parameter for CfnCrawler

Links

https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_glue.CfnCrawler.html

pahud commented 4 months ago

This is what we have in the CloudFormation spec for CfnCrawler, and it looks like there is no dedicated property for LakeFormationConfiguration. I assume it could probably be configured through Configuration.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html

lorenzo-necto commented 4 months ago

Hey! Thanks for the quick reply! I tried including it in the Configuration param, but no luck: it is rejected as an invalid key for the Configuration dictionary parameter.

Yes! I also tried injecting it at a low level into the CF YAML via code, but I noticed that the CF docs don't list it either.
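For readers unfamiliar with that kind of low-level injection, here is a minimal sketch of the idea on a plain template dict. It only mimics what CDK's escape hatch (addPropertyOverride on a Cfn resource) does to the synthesized template; the helper name add_property_override and the logical ID ProdTestCrawler are hypothetical and chosen for illustration. Note that CloudFormation validates properties against its resource spec, so a property the spec does not list would be expected to fail at deploy time, which matches what is reported in this thread.

```python
import copy


def add_property_override(template, logical_id, path, value):
    """Return a copy of `template` with Properties.<path> set on one resource.

    Hypothetical helper: it imitates the effect of a CDK property override
    on the synthesized CloudFormation template, using plain dicts only.
    """
    result = copy.deepcopy(template)
    node = result["Resources"][logical_id].setdefault("Properties", {})
    keys = path.split(".")
    # Walk (and create) intermediate objects along the dotted path.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return result


# Assumed minimal template; "ProdTestCrawler" is a made-up logical ID.
template = {
    "Resources": {
        "ProdTestCrawler": {
            "Type": "AWS::Glue::Crawler",
            "Properties": {
                "Name": "prod-test-crawler",
                "DatabaseName": "prod-run-db",
            },
        }
    }
}

patched = add_property_override(
    template,
    "ProdTestCrawler",
    "LakeFormationConfiguration.UseLakeFormationCredentials",
    True,
)
```

The override lands as a top-level property of the crawler resource, next to Name and DatabaseName, and the original template is left untouched.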

The GUI does have it, though, and the CLI command exists. For now I am doing it via the GUI, but it would be great to know whether this will be supported via IaC, or whether there is another way to make sure it is enabled.

The CLI command reference is here: https://docs.aws.amazon.com/glue/latest/dg/crawler-configuration.html#crawler-lf-integ

aws glue --profile demo create-crawler --debug --cli-input-json '{
    "Name": "prod-test-crawler",
    "Role": "arn:aws:iam::111122223333:role/service-role/AWSGlueServiceRole-prod-test-run-role",
    "DatabaseName": "prod-run-db",
    "Description": "",
    "Targets": {
        "S3Targets": [
            { "Path": "s3://crawl-testbucket" }
        ]
    },
    "SchemaChangePolicy": {
        "UpdateBehavior": "LOG",
        "DeleteBehavior": "LOG"
    },
    "RecrawlPolicy": {
        "RecrawlBehavior": "CRAWL_EVERYTHING"
    },
    "LineageConfiguration": {
        "CrawlerLineageSettings": "DISABLE"
    },
    "LakeFormationConfiguration": {
        "UseLakeFormationCredentials": true,
        "AccountId": "111122223333"
    },
    "Configuration": {
        "Version": 1.0,
        "CrawlerOutput": {
            "Partitions": { "AddOrUpdateBehavior": "InheritFromTable" },
            "Tables": { "AddOrUpdateBehavior": "MergeNewColumns" }
        },
        "Grouping": { "TableGroupingPolicy": "CombineCompatibleSchemas" }
    },
    "CrawlerSecurityConfiguration": "",
    "Tags": { "KeyName": "" }
}'
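Until the property is available through IaC, the CLI call could be scripted rather than typed by hand. The following is a rough sketch using only the Python standard library: it builds a reduced version of the payload above and the matching argument list for aws glue create-crawler. The function names are hypothetical, the values are the placeholder ones from the example, and the optional fields (Configuration, SchemaChangePolicy, and so on) are left out for brevity. It does not run the command.

```python
import json
import shlex


def build_crawler_input(name, role_arn, database, s3_path, account_id):
    """Return a create-crawler request dict with Lake Formation credentials on."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Top-level field, a sibling of "Configuration" rather than a key in it.
        "LakeFormationConfiguration": {
            "UseLakeFormationCredentials": True,
            "AccountId": account_id,
        },
    }


def build_cli_command(payload):
    """Return the argv list for the CLI call; the payload is passed as JSON."""
    return ["aws", "glue", "create-crawler", "--cli-input-json", json.dumps(payload)]


payload = build_crawler_input(
    name="prod-test-crawler",
    role_arn="arn:aws:iam::111122223333:role/service-role/AWSGlueServiceRole-prod-test-run-role",
    database="prod-run-db",
    s3_path="s3://crawl-testbucket",
    account_id="111122223333",
)
command = build_cli_command(payload)

# Print a shell-quoted version that can be reviewed before running it.
print(shlex.join(command))
```

The argument list could be handed to subprocess.run once it has been reviewed; printing it first keeps the sketch free of side effects.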

These parameters might be quite new. This example also includes a LineageConfiguration param, so both may still be experimental and queued to be added. Not sure.