data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Create and import data set stack failure #423

Closed mvidhu closed 1 year ago

mvidhu commented 1 year ago

Describe the bug

Dataset creation and import stacks have been failing in CloudFormation and rolling back for the past few days. The issue started recently, and the data.all version in our organization has not changed in the past 5 months, so we are not able to find the root cause. The CloudFormation stack fails when creating the crawler, with this error message:

S3 bucket dataall-<> does not exist. (Service: AWSGlue; Status Code: 400; Error Code: InvalidInputException; Request ID: dedb0351-c13a-4f6e-b84d-dea3cf0db051; Proxy: null)

We are using cdk version 14.
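For anyone triaging similar rollbacks, the parenthesized detail block in the crawler error above has a regular `Key: value; Key: value` shape, so it can be split into fields and compared across failed stacks. The following is a minimal sketch only: the function name `parse_aws_error_details` is hypothetical, it is not part of data.all, and it assumes the message contains a block starting with `Service:` as in the error quoted above.

```python
import re


def parse_aws_error_details(message: str) -> dict[str, str]:
    """Split the trailing '(Service: ...; Error Code: ...)' block into a dict.

    Hypothetical helper for triage; returns an empty dict when the
    message has no block beginning with 'Service:'.
    """
    # Capture everything between '(' and ')' for the block that starts with 'Service:'.
    match = re.search(r"\((Service: [^)]*)\)", message)
    if match is None:
        return {}

    details: dict[str, str] = {}
    for part in match.group(1).split(";"):
        # Each part looks like 'Key: value'; skip anything that does not.
        key, sep, value = part.strip().partition(": ")
        if sep:
            details[key] = value
    return details


if __name__ == "__main__":
    # Error text copied from the report above.
    reason = (
        "S3 bucket dataall-<> does not exist. (Service: AWSGlue; Status Code: 400; "
        "Error Code: InvalidInputException; "
        "Request ID: dedb0351-c13a-4f6e-b84d-dea3cf0db051; Proxy: null)"
    )
    print(parse_aws_error_details(reason))
```

The extracted `Error Code` and `Request ID` values are usually the most useful fields to include when reporting the failure or searching logs.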

Error in CloudFormation while importing a dataset (failure creating dataallDatasetDatabase):

CREATE_FAILED Received response status [FAILED] from custom resource. Message returned: Error: Could not create Glue Database dataall_<> in aws://<>/<>, received An error occurred (AccessDeniedException) when calling the CreateDatabase operation: Insufficient Lake Formation permission(s) on s3://<>/
Logs: /aws/lambda/dataall-gluedb-handler-m6up1tqu
    at invokeUserFunction (/var/task/framework.js:2:6)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async onEvent (/var/task/framework.js:1:302)
    at async Runtime.handler (/var/task/cfn-response.js:1:1474)
(RequestId: 58ef82a8-80a8-41f3-b633-28c71896598c)

How to Reproduce

1. Bootstrap an AWS account as an environment in data.all.
2. Create a dataset in the environment. Stack creation fails and ends in the ROLLBACK_COMPLETE state.
3. Import an existing bucket into the environment. Stack creation ends in the ROLLBACK_FAILED state.

Expected behavior

Dataset creation and import should both succeed.

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.11

AWS data.all version

0.5.0

Additional context

No response

dlpzx commented 1 year ago

Hi @mvidhu :) Thanks for opening the issue. If I understand correctly you actually have 2 issues:

I hope this helps, please comment here if you still face issues :)

dlpzx commented 1 year ago

Closing due to inactivity. Re-open if needed.