spe-uob / 2020-HealthcareLake

A reasonably secure data lake for healthcare analytics
MIT License

Service name conflicts when deploying multiple data lakes #138

Closed by vladbucur2000 3 years ago

vladbucur2000 commented 3 years ago

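Services fail to start when more than one data lake is deployed. The cause appears to be hardcoded resource names: the second data lake tries to create IAM roles, IAM policies, Glue catalog databases, a CloudWatch log group, and a Glue trigger under names that already exist. The errors below show which resources are affected.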

Error: Error creating IAM Role AWSGlueServiceRole-CrawlerRole: EntityAlreadyExists: Role with name AWSGlueServiceRole-CrawlerRole already exists.
Error: Error creating IAM Role AWSGlueServiceRole-JobRole: EntityAlreadyExists: Role with name AWSGlueServiceRole-JobRole already exists.
Error: Error creating IAM Role AWSGlueServiceRole-LakeCrawlerRole: EntityAlreadyExists: Role with name AWSGlueServiceRole-LakeCrawlerRole already exists.
Error: error creating IAM policy AWSGlueServiceRole-Crawler: EntityAlreadyExists: A policy called AWSGlueServiceRole-Crawler already exists. Duplicate names are not allowed.
Error: error creating IAM policy AWSGlueServiceRole-Job: EntityAlreadyExists: A policy called AWSGlueServiceRole-Job already exists. Duplicate names are not allowed.
Error: error creating IAM policy FHIR_DynamoDb_Access: EntityAlreadyExists: A policy called FHIR_DynamoDb_Access already exists. Duplicate names are not allowed.
Error: error creating IAM policy Lake_S3_Read: EntityAlreadyExists: A policy called Lake_S3_Read already exists. Duplicate names are not allowed.
Error: error creating IAM policy PySpark_Lib_S3_Access: EntityAlreadyExists: A policy called PySpark_Lib_S3_Access already exists. Duplicate names are not allowed.
Error: Error creating Catalog Database: AlreadyExistsException: Database already exists.
Error: Creating CloudWatch Log Group failed: OperationAbortedException: A conflicting operation is currently in progress against this resource. Please try again. 'FhirDb-glue-logs'
Error: error creating Glue Trigger (FhirDbIngestion): AlreadyExistsException: Workflow with name 'FhirDbIngestion' already Exists
Error: Error creating Catalog Database: AlreadyExistsException: Database already exists.

A single entry in the data_lakes map deploys correctly; the problem appears only when the map contains more than one data lake.
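
One possible way to avoid the collisions, sketched here under assumptions rather than taken from the repository, is to include each data lake's map key in every resource name. In the fragment below, the shape of the data_lakes variable, the resource labels, and the naming pattern are all hypothetical; only the AWSGlueServiceRole-CrawlerRole name and the data_lakes map come from the report above.

```hcl
# Hypothetical sketch: the variable shape and resource labels are assumptions,
# not the project's actual Terraform code.
variable "data_lakes" {
  description = "Map of data lake name to its settings (structure assumed)"
  type        = map(any)
}

resource "aws_iam_role" "crawler" {
  # One role per entry in the map.
  for_each = var.data_lakes

  # Including the map key makes the role name unique per data lake,
  # instead of the fixed "AWSGlueServiceRole-CrawlerRole".
  name = "AWSGlueServiceRole-${each.key}-CrawlerRole"

  # Trust policy that lets the Glue service assume the role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "glue.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_glue_catalog_database" "lake" {
  for_each = var.data_lakes

  # Lowercased key used as a prefix so each lake gets its own database.
  name = "${lower(each.key)}_lake"
}
```

The same pattern would apply to the IAM policies, the CloudWatch log group, and the Glue trigger listed in the errors. IAM role names are limited to 64 characters, so long map keys may need shortening.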