Open rubencg195 opened 1 year ago
Hi team, any updates on this? Thanks.
Hi @rubencg195, as you can see in the AWS docs for the SageMaker domain (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html), there is no option to reference an existing domain. You can retain the implicitly created file system using https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sagemaker_domain#retention_policy (not sure how the data behaves, but as noted above this is on the AWS API side and the provider doesn't explicitly delete anything).
But still, when one recreates the domain, it cannot use an existing EFS file system. This is an AWS limitation.
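Since a recreated domain cannot be pointed back at the old file system, one possible guard is to make Terraform refuse any plan that would destroy the domain. The fragment below is only a sketch: `lifecycle.prevent_destroy` is a standard Terraform meta-argument, but the resource name `example` is hypothetical and the other domain arguments are left out.

```hcl
# Sketch only: add this lifecycle block to your existing domain resource.
# "example" is a placeholder name, not taken from this issue.
resource "aws_sagemaker_domain" "example" {
  # ... existing domain arguments stay unchanged ...

  lifecycle {
    # Terraform rejects with an error any plan that would destroy this
    # resource, which includes destroy-and-recreate replacements.
    prevent_destroy = true
  }
}
```

This does not stop the change from requiring a replacement. It only turns a silent recreation into a plan error, so the EFS data can be backed up first.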
I think to fix this problem you need to add the following to your Terraform code:
retention_policy {
  home_efs_file_system = "Retain"
}
The default value does not seem to take effect in Terraform, so you need to set it explicitly.
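For context, here is a minimal sketch of where that block sits inside an `aws_sagemaker_domain` resource. The resource name, domain name, and variables are hypothetical placeholders. The arguments and the exported `home_efs_file_system_id` attribute follow the provider documentation linked above, but check them against the provider version you use.

```hcl
# Hypothetical inputs; replace with your own values.
variable "vpc_id" {}
variable "subnet_ids" { type = list(string) }
variable "execution_role_arn" {}

resource "aws_sagemaker_domain" "example" {
  domain_name = "example-domain" # placeholder name
  auth_mode   = "IAM"
  vpc_id      = var.vpc_id
  subnet_ids  = var.subnet_ids

  default_user_settings {
    execution_role = var.execution_role_arn
  }

  # Keep the implicitly created EFS file system when the domain is deleted.
  retention_policy {
    home_efs_file_system = "Retain"
  }
}

# Record the EFS ID so the retained file system can be found later.
output "sagemaker_home_efs_id" {
  value = aws_sagemaker_domain.example.home_efs_file_system_id
}
```

As noted above, retaining the file system does not let a recreated domain reuse it. The output only makes it easier to locate the old EFS when migrating data.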
Terraform Core Version
0.13.7
AWS Provider Version
4.53.0
Affected Resource(s)
Whenever we update aws_sagemaker_domain and aws_sagemaker_user_profile, whether it is a networking change such as updating the security group rules or another change such as updating the Jupyter image ARN, Terraform recreates the domain. The domain loses access to the previous EFS file system, a new one is created, and the existing files in the previous EFS become inaccessible. The AWS representatives said they can't do anything about it and recommended reporting the issue to HashiCorp. In the meantime, they suggested using their guides to back up the EFS data to S3 and using EC2 instances to mount both EFS file systems and move the data from the old one to the new one. That is a lot of manual work, and it could easily be avoided by adding an option to aws_sagemaker_domain and aws_sagemaker_user_profile to specify an existing EFS ID instead of creating a new one.
Expected Behavior
The domain should keep a reference to the existing EFS file system, not create a new one, and not lose the reference to the files that appear on the SageMaker Studio filesystem. Please add an option to the aws_sagemaker_domain and aws_sagemaker_user_profile resources to specify an existing EFS ID instead of creating a new one.
Actual Behavior
The domain is recreated, reference to the existing EFS with the files is lost, and the files in the SageMaker Studio filesystem are wiped.
Relevant Error/Panic Output Snippet
Terraform Configuration Files
Steps to Reproduce
A basic example of a change that triggers the recreation of the domain and user profile is updating the Jupyter version or changing the security group rules.
Before
After
Debug Output
N/A. Files just do not appear on the SageMaker filesystem after an update.
Panic Output
N/A. Files just do not appear on the SageMaker filesystem after an update.
Important Factoids
Please add an option to the aws_sagemaker_domain and aws_sagemaker_user_profile resources to specify an existing EFS ID instead of creating a new one.
References
No response
Would you like to implement a fix?
None