Closed dlqqq closed 9 months ago
The *.env.out files allow the environment to be reproduced without using docker, as stated in the README.
Amazon SageMaker Distribution supports full reproducibility of Conda environments, so you don't necessarily need to use Docker. Just find the version number you want to use in the build_artifacts directory, open one of cpu.env.out or gpu.env.out and follow the instructions in the first 2 lines.
SageMaker Studio Lab currently consumes those files for this purpose. Other SageMaker services may follow this route, such as Notebook Instances since its environment is not container based.
@dylanraws Thanks! This explains the current precedent of only updating the env.out
files on release. Then this issue is still valid, since we only want to commit changes manually & explicitly immediately prior to release.
Verify that
*.env.out
files can change on rebuild even when no new packages are added to*.env.in
, and verify that they're not being used for any purpose other than to track what packages went into a release
This is true.
IMO the *.env.out
files should be generated once as the last step of the image build process. Now, if they are generated before or not - that shouldn't be a problem as long as they are reliably overridden at the end.
So, I am not sure why we need such a change at all.
An alternative could be we stop generating env.out
files by default. Plus, we add a new parameter to the build
command, that explicitly generates this file, and only the final build step is the one that leverages this parameter.
env.out doesn't change if you re-build. It might change only if you use the --force flag.
https://github.com/aws/sagemaker-distribution/pull/57.
When we do our releases , we don't use the --force flag by default.
To my knowledge,
*.env.out
files can change on rebuild even if no code changes were made.With this, I propose that our next steps are to:
*.env.out
files can change on rebuild even when no new packages are added to*.env.in
, and verify that they're not being used for any purpose other than to track what packages went into a release*.env.out
to.gitignore