setu4993 closed this issue 11 months ago.
This is a feature I'm looking for as well.
I waited for this for a while, and reached out to our AWS reps multiple times over the years, but it was clear that this wasn't a priority for the SageMaker team.
A couple of quarters ago, we implemented a workaround for this that is working nicely for us. We have a tiny wrapper on top of the SageMaker Python SDK to package models, that:

- Downloads the dependencies from the `requirements.txt` file into a local directory (`<package_directory>`).
- Sets the `PIP_FIND_LINKS` environment variable to the folder in which it'd be available within the container (`/opt/ml/model/code/<package_directory>`).
- Calls the `.deploy(...)` method normally.

We do this all via CI, but it's doable even outside it, with a tiny step before invoking the SageMaker Python SDK.
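A hedged sketch of what those wrapper steps might look like (the actual wrapper isn't shown in this thread; the function names and the `packages` directory name are made up for illustration):

```python
import subprocess

PACKAGE_DIR = "packages"  # stands in for <package_directory> above


def pip_download_command(requirements="requirements.txt", dest=PACKAGE_DIR):
    """Command that vendors every dependency (wheels/sdists) into a local
    directory, which then gets bundled into the model archive under code/."""
    return ["pip", "download", "-r", requirements, "-d", dest]


def serving_environment(package_dir=PACKAGE_DIR):
    """Environment for the serving container: PIP_FIND_LINKS makes pip
    resolve requirements.txt from the bundled directory at start-up."""
    return {"PIP_FIND_LINKS": f"/opt/ml/model/code/{package_dir}"}


def vendor_dependencies():
    # Run locally (or in CI) before packaging the model artifact.
    subprocess.run(pip_download_command(), check=True)
```

The environment dict would then be passed when constructing the model, e.g. `Model(..., env=serving_environment())`, before calling `.deploy(...)` as usual.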
Hope that helps.
This is interesting. I wonder, though: why haven't you simply done this, using the code snippets above, in your entry point script?
Mostly because we didn't want to maintain our own copy of either a forked version of this package or repackaged Docker base images.
That adds a bunch of overhead for us across various package families and versions.
Describe the feature you'd like
We'd like the ability to install internal Python packages via CodeArtifact instead of just PyPI.
How would this feature be used? Please describe.
To install internal Python packages that cannot be published publicly to PyPI, in SageMaker serving instances. Adding support for CodeArtifact would integrate SageMaker better with other AWS services.
CodeArtifact provides a 12-hour token, so if we create credentials and pass them in during model package creation, they'd likely expire before the endpoint is refreshed in the future, or before a new batch transform job runs more than 12 hours after model package creation.
(This applies more to inference jobs like endpoints and batch transforms because dependencies get installed at run-time, not build time.)
This is not as much of a concern for SageMaker training jobs, since we can pass credentials and jobs start almost immediately (though it would probably still be an issue for spot-instance jobs with a >12-hour wait time). But our use case is specifically for inference-related services.
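The timing concern above can be made concrete: with the 12-hour maximum token lifetime, any job that starts more than 12 hours after the credentials were minted will fail to authenticate. A minimal check:

```python
from datetime import datetime, timedelta

MAX_TOKEN_LIFETIME = timedelta(hours=12)  # CodeArtifact's maximum token duration


def token_still_valid(issued_at: datetime, needed_at: datetime) -> bool:
    """True if a token minted at issued_at is still usable at needed_at."""
    return needed_at - issued_at < MAX_TOKEN_LIFETIME


created = datetime(2023, 1, 1, 0, 0)
# A token created at model-package creation works for an immediate job...
assert token_still_valid(created, created + timedelta(hours=1))
# ...but not for a batch transform that runs the next day.
assert not token_still_valid(created, created + timedelta(hours=24))
```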
A solution could be to add an AWS CodeArtifact login step before `_install_requirements` here, and to add it again before line 79.
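As a hedged sketch of such a login step (the snippets from the original comment aren't reproduced here; the domain, owner, and repository values are placeholders that would come from the model's environment variables), the container could shell out to the AWS CLI just before requirements get installed:

```python
import subprocess


def codeartifact_login_command(domain="my-domain",
                               domain_owner="111122223333",
                               repository="my-repo"):
    """AWS CLI invocation that points pip at a CodeArtifact repository and
    fetches a fresh 12-hour authorization token at container start-up."""
    return [
        "aws", "codeartifact", "login",
        "--tool", "pip",
        "--domain", domain,
        "--domain-owner", domain_owner,
        "--repository", repository,
    ]


def login_to_codeartifact():
    # Run just before _install_requirements so pip authenticates correctly;
    # because this happens at run time, the token is always fresh.
    subprocess.run(codeartifact_login_command(), check=True)
```

Running the login at container start-up, rather than at model-package creation, is what sidesteps the 12-hour expiry problem described above.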
Describe alternatives you've considered
Currently, private packages can either be served via an external service like Artifactory / Gemfury (by adding `--extra-index-url <URL>` to `requirements.txt`), or by relative imports and dependency injection during packaging.

Another alternative we've considered is forking the repo, adding the above-mentioned changes in a private fork, and using that for our SageMaker model deploys.
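For illustration, the `--extra-index-url` alternative amounts to a `requirements.txt` along these lines (the index URL and package name here are hypothetical):

```text
--extra-index-url https://example.jfrog.io/artifactory/api/pypi/internal-pypi/simple
internal-package==1.2.3
```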
Additional context
I'm happy to take a stab at implementing this if there's interest.