Open jaketothepast opened 2 months ago
If this is actually a problem, and not just my own ignorance, I'm happy to contribute.
I'm experiencing the same issue using XGBoost and `sagemaker[local]==2.222.0`.
A code snippet extracted from this example: https://github.com/aws-samples/amazon-sagemaker-local-mode/blob/main/xgboost_script_mode_local_training_and_serving/xgboost_script_mode_local_training_and_serving.py
```python
from sagemaker import TrainingInput
from sagemaker.xgboost import XGBoost, XGBoostModel
from sagemaker.local import LocalSession

DUMMY_IAM_ROLE = 'arn:aws:iam::111111111111:role/service-role/AmazonSageMaker-ExecutionRole-20200101T000001'

LOCAL_SESSION = LocalSession()
LOCAL_SESSION.config = {'local': {'local_code': True}}  # Ensure full code locality, see: https://sagemaker.readthedocs.io/en/stable/overview.html#local-mode

FRAMEWORK_VERSION = "1.7-1"


def main():
    xgb_inference_model = XGBoostModel(
        model_data="./tests/resources/models/fake/model.json",
        role=DUMMY_IAM_ROLE,
        entry_point="inference.py",
        source_dir="./src",
        framework_version=FRAMEWORK_VERSION,
        sagemaker_session=LOCAL_SESSION,
    )

    print('Deploying endpoint in local mode')
    predictor = xgb_inference_model.deploy(
        initial_instance_count=1,
        instance_type="local",
    )


def test_inference_endpoint():
    main()
```
Expected behaviour: the model is repacked locally. The docs imply you can "keep everything local, and not use Amazon S3 either" by enabling "local code".
Actual behaviour: Failure when creating an S3 bucket during the deploy job.
The error:

```
operation_name = 'CreateBucket'
api_params = {'Bucket': 'sagemaker-eu-west-1-xxxxxxxxxxxx', 'CreateBucketConfiguration': {'LocationConstraint': 'eu-west-1'}}
```
My next steps: I'll give it the details of a bucket I have secure access to, and see what it actually tries to do with said bucket.
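Another workaround worth trying before that: patch `default_bucket()` on the session instance so nothing ever reaches the S3 API. A minimal sketch of the pattern, using a hypothetical `FakeSession` stand-in rather than the real `sagemaker.local.LocalSession` (so it runs without AWS credentials):

```python
# FakeSession is a hypothetical stand-in for sagemaker.local.LocalSession,
# used only to illustrate instance-level monkeypatching; the real class
# and its behaviour differ.
class FakeSession:
    def default_bucket(self):
        # In the real SDK this path ends up issuing a CreateBucket call.
        raise RuntimeError("would call the S3 CreateBucket API")

session = FakeSession()
# Patch the instance: return a fixed bucket name instead of touching S3.
session.default_bucket = lambda: "local-dummy-bucket"
print(session.default_bucket())
```

Whether the SDK tolerates a bucket name that doesn't actually exist is exactly what I'd be testing here.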
Describe the bug
When deploying a HuggingFace model with model data on disk, sagemaker SDK still tries to access the AWS API to determine the Sagemaker default bucket. I don't currently have access to my AWS credentials, so this is failing with access denied errors. I would like to be able to run locally without needing S3 access, and this is blocking me from doing so.
To reproduce
In an environment without AWS access (no credentials set up), run the deployment snippet above.
Maybe I still need access to AWS via the HF inference toolkit pulling the correct ECR image? However, that doesn't seem to be where it's failing.
Expected behavior
Local mode deploys an endpoint into a Docker container.
System information

```toml
[tool.poetry.dependencies]
python = "^3.12"
sagemaker = {extras = ["local"], version = "^2.217.0"}
jupyterlab = "^4.1.8"
setuptools = "^69.5.1"
```
Additional context
It looks like, regardless of the local code setting in `~/.sagemaker/config.yaml`, the SDK will still try to access the S3 API to determine the default bucket name for SageMaker. From https://github.com/aws/sagemaker-python-sdk/blob/b17d332a5e4542d57d2039d08b124edc6042f9fb/src/sagemaker/model.py#L694: within `determine_bucket_and_prefix`, `sagemaker_session.default_bucket()` is called. In my environment, this still fails even with a `LocalSession` object. Why is a `LocalSession` still trying to access S3?
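For reference, if I'm reading the local-mode docs linked earlier correctly, the `~/.sagemaker/config.yaml` setting in question looks like this:

```yaml
local:
  local_code: true
```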