Open mouhannadali opened 2 years ago
I have the same issue when registering the model using SageMaker Pipelines.
Any update here?
Environment: SageMaker Studio
Framework: SageMaker LinearLearner
Framework Version: latest as of 2/22/2023
Python Version: 3.7.10
CPU or GPU: CPU
Python SDK Version: 2.131.0
Are you using a custom image: No
I'm seeing the same thing. To create the model, I'm using some generic code:
estimator = Estimator(
    image_uri=image_uri,
    role=role,
    output_path=output_path,
    sagemaker_session=sagemaker_session,
    instance_type=instance_type,
    instance_count=instance_count,
    enable_sagemaker_metrics=True,
    volume_kms_key=use_case_kms_key,
    output_kms_key=use_case_kms_key,
    subnets=subnets,
    security_group_ids=security_group,
    enable_network_isolation=enable_network_isolation,
    encrypt_inter_container_traffic=encrypt_inter_container_traffic,
    tags=tags,
)
estimator.set_hyperparameters(
    epochs=epochs,
    l1=l1,
    learning_rate=learning_rate,
    predictor_type=predictor_type,
)
Experiment.load(experiment_name=experiment_name)
linear_trial = Trial.create(
    trial_name=trial_name,
    experiment_name=experiment_name,
    sagemaker_boto_client=sm_client,
    tags=tags,
)
estimator.fit(
    inputs={
        'train': train_input,
        'validation': validation_input,
        'test': test_input,
    },
    job_name=base_job_name + '-' + mlops_id,
    experiment_config={
        'TrialName': linear_trial.trial_name,
        'TrialComponentDisplayName': 'training',
    },
    wait=True,
    logs=False,
)
Then the path to the model is saved:
model_uri = f'{output_path}/{estimator.latest_training_job.job_name}/output/model.tar.gz'
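Since the path above is assembled by hand, one thing worth ruling out is a malformed URI: if `output_path` ends with a slash, the f-string produces a doubled `/`, and S3 treats that as a different key. The helper below is only a sketch of a safer join; `build_model_uri` is a hypothetical name, not part of the SageMaker SDK. After `fit()` completes, `estimator.model_data` also reports the artifact location and can be compared against the hand-built value.

```python
def build_model_uri(output_path: str, job_name: str) -> str:
    """Return the S3 URI where the training job is expected to write model.tar.gz.

    Hypothetical helper: mirrors the f-string in the comment above, but strips
    a trailing slash from output_path so the result never contains '//'
    between the prefix and the job name.
    """
    return f"{output_path.rstrip('/')}/{job_name}/output/model.tar.gz"
```

For example, `build_model_uri(output_path, estimator.latest_training_job.job_name)` could replace the f-string directly.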
Then to register this model I run this code:
response = sm_client.create_model_package(
    ModelPackageGroupName=model_package_group_name,
    ModelPackageDescription='Model registration testing',
    ModelApprovalStatus='PendingManualApproval',
    InferenceSpecification={
        'Containers': [
            {
                'Image': image_uri,
                'ModelDataUrl': model_uri,
                'NearestModelName': model_name,
            },
        ],
        'SupportedTransformInstanceTypes': [inference_instance_type],
        'SupportedContentTypes': ['text/csv'],
        'SupportedResponseMIMETypes': ['text/csv'],
    },
    CustomerMetadataProperties={
        'train': training_path,
        'validation': validation_path,
        'test': testing_path,
        'experiment_name': experiment_name,
    },
)
The model is registered successfully and its version increments, but I get the same error as others.
Is this a bug or are we misusing the Model Registry?
After playing around quite a bit, I found that registering the model through boto3, as I did in my comment above, did not automatically link the TrialComponent. When I registered the model through the SageMaker SDK instead, I was able to see the TrialComponent link:
model_package = linear_learner.register(
    model_package_group_name=model_package_group_name,
    model_name='linear-learner',
    image_uri=image_uri,
    transform_instances=[instance_type],
    content_types=['text/csv'],
    response_types=['text/csv'],
    approval_status='PendingManualApproval',
    customer_metadata_properties={
        'train': training_path,
        'validation': validation_path,
        'test': testing_path,
        'experiment_name': experiment_name,
    },
)
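To tell a missing trial component apart from a missing link in the Studio UI, it may help to query the trial components directly by training job. The sketch below assumes a boto3 SageMaker client is passed in as `sm_client`; the function name is hypothetical, and only the first page of results is read.

```python
def trial_components_for_training_job(sm_client, training_job_name):
    """List (name, arn) pairs of trial components sourced from a training job.

    Hypothetical diagnostic helper. sm_client is assumed to be a boto3
    SageMaker client, e.g. boto3.client("sagemaker"). Pagination is ignored,
    so only the first page of results is returned.
    """
    # Resolve the job name to its ARN, which is what trial components record as their source.
    job = sm_client.describe_training_job(TrainingJobName=training_job_name)
    response = sm_client.list_trial_components(SourceArn=job["TrainingJobArn"])
    return [
        (summary["TrialComponentName"], summary["TrialComponentArn"])
        for summary in response.get("TrialComponentSummaries", [])
    ]
```

If this returns an entry for the job while the registry page still shows no link, the trial component itself exists and the problem is more likely in how the model package is associated with it.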
Describe the bug
After a model is trained and registered, I navigate to the model registry and select the model group name -> model version -> settings. The "Trial Component" row shows "Failed to retrieve model package details". This issue appears only for approved model versions.
To Reproduce
Steps to reproduce the behavior: train and register a model, then approve the model.
Expected behavior
A link to the corresponding "Trial Component" should be shown.
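Because the error is reported only for approved versions, the approval step is the part of the reproduction worth scripting. A minimal sketch, assuming a boto3 SageMaker client is passed in as `sm_client` and the ARN of an already registered model package is known; `approve_model_package` is a hypothetical helper name.

```python
def approve_model_package(sm_client, model_package_arn):
    """Approve a registered model package and return its resulting approval status.

    Hypothetical helper. sm_client is assumed to be a boto3 SageMaker client,
    e.g. boto3.client("sagemaker").
    """
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    # Read the package back so the caller can confirm the status change took effect.
    details = sm_client.describe_model_package(ModelPackageName=model_package_arn)
    return details["ModelApprovalStatus"]
```

After the status changes to `Approved`, reopening the model version's settings page is where the "Failed to retrieve model package details" message would be expected to appear.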