Closed mnschmit closed 1 year ago
@mnschmit Thanks for surfacing this issue and the detailed analysis. I'm looking into this now and will try the fix you suggested. Will keep you posted.
Great, thank you, @fa9r. It would be great if there was a direct fix for all AWS-related tagging in any (possibly future) components.
Nice your proposed fix worked like a charm. I included it in #1799 and will make sure it gets merged before the next ZenML release.
IMO our current tag representation (simple dict) is much more intuitive than the representation that sagemaker expect. The tags are also not used in any other place, so I think it is fine to only adjust the format right before submitting the pipeline. The sagemaker step operator doesn't even support tags atm, so nothing to be done there. Thus, your fix was exactly what needed to be changed, thanks again for the hint :)
Cool, thanks for the update :+1: :slightly_smiling_face:
Contact Details [Optional]
martin.schmitt@celebrate.company
System Information
What happened?
I specified tags in my Sagemaker orchestrator, i.e., key-value pairs we can use in AWS to describe resources, e.g., for billing purposes. Once I did that, pipeline runs were failing with the following error:
(squad is one of my custom tags; see config of Sagemaker orchestrator above)
I figured out that the underlying
sagemaker.processing.Processor
expects itstags
parameter (list[dict[str,str]]
) in the following format:tags=[{"Key": "Project", "Value": "MyProject"}, {"Key": "Environment", "Value": "Development"}]
, i.e., with a dedicateddict
per tag with the fixed keysKey
(uppercase!) andValue
whereas the current ZenML implementation passes a list with a single dictionary containing all the tags like this{'key1': 'value1', 'key2': 'value2'}
.A quick fix for this would be to change this line to
That fixed it immediately for me. But I still open this issue instead of a PR because I am unsure whether the right format should already be stored before this line is executed, i.e., at some other place, and we should not only fix it at the time when the tags are submitted to create a pipeline with the Sagemaker orchestrator. I am not sure, but it is possible that a similar issue exists with the Sagemaker step operator. So it might be favorable to already store it in the ZenML server in the format that is expected by the Sagemaker API.
Reproduction steps
Relevant log output
No response
Code of Conduct