aws / sagemaker-training-toolkit

Train machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker.
Apache License 2.0

fix: handle utf-8 decoding exceptions while processing std streams #136

Closed: vishwakaria closed this pull request 2 years ago

vishwakaria commented 2 years ago

Issue #, if available: During training, if the user script tries to write non-UTF-8 characters to CloudWatch, decoding fails with an exception that is not handled anywhere, which causes training to fail.
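A minimal sketch of the failure mode, not the toolkit's actual code: `raw_line` and `decode_strict` are hypothetical names standing in for a line read from the user script's output stream and for the decoding step. With Python's default `strict` error handler, a single invalid byte is enough to raise.

```python
# Hypothetical stand-in for bytes read from the training process's stdout.
# 0xFF and 0xFE can never appear in valid UTF-8.
raw_line = b"loss=0.42 \xff\xfe done\n"


def decode_strict(data: bytes) -> str:
    """Decode with the default "strict" handler, which raises on invalid bytes."""
    return data.decode("utf-8")


try:
    decode_strict(raw_line)
    failed = False
except UnicodeDecodeError as exc:
    # If nothing catches this, the error propagates and the caller fails.
    failed = True
    print(f"decoding failed: {exc}")
```

If an exception like this is not caught where the stream is processed, it propagates upward, which matches the behavior described in this issue.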

Description of changes: Add a parameter to the UTF-8 decoding call so that characters that cannot be decoded are replaced with � (U+FFFD, the official REPLACEMENT CHARACTER) instead of raising an error. Refer: https://docs.python.org/3.9/library/codecs.html#error-handlers
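As a rough illustration of the approach, assuming the parameter in question is Python's built-in `replace` error handler from the linked documentation (the helper name `decode_lossy` and the sample bytes are hypothetical, not taken from the toolkit):

```python
def decode_lossy(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for bytes that cannot be decoded."""
    # errors="replace" is the standard codecs error handler; on decoding it
    # emits U+FFFD instead of raising UnicodeDecodeError.
    return data.decode("utf-8", errors="replace")


# Hypothetical log line containing two invalid bytes.
raw_line = b"loss=0.42 \xff\xfe done\n"
text = decode_lossy(raw_line)

# Valid UTF-8 passes through unchanged.
accented = decode_lossy("caf\u00e9".encode("utf-8"))
```

The trade-off is that the original invalid bytes are lost from the log output, but the rest of the line survives and decoding no longer raises.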

Testing done:

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

Tests

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

sagemaker-bot commented 2 years ago

AWS CodeBuild CI Report
