Closed jmahlik closed 3 years ago
Credentials should be masked in the logs to prevent a possible leak. This could be accomplished with a regex replace of the yaml content. I'm happy to submit a PR for this, but wanted to get thoughts because there may be better alternatives to regex. I'm not a fan of regex but it was the best I could come up with since the container needs the creds.
Thanks for the suggestion! Masking the credentials definitely makes sense. It may be possible to use PyYAML for this purpose instead of a regex-- I'd recommend looking into this. Please feel free to submit a PR and we'll be happy to review.
Credentials should be masked in the logs to prevent a possible leak. This could be accomplished with a regex replace of the yaml content. I'm happy to submit a PR for this, but wanted to get thoughts because there may be better alternatives to regex. I'm not a fan of regex but it was the best I could come up with since the container needs the creds.
Thanks for the suggestion! Masking the credentials definitely makes sense. It may be possible to use PyYAML for this purpose instead of a regex-- I'd recommend looking into this. Please feel free to submit a PR and we'll be happy to review.
Excellent suggestion, that is way better. I'll take a stab at it over the weekend.
Think I have something working. The environment turned out to be a list of strings so the simple solution was to set each item in the "environment" list to "[Masked]" for all services (on a separate deepcopy for logging).
One question: where is logging configured for local mode/test suite? I can't seem to track it down. I'm only getting WARNING level logging when running the test suite. When I run jobs in local mode, it returns INFO level logging. Makes using caplog to check the INFO logs a bit hard in the tests.
Any thoughts appreciated. Will submit PR once I can get assert on the info logs.
Describe the feature you'd like Currently, line 673 in
sagemaker/local/image.py
logs AWS creds in the docker compose file. This could cause a credentials leak if it is running in a ci pipeline, during presentations/demos and for live streamers or youtubers creating sagemaker content.I don't feel this is a outright security issue, just unexpected logging that could have adverse impact in certain situations.
One has to explicitly provide fake creds and local data to get around this. This limits the ability to use data from s3 and use other aws services in jobs.
How would this feature be used? Please describe. Credentials should be masked in the logs to prevent a possible leak. This could be accomplished with a regex replace of the yaml content. I'm happy to submit a PR for this, but wanted to get thoughts because there may be better alternatives to regex. I'm not a fan of regex but it was the best I could come up with since the container needs the creds.
Describe alternatives you've considered Currently patching out this logging in a ci pipeline.