This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License
4.61k
stars
2.82k
forks
source link
[engsys] Batch Sanitizers have the potential to mangle JSON responses and make them invalid #35395
35196 introduced a collection of "global" sanitizers that scrub secrets from recordings as they are written to disk.
The sanitization is Regex based and not "file format aware". Some of the sanitizers as defined have the potential to leave responses in a corrupt state (e.g. as invalid json).
It "eats" the trailing double quote (") that terminates the JSON string, which leaves the response as invalid JSON
It "eats" the leading ampersand (&), which corrupts the preceding query parameter.
except ValueError as err:
> raise DecodeError(
message="JSON is invalid: {}".format(err),
response=response,
error=err,
E azure.core.exceptions.DecodeError: JSON is invalid: Invalid control character at: line 3 column 78 (char 103)
E Content: {
E "secretsType": "Sas",
E "sasToken": "sv=2021-10-04&si=azureml-system-datastore-policy&sr=cSanitized
E }
../../../../venv/lib/python3.8/site-packages/azure/core/pipeline/policies/_universal.py:616: DecodeError
Screenshots
If applicable, add screenshots to help explain your problem.
Additional context
ML team has a regex for query parameters that might be helpful in finding a fix
Describe the bug
35196 introduced a collection of "global" sanitizers that scrub secrets from recordings as they are written to disk.
The sanitization is Regex based and not "file format aware". Some of the sanitizers as defined have the potential to leave responses in a corrupt state (e.g. as invalid json).
The sanitizers for query strings were the source of my issue, but this might be an issue with other sanitizers too: https://github.com/Azure/azure-sdk-for-python/blob/434a248f351284ac839a25f6ba582cbbb999b45a/tools/azure-sdk-tools/devtools_testutils/proxy_startup.py#L394-L400
To Reproduce Steps to reproduce the behavior:
&sig=
query param)Observe that your test passes when run in live mode (with the following environment variable set
AZURE_TEST_RUN_LIVE=true
).Try to replay the test in recording mode (with the following environment variable set
AZURE_TEST_RUN_LIVE=false
).Expected behavior
The test should pass, using the sanitized responses from the record.
Actual Behavior
This sanitizer mangles the JSON response:
https://github.com/Azure/azure-sdk-for-python/blob/434a248f351284ac839a25f6ba582cbbb999b45a/tools/azure-sdk-tools/devtools_testutils/proxy_startup.py#L397
"
) that terminates the JSON string, which leaves the response as invalid JSON&
), which corrupts the preceding query parameter.Screenshots If applicable, add screenshots to help explain your problem.
Additional context
ML team has a regex for query parameters that might be helpful in finding a fix
https://github.com/Azure/azure-sdk-for-python/blob/434a248f351284ac839a25f6ba582cbbb999b45a/sdk/ml/azure-ai-ml/tests/conftest.py#L72-L87