Open pcgaustad opened 1 year ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This bug happens here as well for APIM 3.15.21
Created a API with URL Rewriting policy and a echo backend.
With URL Rewriting enabled:
printf '\x00\x00\x00\x00\xff\xff\xff\xff' | curl -H "Content-Type:application/octet-stream" --data-binary @- https://<URL_TO_API> --output a.bin
hexdump -C a.bin
00000000 00 00 00 00 ef bf bd ef bf bd ef bf bd ef bf bd |................|
00000010
With URL Rewriting disabled:
printf '\x00\x00\x00\x00\xff\xff\xff\xff' | curl -H "Content-Type:application/octet-stream" --data-binary @- https://<URL_TO_API> --output b.bin
hexdump -C b.bin
00000000 00 00 00 00 ff ff ff ff |........|
00000008
Hello @pcgaustad and @glorytiger, first I would like to apologize for this late answer.
The purpose of the URL rewriting policy is to replace URL strings in headers or in the body by an other URL. Since this replacement mechanism relies on regex, the policy assumes that the body content is a string and does not expect binary content.
If you just want to replace URLs in headers, you can unselect Rewrite HTTP response body
option in the policy configuration.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi @phiz71 and thank you for the response.
I see that we didn't mention that this applies also if the Rewrite HTTP response body
is turned off.
printf '\x00\x00\x00\x00\xff\xff\xff\xff' | curl -H "Content-Type:application/octet-stream" --data-binary @- https://gw-qa.intark.uh-it.no/cn-test/url-rewrite-binary --output a.bin && hexdump -C a.bin
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 24 0 16 100 8 92 46 --:--:-- --:--:-- --:--:-- 139
00000000 00 00 00 00 ef bf bd ef bf bd ef bf bd ef bf bd |................|
00000010
Hi, could you remind me the version of APIM you are using ? Have you tested with the latest versions available ? Latest 3.x is 3.20.32 Latest 4.x is 4.3.1
Hi,
I tested with version 3.20.32 and was unable to reproduce the issue with Rewrite HTTP response body
turned off. I also failed to reproduce it in the environment from yesterday (3.20.22).
This makes me believe that I might have done an error in my last test from yesterday; namely pass the curl command before the test API was updated. It seems it is working like you described in your previous answer.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Describe the bug
Binary data gets mangled when using a URL rewriting or assign content policy. This goes for whatever data the policy applies to, whether it is in the request or response. In this report I describe the bug for a response with a URL rewriting policy.
To Reproduce
Create a new API with a minimal plan and a backend that can respond with binary data. Do not apply any policies.
Send a request which will make the backend return some predefined binary data. In my case
\x00\x00\x00\x00\xff\xff\xff\xff
Save the binary data somehow, e.g. with
curl --output test.bin "https://example.com"
Compare the binary data in the response with the predefined data, e.g. with hexdump, and observe that it equals the predefined data.
Apply a URL rewrite policy for header and body, as simple as it gets, as in the image below.
Expected behaviour
The binary data in the response equals the predefined binary data, that is
\x00\x00\x00\x00\xff\xff\xff\xff
.Current behaviour
The binary data in the response is
\x00\x00\x00\x00\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd
. The\xff
get turned into 'REPLACEMENT CHARACTER' (U+FFFD).Useful information
I have also verified that binary data don't get mangled when the client communicates directly with the endoint.
I have not observed any problems with data in utf-8 encoding.
I have only been able to do tests with a SOAP API. As the HTTP headers seem pretty standard I assume the bug is unrelated to the protocol.
cURL verbose output:
< HTTP/2 200 < cache-control: private < content-type: multipart/related; type="application/xop+xml";start="<http://tempuri.org/0>"; boundary="uuid:807159d0-9955-43f4-9c6a-2e83d5d5a3ec+id=486";start-info="text/xml" < date: Tue, 11 Jul 2023 11:09:13 GMT < mime-version: 1.0 < server: Microsoft-IIS < set-cookie: ASP.NET_SessionId=s3g1xlx1czihcxjochdqujxi; path=/; HttpOnly; SameSite=Lax; Sec ure < x-gravitee-request-id: 8ce96063-3290-4767-a960-633290c76764 < x-gravitee-transaction-id: 8ce96063-3290-4767-a960-633290c76764 < x-content-type-options: nosniff
Response data:
Content-ID: <http://tempuri.org/0> Content-Transfer-Encoding: 8bit Content-Type: application/xop+xml;charset=utf-8;type="text/xml"
Content-ID: <http://tempuri.org/1/638246777541059478> Content-Transfer-Encoding: binary Content-Type: application/octet-stream
[binary data]
Environment
APIM 3.15.21 (due to #9026)