gabiSRC closed this issue 6 years ago
It looks like you are correct about this. If you'd like to provide a PR for this, I'd be happy to merge it in for you.
I don't think that
if (str.getBytes("UTF-8").length != str.getBytes().length)
is the correct way to check for valid characters in the device ID, since str.getBytes()
always uses the platform default encoding, which may differ from machine to machine. According to the documentation, a device ID is a case-sensitive string (up to 128 characters long) of ASCII 7-bit alphanumeric characters plus certain special characters: - . + % _ # * ? ! ( ) , = @ $ '. That means only 1 byte per character, and that is what the original check in the source code tests. Right? By the way, on my machine the provided source code snippet works correctly. Which device IDs have you registered on your IoT Hub?
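To make both points concrete, here is a minimal sketch. DeviceIdCheck and its methods are hypothetical names, not SDK API. The character set in the pattern is copied from the list quoted above, and the platform default encoding is passed in as a parameter so that different machines can be simulated.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the SDK.
final class DeviceIdCheck {
    // ASCII letters and digits plus the special characters quoted above, 1 to 128 characters.
    private static final Pattern DEVICE_ID =
            Pattern.compile("[A-Za-z0-9\\-.+%_#*?!(),=@$']{1,128}");

    static boolean isValidDeviceId(String id) {
        return id != null && DEVICE_ID.matcher(id).matches();
    }

    // The proposed comparison, with the platform default passed in explicitly
    // so that machines with different defaults can be simulated.
    static boolean proposedCheckFlags(String s, Charset platformDefault) {
        return s.getBytes(StandardCharsets.UTF_8).length != s.getBytes(platformDefault).length;
    }

    public static void main(String[] args) {
        String sample = "\u00e9"; // e with acute accent, written as an escape
        // Machine whose default encoding is UTF-8: the two lengths always match.
        System.out.println(proposedCheckFlags(sample, StandardCharsets.UTF_8));
        // Machine whose default encoding is Latin-1: the same input is flagged.
        System.out.println(proposedCheckFlags(sample, StandardCharsets.ISO_8859_1));
        System.out.println(isValidDeviceId("device-01"));
        System.out.println(isValidDeviceId(sample));
    }
}
```

With a UTF-8 default the proposed comparison can never fail, while with a single-byte default it flags any non-ASCII character, so its result depends on the machine rather than on the input.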
I'm pretty curious about the device ids involved here, too. @gabiSRC Can you provide a list of your device ids?
@JMayrbaeurl @timtay-microsoft here are my IDs: So only authorized characters
I think that, because QueryCollectionResponse also contains the reported/desired properties content, it may contain characters that take more than 1 byte to encode in UTF-8. I say this because, in debug mode, I see that the 2 strings are identical.
Yes. You are right. In the meantime I was able to reproduce the bug. Simply add this to the desired properties section in the device twin: "testwithDiacr": "§".
According to the documentation this is valid, since there's no restriction on the value of properties (see here : 'All values in JSON objects can be of the following JSON types: boolean, number, string, object. Arrays are not allowed.')
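A minimal sketch of why this value trips the check, assuming the comparison is the one quoted in the issue description (UTF-8 byte count against String.length()). LengthCheckDemo is a hypothetical name and not part of the SDK. The section sign is written as a Unicode escape so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the SDK.
final class LengthCheckDemo {
    // Same comparison as quoted in the issue description: true means "rejected".
    static boolean lengthsDiffer(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length != s.length();
    }

    public static void main(String[] args) {
        String value = "\u00a7"; // the section sign used in the reproduction
        System.out.println("chars: " + value.length());
        System.out.println("UTF-8 bytes: " + value.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("rejected: " + lengthsDiffer(value));
        System.out.println("ASCII rejected: " + lengthsDiffer("testwithDiacr"));
    }
}
```

The section sign is one char but two bytes in UTF-8, so the comparison flags a string that is perfectly valid, while pure ASCII input passes.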
This means it's definitely a bug, since the complete JSON string read from the device twin is checked for single-byte characters. But the real bug is in com.microsoft.azure.sdk.iot.deps.serializer.ParserUtility.validateStringUTF8(String), which uses the wrong UTF-8 character check:
if(str.getBytes("UTF-8").length != str.length())
And to be honest, a String cannot be checked for valid UTF-8 at all; the check has to be done on the byte array. That happens in com.microsoft.azure.sdk.iot.service.devicetwin.QueryCollection.sendQueryRequest(QueryOptions):
//Codes_SRS_QUERYCOLLECTION_34_017: [This function shall send an HTTPS request using DeviceOperations.]
HttpResponse httpResponse = DeviceOperations.request(this.iotHubConnectionString, this.url, this.httpMethod, payload, null, this.timeout);
//Codes_SRS_QUERYCOLLECTION_34_018: [The method shall read the continuation token (x-ms-continuation) and response type (x-ms-item-type) from the HTTP Headers and save it.]
handleQueryResponse(httpResponse);
//Codes_SRS_QUERYCOLLECTION_34_021: [The method shall create a QueryResponse object with the contents from the response body and its continuation token and return it.]
this.isInitialQuery = false;
return new QueryCollectionResponse(new String(httpResponse.getBody()), this.responseContinuationToken);
In the first line, DeviceOperations.request sets up an HTTP request with the UTF-8 charset. But the conversion from a byte array to a String happens in the last line, and it is completely wrong, since new String(byte[]) uses the default encoding. This must be fixed as well!
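A sketch of both points, assuming the response body is available as a byte[] (as httpResponse.getBody() in the snippet above suggests). Utf8Body and its methods are hypothetical names, not SDK API, and this is not necessarily how the actual fix is written. Only standard java.nio.charset classes are used.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the SDK.
final class Utf8Body {
    // Decodes with an explicit charset instead of the platform default.
    // Malformed input is silently replaced with U+FFFD by this constructor.
    static String decode(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Validates on the byte array: a strict decoder reports malformed input
    // instead of replacing it.
    static boolean isWellFormedUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] valid = "{\"testwithDiacr\": \"\u00a7\"}".getBytes(StandardCharsets.UTF_8);
        byte[] invalid = {(byte) 0xFF}; // 0xFF never occurs in well-formed UTF-8
        System.out.println(isWellFormedUtf8(valid));
        System.out.println(isWellFormedUtf8(invalid));
        System.out.println(decode(valid));
    }
}
```

Validating before decoding matters because, once the bytes have been turned into a String with a replacing decoder, the information about malformed sequences is gone.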
Anyway, the format of the device twin has to be validated in com.microsoft.azure.sdk.iot.deps.serializer.QueryResponseParser.QueryResponseParser(String) at the JSON attribute level, checking keys and values differently.
PR with fix is now available
Great! Thank you :)
The fix for this issue has been checked in, and should be released within the week. Thanks for the detailed writeup, @gabiSRC and thanks for the PR @JMayrbaeurl!
Great, so, waiting for the new version's release then :)
I am closing the issue as all is OK.
OS and version used: macOS Sierra 10.12.6
Java runtime used: Java 1.8.0_10
SDK version used: iot-deps 0.5.3
Description of the issue:
When trying to get Twin devices, I get the exception
java.lang.IllegalArgumentException: parameter contains non UTF-8 character
Code sample exhibiting the issue:
I checked the method and I think the comparison is wrong:
if(str.getBytes("UTF-8").length != str.length())
whereas it should be if(str.getBytes("UTF-8").length != str.getBytes().length)
Maybe I am wrong; if so, could you please help me solve this?
I already checked my device data and there are no non-UTF-8 characters.
Thank you.