Closed twhiteman closed 6 years ago
Is there any reason we can't use process.version
or process.versions
to determine which encoding to use? It would save us from having to unnecessarily calculate both.
@DonutEspresso the process.version tells you what version the client has, but it cannot tell you what version the server has. As such, if it cannot tell which version is used by the server, the client must then generate both hashes to be able to support both versions.
Thoughts?
I added the Github reference into the comment - so if the comment is not clear for the reader then there is more detailed information to be found here. Happy to take another wording if you want? Note that this change also supports running the client using an older node (v4-) version and a newer node (v6+) server.
I had thought about a client configuration option too (something like supportOldNodeHash), happy to add that if needed - but sometimes it is hard and/or invasive to change these bits (e.g. when it's a library used by another node_modules component).
Happy to take another wording if you want?
My re-comment attempt: restify/clients#173: Accept a Content-MD5 calculated for both a UTF-8 encoded body (the likely encoding for a string body and the default for node v6+ servers) and a latin-1 encoded body (the default, called 'binary' encoding, in node v4 and below servers).
I had thought about a client configuration option too (something like supportOldNodeHash), happy to add that if needed - but sometimes it is hard and/or invasive to change these bits (e.g. when it's a library used by another node_modules component).
Joyent's usage isn't going to need the option for a long while, so I don't feel a great need to add it yet. I'd defer to other reviewers if they wanted an option for this. For those that don't need to support the old node 'binary' coding, it does mean some perf hit for the extra string encoding and hash calc.
Proposal if doing it:
supportLatin1ContentMd5
option (default true)Update the comment and added supportLatin1ContentMd5 StringClient option to disable the latin1 "binary" hash calculation.
Tests all passing: 175 passing (5s)
Rebased on latest master.
Thanks @twhiteman for adding it as an option, much appreciated! 👍
Given that node4 and lower are in unmaintained status, should the default be false, rather than true?
Given that node4 and lower are in unmaintained status, should the default be false, rather than true?
I'm fine with that personally. @twhiteman ? Eventually having it default to false was why I suggested the option name as it was (I find having options by default false where possible is clearer and easier in the code).
Given that we have a lot of Triton components using restify-clients@1.x we will want to back port this there. We could make the option true by default for 1.x only if we want (separate discussion), but have it be false by default for 2.x and forward.
Sure, I can make the default false.
One thing of concern with this - as written this would change the behaviour of anyone using restify-clients on node.js 4 or older - as this would now use the "utf8" hash encoding (where previously it used the default, which is "binary" for node.js 4 and older).
I could remove "utf8" from the hash.update
call and let it use the node default, but then anyone using restify-clients on node.js 4, when talking to a server using node.js 6 would not be able to support that (the client would default to "binary", server to "utf8" - there is no restify client option to specify "utf8" encoding).
This boils down to needing a "supportAlternativeHashEncoding" option and checking process.version to enable that alternative?
@twhiteman Hrm. I'm inclined to default to false anyway. This is fixing a bug. The "binary" encoding was just already wrong. So, while it is unfortunate that a node v4 (or lower) user of restify-clients talking to a node.js server that is generating Content-MD5 incorrectly will get broken, they were already just getting lucky because of a bug. They have options:
Thoughts?
I don't want to make a change that modifies the behaviour for existing restify-clients (at least if it can be helped - I feel that sort of change should go with a major version bump).
I think adding a StringClient option (disabled by default) to support either of the content-md5 hash encodings is the best way forward - and leave the existing content-md5 hashing the same as it now (i.e. so it doesn't unexpectedly break things).
Julien and I discussed for a quite a while. Here is our proposed functionality to add for current restify-clients (2.x), other major revs, and potential future behaviours.
First for restify-clients 2.x, these are the default values:
var client = restifyClients.createJsonClient({
// ...
contentMd5: {
ignore: false,
encodings: undefined // optional array of encodings to consider
}
});
Here encodings: undefined
means "use the default encoding", which is
the current behaviour. IOW, this patch won't change default behaviour.
encodings
accepts an optional array of encoding names to consider.
A user that wanted to use both Content-MD5 encodings supported, to work with
node 4.x and 6.x clients and servers could specify:
var client = restifyClients.createJsonClient({
// ...
contentMd5: {
encodings: ['binary', 'utf8']
}
});
I think the docs should point this out, and also clarify that node's "binary" here really means "latin1" and not anything to do with uninterpreted Buffer bytes.
This proposal also adds an option to disable content-md5 verification at
all: ignore: true
. Potential reasons to use this might be:
match other current node.js-land HTTP clients (like requests and wreck) which don't handle a response Content-MD5 header for you;
avoid the processing time to validate Content-MD5;
avoid breaking when using a server where one knows its Content-MD5 handling is broken
We'll likely backport the same contentMd5: ...
support to restify-client
1.x for a number of Joyent apps that are still on the other major rev.
Our understanding of HTTP handling is that the right answer really is to calculate the Content-MD5 over the sent bytes, i.e. the Buffer contents before restify-clients or node's http library (I don't know where this is done) decodes it to a JS String, and not to encode the just decoded body back to bytes for the hash digest calculation.
For a future major rev of restify-clients, if a contributor does the work to support calculating the Content-MD5 hash over the bytes before JS String decoding, then we could have the following options:
var client = restifyClients.createJsonClient({
// ...
contentMd5: {
ignore: false,
raw: true, // means verify Content-MD5 of Buffer `body`
encodings: undefined // **changes* to mean, do not check String `body`
}
});
IOW, the restify-clients default would change to check Content-MD5 on the raw Buffer bytes. Previous functionality is still available.
@yunong did you have a chance to look at this again? Thanks, Todd.
Given that todd's current changes match what Julien and I agreed on and that current default behaviour is not changed, I'll going to dismiss yunong's blocking review tomorrow and merge this. Then I think we'll likely backport it to 1.x for some of Triton's current usage of restify-clients 1.x.
Please let me know if that sounds unreasonable.
Yikes, sorry for the radio silence on this. If you had previously aligned with @misterdjules feel free to go ahead.
@twhiteman All good from my perspective!
All tests passing for me: 174 passing (5s)