I suppose in addition to giving a more informative exception type in real "incorrect content hash" cases (i.e. once fixed, mohawk will raise MisComputedContentHash rather than MacMismatch), it actually helps performance in the more straightforward incorrect-MAC cases too, since in those cases mohawk will no longer need to perform the more expensive hashing of the body at all.
If there are encoding differences between the Hawk client and a mohawk server then that sounds like a bug in mohawk. Can you provide example code to reproduce it?
according to a statement [3,4] in the node hawk implementation the hash provided in the request should be used.
Yes, but only as a theoretical optimization. What the spec says is that you could do a first-pass quicker MAC calculation using the client provided hash value simply to see if it fails. If it doesn't fail, you'd still need to calculate your own hash and re-calculate the MAC (which means you'd always be calculating a MAC twice) because otherwise an attacker could just swap out a known good hash in the request header and trick your server.
This is the key point in the section you linked to:
Without calculating the payload hash on the server and comparing it to the value provided by the client, the payload may be modified by an attacker.
I would consider adding a first-pass optimization to mohawk if there was a patch but I'm not sure if it's worth the effort. I'd like to see some metrics showing that it adds value. Also, I don't think it would address the problem you were seeing if there was an encoding problem so let's sort that out first :)
To reword the OP slightly:
STR:
Expected:
The exception is MisComputedContentHash, which allows the user to realise more quickly what the problem was.
Actual:
A MacMismatch exception, since mohawk is using the calculated payload hash as part of the MAC calculation, rather than the "provided" payload hash. As a result, the user is misled into thinking the auth problem lies with something other than the body of the message (and they don't realise that the encoding hawk-side may be the problem).
Does that make sense? :-) (I've also filed https://github.com/hueniverse/hawk/issues/152 to make the need to manually UTF8 encode the hawk body be clearer in their docs; but it's still good to cover all of our bases IMO)
Did the payload have non-ascii characters in it? Could you give me an example payload that I could use to reproduce this?
The payload did indeed have non-ascii characters in it. If I recall correctly, it was a contributor's name that contained the characters. I don't have a payload anymore and our logs only go back so far, but I can try to dig something up.
There are two potential bugs here: MacMismatch and MisComputedContentHash give false explanations in the "the content hash was incorrect" case. Both of these can be seen from code inspection; is a specific testcase still useful? :-)
No test case necessary. I'll create a non-ascii message and see what happens in hueniverse vs. mohawk.
I'm not totally clear on how the hueniverse package handles this. The thread you linked to, combined with the spec docs, gives contradictory explanations of how payloads should be encoded before hashing.
Thanks for the bug report :)
Ah, I've just noticed - there's an existing test that shows we get MacMismatch rather than MisComputedContentHash: https://github.com/kumar303/mohawk/blob/4309017b3d8b980601e56fe4635724d8b03f3489/mohawk/tests.py#L189-L192
btw, assuming the bug gets fixed, I don't think changing the exception to MisComputedContentHash for a client that hashed its content wrong is going to help much. You really just have to turn on debug logging at the server level to see how it assembled all content parts when creating the hash, then compare that to equivalent debug logging on the client side. There is a note in the docs about how to debug MacMismatch exceptions.
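For reference, here's a minimal sketch of what that looks like, assuming mohawk emits its debug output through the standard library logger named 'mohawk' (check the logger name against your mohawk version):

```python
import logging

# Send mohawk's internal debug output (how it assembled the hash/MAC inputs)
# to stderr so a failing request can be compared against the client side.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(name)s %(levelname)s %(message)s'))

mohawk_log = logging.getLogger('mohawk')  # assumed logger name
mohawk_log.addHandler(handler)
mohawk_log.setLevel(logging.DEBUG)
```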
I don't think changing the exception to MisComputedContentHash for a client that hashed its content wrong is going to help much. You really just have to turn on debug logging at the server level
Yeah, I agree in the worst case, but if enabling verbose logging on production can be avoided by giving a clue as to where to focus (or at the very least not giving an incorrect clue), then that saves having to do production pushes, sift through potentially gzipped logs over multiple webheads [1], and coordinate with a third party's logs (after having to get them to enable debug logging too).
Anyway I hope to have a PR for this soon :-)
[1] Our current stage/prod doesn't use Papertrail; once we move to Heroku we will.
As for the PR (fixing the encoding issue), I suppose we need to follow this as a guide: https://github.com/hueniverse/hawk/issues/152#issuecomment-155131505
There is another issue hidden in there: what if there is no payload (i.e. in a GET request)? Currently mohawk ends up interpreting this as an empty string, then calculating the hash and using that for the MAC calculation.
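To illustrate why that matters, here's a rough sketch using the normalized strings from the Hawk spec (not mohawk's actual internals, and with made-up request values): hashing an empty-string payload still fills in the hash field of the MAC input, so the resulting MAC differs from one where the field is simply left blank.

```python
import hashlib
import hmac
from base64 import b64encode

KEY = b'secret'  # illustrative key

def mac(hash_field):
    # Normalized request string per the Hawk spec; the payload hash is one of
    # the newline-separated fields, left empty when there is no payload.
    normalized = b'\n'.join([
        b'hawk.1.header',  # header type and version
        b'1398546600',     # timestamp
        b'abc123',         # nonce
        b'GET',            # method
        b'/resource',      # request path
        b'example.com',    # host
        b'443',            # port
        hash_field,        # payload hash field
        b'',               # ext
    ]) + b'\n'
    return b64encode(hmac.new(KEY, normalized, hashlib.sha256).digest())

# Hash of an empty-string payload ("hawk.1.payload\n<content-type>\n<body>\n").
empty_body_hash = b64encode(hashlib.sha256(b'hawk.1.payload\n\n\n').digest())

print(mac(b''))              # no payload: hash field left blank
print(mac(empty_body_hash))  # "empty payload": field filled in, MAC differs
```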
If it doesn't fail, you'd still need to calculate your own hash and re-calculate the MAC (which means you'd always be calculating a MAC twice) because otherwise an attacker could just swap out a known good hash in the request header and trick your server.
IMHO the upstream implementation doesn't calculate the MAC twice. The implementation validates the MAC using the hash value provided by the client, and only if that succeeds does it hash the actual payload and compare the result against the provided hash. There is no need to do another calculation of the MAC.
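In other words, something like the following sketch of that order (the function and names here are illustrative stand-ins, not mohawk's or hawk's actual API):

```python
import hashlib
import hmac
from base64 import b64encode

class MacMismatch(Exception):
    """Stand-in for mohawk's exception of the same name."""

class MisComputedContentHash(Exception):
    """Stand-in for mohawk's exception of the same name."""

def verify(key, normalized_request, provided_mac, provided_hash,
           payload, content_type):
    # 1. One MAC calculation: the normalized request string already contains
    #    the client-provided payload hash, so verifying the MAC proves the
    #    header (including that hash value) was signed with the shared key.
    expected_mac = b64encode(
        hmac.new(key, normalized_request, hashlib.sha256).digest())
    if not hmac.compare_digest(expected_mac, provided_mac):
        raise MacMismatch()

    # 2. If the MAC is good, hash the actual payload and compare it to the
    #    hash the client claimed; a mismatch means the body was modified or
    #    encoded differently than what the client signed.
    normalized_payload = (b'hawk.1.payload\n' + content_type + b'\n'
                          + payload + b'\n')
    computed_hash = b64encode(hashlib.sha256(normalized_payload).digest())
    if not hmac.compare_digest(computed_hash, provided_hash):
        raise MisComputedContentHash()
```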
I've created a pull request for this issue: #28
Fixed in https://github.com/kumar303/mohawk/pull/28 - thanks @torotil !
It appears that mohawk calculates [1] the content hash even when a hash was provided in the request, and uses that calculated hash for validating the MAC [2], when according to a statement [3,4] in the node hawk implementation the hash provided in the request should be used.
One of the issues that comes up when the provided hash is not used is that encoding differences in the payload will result in a different payload hash. However, the error that's returned is a MAC mismatch, when really the useful pointer about where to look is a content mismatch, such as the one raised later on in base.py with the MisComputedContentHash exception.
One way to duplicate this is to use a different client for sending a hawk request that doesn't automatically encode the payload for you (such as the node hawk client). mohawk utf8 encodes the payload and calculates the hash while the node client allows you to use whatever encoding you wish.
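For what it's worth, the mismatch is easy to demonstrate with any non-ASCII payload; here's a rough sketch using the Hawk payload normalization from the spec (not mohawk's internals):

```python
import hashlib
from base64 import b64encode

def payload_hash(body_bytes, content_type=b'text/plain'):
    # Hawk payload hash: SHA-256 over the normalized payload string.
    normalized = b'hawk.1.payload\n' + content_type + b'\n' + body_bytes + b'\n'
    return b64encode(hashlib.sha256(normalized).digest())

body = 'Signed-off by Zoë'  # e.g. a contributor name with a non-ASCII character

# mohawk UTF-8 encodes the text before hashing; a client that picked another
# encoding (latin-1 here) hashes different bytes, so the payload hashes, and
# therefore the MACs built from them, will not match.
print(payload_hash(body.encode('utf-8')))
print(payload_hash(body.encode('latin-1')))
```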
[1] https://github.com/kumar303/mohawk/blob/master/mohawk/base.py#L42
[2] https://github.com/kumar303/mohawk/blob/master/mohawk/base.py#L48
[3] "If the payload is available at the time of authentication, the server uses the hash value provided by the client to construct the normalized string and validates the MAC"
[4] https://github.com/hueniverse/hawk/blob/master/README.md#payload-validation