lz4 / lz4-java

LZ4 compression for Java
Apache License 2.0
1.1k stars 253 forks source link

`StreamingXXHash32.getValue()` should return a long #57

Open meden opened 9 years ago

meden commented 9 years ago

Hello,

I'm using your implementation of xxHash to check the integrity of uploaded files in a web application. As you can imagine, client side I'm using the JS implementation found here.

Problems arise when the calculated hash is >= 2^31: while client side the hash is correctly sent as unsigned, server side the call to lStreamingXXHash32.getValue() returns a negative int, due to the overflow. This makes the two values not comparable (without some hack, at least).

Maybe it would be better change the StreamingXXHash32.getValue() implementation so that it returns a long. That way no overflow would occur, and the original unsigned value would be returned.

Thanks for you work!

jpountz commented 9 years ago

That's an interesting question. When I wrote the API, I hesitated between going with an int or a long whose 32 lower bits would be used, but preferred the int version as it made very clear that this hash function only works on 32 bits.

I would like to have more opinions on this one, if there seems to be consensus around moving to a long whose 32 lower bits would be used, I would be happy to change it.

Note to readers of this issue: You can turn the hash to an unsigned value by applying a mask, ie. StreamingXXHash32.getValue() & 0xFFFFFFFFL which will turn the integer value into an unsigned long.

ghost commented 9 years ago

I think it depends on the supporting a 'realistic use-case' -- if what @meden describes is absolutely a valid and potentially common use-case, then the API should probably support without the need to resort to bit ops.

If it's more of an edge case and not how the API was meant to be used or not in the spirit of it, then probably just great javadoc to clarify the bit ops approach.

My 2 cents and my eternal belief in the 'path of least surprise' when it comes to API design :)