MrVPlusOne closed this issue 6 years ago.
It looks like you're creating a set of inputs with deliberately adversarial hash codes. We knowingly didn't design the immutable collections to deal with these. Do you have a case where these data structures degenerate on input that is not obviously deliberately adversarial?
No, this input pattern was automatically constructed by our research tool. I just think it might point to potential security vulnerabilities or performance issues. Our tool found the pattern in a black-box fashion, using only running time as feedback, so we suspect that in other applications that use these immutable collections, an adversary could use the same kind of black-box fuzzing to construct malicious inputs. There can be ways to defend against this kind of attack. One solution would be a randomized smear function that changes its seed at program startup or periodically, but any other method that eliminates a fixed malicious input pattern should also work.
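To make the randomized-smear idea concrete, here is a minimal sketch of a seeded mixing step. It is not Guava's code: the class name SeededSmear, the bucket helper, and the choice to XOR the seed into the hash before mixing are all assumptions made for illustration, and the multiply-rotate-multiply step is only modeled on a murmur3-style mixer.

```java
import java.security.SecureRandom;

// Hypothetical sketch: a smear function whose output depends on a per-process seed.
final class SeededSmear {
    // Odd multiplication constants in the style of murmur3 mixing (an assumption,
    // not taken from the thread).
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    private final int seed;

    SeededSmear(int seed) {
        this.seed = seed;
    }

    // Picks a fresh seed, e.g. once at program startup.
    static SeededSmear withRandomSeed() {
        return new SeededSmear(new SecureRandom().nextInt());
    }

    // Mixes the seed into the hash code, then scrambles the bits.
    int smear(int hashCode) {
        return C2 * Integer.rotateLeft((hashCode ^ seed) * C1, 15);
    }

    // Maps a key to a bucket; tableSize is assumed to be a power of two.
    int bucket(Object key, int tableSize) {
        return smear(key.hashCode()) & (tableSize - 1);
    }

    public static void main(String[] args) {
        SeededSmear smear = SeededSmear.withRandomSeed();
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            System.out.println(key + " -> bucket " + smear.bucket(key, 16));
        }
    }
}
```

With a seed that changes per run, a fixed list of keys no longer lands in the same buckets every time. Keys whose hashCode() values are exactly equal still collide under any such smear, so this sketch on its own does not rule out collision attacks.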
> There can be ways to defend against this kind of attack.
There's no solution that is secure, fast, and not very complicated. Randomized smearing is either vulnerable (e.g., sun.misc.Hashing.murmur3_32 in Java 7) or rather slow. While SipHash was designed to counter the HashDoS attack, it's still considerably slower than Hashing.smear. Moreover, for Strings it's trivial to generate full collisions, so you'd need more than just smearing the existing hash. Using TreeNodes, as HashMap has done since Java 8, is probably fine, but pretty complicated.
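The point about full String collisions can be illustrated with a short sketch. It relies on the well-known fact that "Aa" and "BB" have the same String.hashCode() (both 2112), so any concatenation of these two-character blocks yields strings of equal length with identical hash codes. The class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds 2^blocks distinct Strings that all share one hashCode().
final class StringCollisions {
    static List<String> collidingStrings(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(result.size() * 2);
            for (String prefix : result) {
                // "Aa" and "BB" have equal length and equal hash codes, so
                // appending either one keeps all hash codes identical.
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = collidingStrings(3);
        for (String key : keys) {
            System.out.println(key + " -> " + key.hashCode());
        }
    }
}
```

Because the hash codes themselves are equal, no function applied to the hash code alone can separate these keys. Telling them apart needs something that looks at the key contents, such as a keyed hash over the characters or an ordered fallback inside the bucket.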
Hi,
I found an input pattern that can trigger Ω(N^2) running time in ImmutableBiMap.copyOf. The input is constructed and tested using the following code:
On my machine, when running with -Xint, I got the following result:
It's about a 60X slowdown.
When running without -Xint and increasing testNum from 4000 to 40000 (to reduce noise), the result was:
I also tried fitting running time against input size; the result is a quadratic curve:
Note that the input pattern I found only exhibits this quadratic behavior for sizes below 600. This is due to a current technical limitation of the fuzzing tool we used to find it, and we suspect that patterns exist that scale to larger sizes.
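As a rough illustration of why colliding hash codes produce quadratic cost, the sketch below counts chain-walk steps in a toy chained hash table. It is not ImmutableBiMap's implementation and not the input pattern from this report. The class, the helper names, and the assumption that each insert walks the whole bucket chain (as a duplicate check would) are all hypothetical.

```java
// Hypothetical sketch: counts how many chain entries are visited while building
// a toy chained hash table, for well-spread versus fully colliding hash codes.
final class ChainProbeCount {
    // Singly linked bucket entry; only the hash code matters for this sketch.
    private static final class Node {
        final int hash;
        final Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    // tableSize is assumed to be a power of two.
    static long insertAndCountSteps(int[] hashes, int tableSize) {
        Node[] table = new Node[tableSize];
        int mask = tableSize - 1;
        long steps = 0;
        for (int hash : hashes) {
            int bucket = hash & mask;
            // Walk the whole chain, as a duplicate-key check would.
            for (Node node = table[bucket]; node != null; node = node.next) {
                steps++;
            }
            table[bucket] = new Node(hash, table[bucket]);
        }
        return steps;
    }

    // Hash codes 0..n-1: each lands in its own bucket when n <= tableSize.
    static int[] spreadHashes(int n) {
        int[] hashes = new int[n];
        for (int i = 0; i < n; i++) {
            hashes[i] = i;
        }
        return hashes;
    }

    // Multiples of tableSize: every hash code masks to bucket 0.
    static int[] collidingHashes(int n, int tableSize) {
        int[] hashes = new int[n];
        for (int i = 0; i < n; i++) {
            hashes[i] = i * tableSize;
        }
        return hashes;
    }

    public static void main(String[] args) {
        int tableSize = 1024;
        for (int n = 125; n <= 1000; n *= 2) {
            long spread = insertAndCountSteps(spreadHashes(n), tableSize);
            long colliding = insertAndCountSteps(collidingHashes(n, tableSize), tableSize);
            System.out.println("n=" + n + " spread=" + spread + " colliding=" + colliding);
        }
    }
}
```

In the colliding case the i-th insert walks i-1 entries, so the total is N(N-1)/2 steps, which grows quadratically with N. The well-spread case needs no chain walks at all here.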