kohler / click

The Click modular router: fast modular packet processing and analysis
Other
743 stars 321 forks source link

Hash collisions in IPRewriter? #505

Closed Qcloud1223 closed 1 year ago

Qcloud1223 commented 1 year ago

Hello all,

I spent some time surveying how NAT/NAPT functionalities are implemented under cilck, but failed to find anything about handling hash collision during this process.

In IPRewriter::push from elements/tcpudp/iprewriter.cc, if map->get(flowid) returns a non-empty IPRewriterEntry, it will be applied to the TCP flow tcpmf without a check. Then I dug a little bit deeper into the get method of HashContainer, and found it just simply hashs the incoming key and returns the object if found (in include/click/hashcontainer.hh). Then won't a new flow whose hash collides with an existing entry be written into the existing entry?

Thanks in advance for your time.

tbarbette commented 1 year ago

Hi @Qcloud1223 !

I think the confusion comes from _rep.hashkey() which actually returns the full key, that is the flow.

get() calls find(), which will use bucket() to find the right bucket. bucket() find the hask of the key is obtained through hashcode(key). In the bucket, the "hashkey" are compared but that is a full compare, not hash compare.

Qcloud1223 commented 1 year ago

I think the confusion comes from _rep.hashkey() which actually returns the full key, that is the flow.

Exactly. One needs to check hashkey() inside IPRewriterEntry to find it returns _flowid, which is of class IPFlowID containing the four tuple.

Thanks as always @tbarbette ! I really appreciate it :)