Netflix / dynomite

A generic dynamo implementation for different k-v storage engines
Apache License 2.0

Fragmented queries silently fail cross-region replication #777

Closed smukil closed 3 years ago

smukil commented 4 years ago
  1. redis_fragment_argx() fragments multi-key commands (such as MSET and MGET), since their keys may belong to different dyno servers. To do this, it cuts several small mbufs out of the original request and links them so that together they form a single request for each node.

    Eg: Assume 2 nodes (NODE1, NODE2) and the query "MGET key1 key2 key3". If 'key1' and 'key3' belong to NODE1, the fragmented request sent to NODE1 consists of 3 mbufs:

    MBUF1: *3\r\n$4\r\nMGET\r\n
    MBUF2: $4\r\nkey1
    MBUF3: $4\r\nkey3

    Similarly, the frag'd request for NODE2 will have 2 linked mbufs.

    For efficiency, the mbufs are cut from the original request and linked into the fragmented ones rather than copied, which is why each fragmented request consists of multiple mbufs.

    Now to the problem. Dynomite encrypts each mbuf individually, so the receiver must symmetrically decrypt one mbuf at a time. However, because the fragmented mbufs are small, the receiver reads them into one large mbuf, and decrypting that mbuf as a whole fails.

    This patch fixes this by making sure each fragmented query fits into a single mbuf unless it really needs 2. TODO: This is less efficient because it copies data around. Switch back to multiple small mbufs if the crypto scheme is fixed.

  2. Since a fragmented query can also carry a key exchange, a failed key exchange could permanently disable cross-region communication. This patch also fixes this. TODO: The crypto scheme should be more robust.
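
The mbuf problem from item 1 can be modelled in a few lines. This is a toy Python sketch, not dynomite's C code: `fragment_mget`, `toy_crypt`, `KEY`, and the key-to-node mapping are hypothetical names invented for illustration, and the XOR keystream is only a stand-in for the real cipher (assumed here to share one property with it: every buffer is encrypted independently, starting from a fresh state). Each chunk also carries its trailing `\r\n`, which is an assumption about where the mbuf boundaries fall.

```python
import hashlib
from collections import defaultdict

KEY = b"toy-shared-secret"  # hypothetical key, illustration only


def bulk(s: bytes) -> bytes:
    """Encode one RESP bulk string."""
    return b"$%d\r\n%s\r\n" % (len(s), s)


def fragment_mget(keys, owner_of):
    """Split MGET keys by owning node.

    Returns {node: [chunk, ...]}, where each chunk stands in for one
    small mbuf: a command header followed by one chunk per key.
    """
    by_node = defaultdict(list)
    for k in keys:
        by_node[owner_of(k)].append(k)
    frags = {}
    for node, ks in by_node.items():
        header = b"*%d\r\n" % (len(ks) + 1) + bulk(b"MGET")
        frags[node] = [header] + [bulk(k) for k in ks]
    return frags


def keystream(n: int) -> bytes:
    """Derive n pseudo-random bytes from KEY (toy construction)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(KEY + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_crypt(buf: bytes) -> bytes:
    """XOR buf with a keystream that restarts at offset 0 for every buffer.

    The operation is its own inverse, so it both encrypts and decrypts.
    """
    return bytes(a ^ b for a, b in zip(buf, keystream(len(buf))))


owner = {b"key1": "NODE1", b"key2": "NODE2", b"key3": "NODE1"}
frags = fragment_mget([b"key1", b"key2", b"key3"], owner.__getitem__)
node1 = frags["NODE1"]          # header + key1 + key3
plain = b"".join(node1)

# Sender encrypts every small chunk on its own; the receiver reads the
# bytes into one large buffer and decrypts that buffer as a whole.
wire_per_chunk = b"".join(toy_crypt(c) for c in node1)
broken = toy_crypt(wire_per_chunk)

# Approach described in the patch: copy the chunks into a single buffer
# first, then encrypt (and later decrypt) that buffer once.
wire_single = toy_crypt(plain)
fixed = toy_crypt(wire_single)

print(len(node1), broken == plain, fixed == plain)
```

In this model only the first chunk survives the whole-buffer decryption, because every later chunk was encrypted from offset 0 but is decrypted at a different offset. The single-buffer path round-trips at the cost of one extra copy, which is the trade-off the TODO in item 1 refers to.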