Closed Chandelier-02 closed 3 months ago
Holy shit, thanks so much for digging into this. You're right, there was a fundamental flaw with our message ID format. We switched it out for one that actually does what we want, using a unary prefix format instead of the MSB-padded format.
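For anyone curious what a unary-prefix format can look like, here is a minimal sketch. It is not the actual Gossiplog code: `encodeOrdered` and its 28-bit limit are hypothetical, chosen only to illustrate the idea. The count of leading 1-bits in the first byte says how many extra bytes follow, so longer encodings always sort after shorter ones, and encodings of the same length compare numerically.

```typescript
// Hypothetical sketch of an order-preserving clock encoding (not the library's code).
// First byte: `extra` leading 1-bits, then a 0-bit, then the top bits of the value.
// `extra` is the number of bytes that follow the first byte.
function encodeOrdered(clock: number): string {
  if (!Number.isInteger(clock) || clock < 0 || clock >= 2 ** 28) {
    throw new RangeError("this sketch only supports 0 <= clock < 2^28");
  }

  // Each byte contributes 7 payload bits once the prefix bits are accounted for.
  let extra = 0;
  while (clock >= 2 ** (7 * (extra + 1))) extra++;

  const bytes: number[] = new Array(extra + 1).fill(0);
  let rest = clock;
  for (let i = extra; i > 0; i--) {
    bytes[i] = rest & 0xff; // low-order bytes go last (big-endian)
    rest = rest >>> 8;
  }
  const prefix = (0xff << (8 - extra)) & 0xff; // e.g. extra = 2 gives 0b11000000
  bytes[0] = prefix | rest;

  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

// Lowercase hex strings sort the same way as the bytes they represent.
const orderedClocks = [16384, 256, 16383, 255, 128, 127, 0];
const orderedResult = orderedClocks
  .map((clock) => ({ clock, key: encodeOrdered(clock) }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
  .map((entry) => entry.clock);

console.log(orderedResult); // [0, 127, 128, 255, 256, 16383, 16384]
```

With this layout, 256 becomes `8100` and 16384 becomes `c04000`, so key order matches clock order.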
I have been using the Gossiplog for a while now. However, I noticed that the sync kept failing whenever I had more than 16384 entries in the log. It would correctly sync and insert entries for clock values 0-255. But when it should have inserted the entry with clock value 256, it would instead find and try to insert the entry with clock value 16384, causing an error to be thrown and the sync to fail.
After debugging for a long time, I discovered it may be an issue with the clock value encoding. I used the id generation code provided, and saw that this is the case.
Clock value 255 gets encoded to '817f' + encoded payload
Clock value 256 gets encoded to '8200' + encoded payload
Clock value 16383 gets encoded to 'ff17' + encoded payload
Clock value 16384 gets encoded to '8180' + encoded payload
Because the keys are stored in sorted order, the entry with clock value 16384 got stored between the entries with clock values 255 and 256. This causes the sync to fail every time. This is also what caused the behavior I reported in https://github.com/canvasxyz/canvas/issues/295, but I figured it was worth its own issue now that I understand the problem and know it isn't sync-specific.
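To illustrate the ordering problem in isolation, here is a small sketch. It assumes the clock prefix is a base-128 varint written most-significant group first, with the continuation bit set on every byte except the last. That is an assumption, not the library's code, and `encodeClock` is a hypothetical helper. It does reproduce the `817f`, `8200`, and `8180` prefixes listed above; for 16383 it produces `ff7f`, so the `ff17` above may be a typo or the real encoder may differ.

```typescript
// Assumed format: base-128 groups, most significant first,
// high bit (0x80) set on every byte except the final one.
function encodeClock(clock: number): string {
  if (!Number.isSafeInteger(clock) || clock < 0) {
    throw new RangeError("clock must be a non-negative integer");
  }
  const groups: number[] = [clock % 128]; // final byte, no continuation bit
  let rest = Math.floor(clock / 128);
  while (rest > 0) {
    groups.unshift((rest % 128) | 0x80); // earlier bytes carry the continuation bit
    rest = Math.floor(rest / 128);
  }
  return groups.map((b) => b.toString(16).padStart(2, "0")).join("");
}

// Sort the keys the way an ordered key-value store would (lexicographically).
// Lowercase hex strings compare the same way as the underlying bytes.
const sortedClocks = [255, 256, 16383, 16384]
  .map((clock) => ({ clock, key: encodeClock(clock) }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
  .map((entry) => entry.clock);

console.log(sortedClocks); // [255, 16384, 256, 16383]
```

The three-byte key for 16384 starts with `81`, the same first byte as every clock value from 128 to 255, so it sorts ahead of the two-byte keys that start with `82` or higher.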
Code to reproduce: