chetankashetti opened 3 months ago
Hmm, so what seems to be happening here is that there are two verification add messages for the same address and fid: one from May 27, with message hash "TsGP1MUa9rTjaNY42whm4ztP0KI", and one from June 2, with message hash "DyGKwXxmmpWnJnJXq9MFnV64Ty8=".
When the June 2 message is merged, it correctly overwrites and deletes the May 27 one, because hubs have a unique constraint on the address. There is no remove message, and a remove message is not required to cause a delete, because a newer message will overwrite an older one (especially for verifications and UserDataAdd).
It looks like the events have made it into your Kafka stream out of order. There's not enough information here to say what caused this. If you could provide the created_at and deleted_at timestamps for these two messages from Shuttle's messages table, it would help narrow down whether this is happening in the hub, in Shuttle, or at some point after.
Note that the message timestamp is different: even though the delete event's timestamp is newer, the timestamp of the message it carries is older than the timestamp of the message in the merge event. So a workaround would be to check for this.
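The timestamp workaround could look something like the following. This is a minimal sketch, not Shuttle's API: the types and function names are ours, and the assumption is that you keep the latest known message timestamp per (fid, address) in your own store.

```typescript
// Hedged sketch of the timestamp workaround: before applying a delete
// event, compare the deleted message's Farcaster timestamp with the
// timestamp of the row we currently hold for that fid + address. If the
// stored row is newer, the delete refers to a superseded message and
// can safely be ignored. All names here are illustrative.

type StoredVerification = { fid: number; address: string; timestamp: number };

function shouldApplyDelete(
  stored: StoredVerification | undefined,
  deletedMsgTimestamp: number,
): boolean {
  if (!stored) return true; // nothing to protect; the delete is a no-op anyway
  // Only apply the delete if it refers to the message we actually hold
  // (timestamp >= what we stored). An older timestamp means the delete
  // targets a message that a newer merge has already overwritten.
  return deletedMsgTimestamp >= stored.timestamp;
}
```

In the scenario above, the delete event carries the May 27 message's timestamp while your store already holds the June 2 message, so the check returns false and the stale delete is skipped.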
To summarize,
Thanks for the response @sanjayprabhu
Having only a delete message is totally fine, but there shouldn't be multiple deletes/merges, right?
We have these records in our messages table:
```json
[
  {
    "id": "018f2e40-a33f-8cbc-27ca-81dab82afe3d",
    "created_at": "2024-04-30 09:05:39.124098+00",
    "updated_at": "2024-04-30 09:05:39.124098+00",
    "timestamp": "2024-04-16 23:07:19+00",
    "deleted_at": "2024-05-27 07:30:01.195+00",
    "pruned_at": null,
    "revoked_at": null,
    "fid": 284564,
    "type": 7,
    "hash_scheme": 1,
    "signature_scheme": 1,
    "hash": "20 bytes",
    "signer": "32 bytes",
    "body": "{\"address\":\"0x2a39fd94e139952c3958e69172ba918d8f533571\",\"claimSignature\":\"0x6d3460cbbc770e42913d6ae8942f47d96c9ee57e79ba62c9aaf6061c522329f906f37b1f6449730bca0567b90987d865efbc21423202304063fe3c77597248141c\",\"blockHash\":\"0x97c6ab49b01c3af5347959ae5456d8d95e7731a4f4330800b11dcd380cae1502\",\"protocol\":\"ethereum\"}",
    "raw": "267 bytes"
  },
  {
    "id": "018fb8f4-c995-709e-8540-0be48b4d8ca3",
    "created_at": "2024-05-27 07:30:01.178892+00",
    "updated_at": "2024-05-27 07:30:01.178892+00",
    "timestamp": "2024-05-27 07:29:58+00",
    "deleted_at": "2024-06-02 14:25:06.781+00",
    "pruned_at": null,
    "revoked_at": null,
    "fid": 284564,
    "type": 7,
    "hash_scheme": 1,
    "signature_scheme": 1,
    "hash": "20 bytes",
    "signer": "32 bytes",
    "body": "{\"address\":\"0x2a39fd94e139952c3958e69172ba918d8f533571\",\"claimSignature\":\"0xb499c194749ea31631b04dbd8f83ff1af59742934e4e507931b19d4e9887a84937a5f7a4ef5a088e1beb35ad169c0376051713599472aefaf3c04e612d8d7d561c\",\"blockHash\":\"0x9a87ed5ea1c6acf3f2196981bf9db78bb100e0c13bbaba4b66acaf259f483445\",\"protocol\":\"ethereum\"}",
    "raw": "267 bytes"
  },
  {
    "id": "018fd956-f8f5-4435-0ccc-53b232703349",
    "created_at": "2024-06-02 14:25:06.780531+00",
    "updated_at": "2024-06-02 14:25:06.780531+00",
    "timestamp": "2024-06-02 14:25:02+00",
    "deleted_at": null,
    "pruned_at": null,
    "revoked_at": null,
    "fid": 284564,
    "type": 7,
    "hash_scheme": 1,
    "signature_scheme": 1,
    "hash": "20 bytes",
    "signer": "32 bytes",
    "body": "{\"address\":\"0x2a39fd94e139952c3958e69172ba918d8f533571\",\"claimSignature\":\"0x8d3656984f0c26538d441cc8b48aa8d9a8441ee71286d0e949510a28c13efb2e716fe6b6e810c57e17489e5acf052af1aa37afc25737cc5b9bedba56134d3ecf1c\",\"blockHash\":\"0x94696a9eb590bf2acb4bacb81148637771b0a2a4cf6d119816b396b9cd44dce8\",\"protocol\":\"ethereum\"}",
    "raw": "267 bytes"
  }
]
```
> Having only a delete message is totally fine, but there shouldn't be multiple deletes/merges, right?
It's valid at the protocol level to keep sending messages. Newer messages will always overwrite older messages. You can keep changing your UserData.DISPLAY_NAME 50 times, and it will generate 50 merges and 49 deletes. Only the message with the latest timestamp is valid, and in the messages table there will be only one that's not marked deleted.
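That last-write-wins behavior can be illustrated with a small sketch. This is not the hub's code; it's a minimal model (our own types and names) of how a newer message for the same key displaces an older one, emitting one merge event plus one delete event per conflict, so N submissions produce N merges and N-1 deletes.

```typescript
// Illustrative last-write-wins merge, mirroring how the hub treats
// verifications / UserDataAdd: a newer message for the same key simply
// replaces the older one, emitting a "merge" for the new message and a
// "delete" for the displaced one. (The real hub also tie-breaks equal
// timestamps; this sketch just drops ties for brevity.)

type Msg = { hash: string; timestamp: number };
type HubEvent = { type: "merge" | "delete"; hash: string };

function lwwMerge(
  current: Msg | undefined,
  incoming: Msg,
): { kept: Msg; events: HubEvent[] } {
  if (current && current.timestamp >= incoming.timestamp) {
    // Incoming is older (or tied): it loses the conflict and is dropped.
    return { kept: current, events: [] };
  }
  const events: HubEvent[] = [];
  if (current) events.push({ type: "delete", hash: current.hash });
  events.push({ type: "merge", hash: incoming.hash });
  return { kept: incoming, events };
}
```

Running three submissions through this, only the newest survives, and each displaced message yields exactly one delete event, matching the messages table above where only the last row has a null deleted_at.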
The above example seems to be for a different FID, but we can walk through it:

- 018f2e40-a33f-8cbc-27ca-81dab82afe3d was created on April 30.
- 018fb8f4-c995-709e-8540-0be48b4d8ca3 came in on May 27 and deleted the message above (the first message's deleted_at is a few ms after this one's created_at).
- 018fd956-f8f5-4435-0ccc-53b232703349 came in on June 2 and in turn deleted the message above it (in this case, it looks like the delete event is ~0.5 ms after the create event).
I'm not sure why the delete event comes after (even if by less than 1 ms). I checked the code, and the callbacks should be called in order (delete first, then merge).
One potential reason is that they are not part of the same event; that's unlikely, but I'll double-check to make sure.
You can add some logs in your MessageHandler to confirm, but the delete callback should be happening before the merge.
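A quick way to confirm the callback order is a tiny recording wrapper. This is a sketch over our own minimal types, not the real @farcaster/shuttle MessageHandler interface (whose signatures may differ); the idea is just to log each callback and then assert that the delete for the old message arrived before the create for the new one.

```typescript
// Records the order in which merge callbacks fire so you can verify
// that, within a conflict, the "deleted" callback for the old message
// precedes the "created" callback for the new one. Illustrative only.

type CallbackState = "created" | "deleted";

class OrderLogger {
  log: Array<{ hash: string; state: CallbackState }> = [];

  // Stand-in for your handler's merge callback; call this from it.
  record(hash: string, state: CallbackState): void {
    this.log.push({ hash, state });
  }

  // True if every "deleted" entry appears before the first "created" one.
  deletesPrecedeCreates(): boolean {
    const states = this.log.map((e) => e.state);
    const lastDelete = states.lastIndexOf("deleted");
    const firstCreate = states.indexOf("created");
    return firstCreate === -1 || lastDelete < firstCreate;
  }
}
```

If the check ever returns false for a single conflict's pair of callbacks, that would point at the handler-invocation side rather than the Kafka leg.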
Note, these are all different message hashes. The user is submitting multiple verification messages for the same eth address. The hub is doing the right thing by deleting the old message, merging the latest message, and generating an event for each.
The events are for unique messages; they are not duplicates.
3. Checking the message timestamp can be a potential workaround while the root cause of 2) above is determined.
The problem with relying on the timestamp is that the merge and delete events for the same message hash will have the same timestamp. This can cause problems in our system: if they arrive in a different order, we won't know which is latest, so I think we can't rely on the timestamp. Different message hashes for the same address will have different timestamps, which I agree with.
In the above example, as you rightly said, for FID 284564:

- first message 018f2e40-a33f-8cbc-27ca-81dab82afe3d: created_at -> 2024-04-30 09:05:39.124098+00, deleted_at -> 2024-05-27 07:30:01.195+00
- second message 018fb8f4-c995-709e-8540-0be48b4d8ca3: created_at -> 2024-05-27 07:30:01.178892+00, deleted_at -> 2024-06-02 14:25:06.781+00
- third message 018fd956-f8f5-4435-0ccc-53b232703349: created_at -> 2024-06-02 14:25:06.780531+00

We are not modifying anything in our handler; it processes events in the order we receive them. As you pointed out, the deleted_at of 018fb8f4-c995-709e-8540-0be48b4d8ca3 is later than the creation of 018fd956-f8f5-4435-0ccc-53b232703349.
So the order of events streamed from the hub seems to be wrong. If we consume them in this order, we delete in our system instead of adding, which results in inconsistent data.
Hey @chetankashetti, sorry for the delay.
EDIT: See the comment below instead.
One other point worth mentioning: if you look at how Shuttle is implemented (make sure you're running the latest version, as some code may have changed), you can see that Shuttle always processes the deleted messages before the merged message.
Given this, you can safely treat a handleMessageMerge call with a state parameter of deleted as safe to delete, and one with a state of created as safe to create. This is much simpler than what I was previously discussing above, so I've updated my prior comment to avoid further confusion.
Just know that if you create your own handlers (I know you're not doing this; I'm just mentioning it for completeness) and you don't use the state / operation parameters, you will have to reinvent the CRDT logic locally.
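A minimal sketch of that simpler rule follows. It is not Shuttle's actual handler signature (check the version you run); the store is a stand-in Map, and the hash guard on delete is our own defensive addition for the out-of-order Kafka delivery discussed in this thread, since a stale delete should not clobber a newer merge.

```typescript
// Apply a verification event using only Shuttle's `state` parameter:
// "created" -> upsert, "deleted" -> remove. The hash check on delete is
// a defensive extra (an assumption, not Shuttle's behavior) so that a
// re-ordered stale delete can't remove a newer message.

type MessageState = "created" | "deleted";

function applyVerification(
  store: Map<string, string>, // address -> message hash, stand-in for your DB
  address: string,
  hash: string,
  state: MessageState,
): void {
  if (state === "created") {
    store.set(address, hash); // safe to create/overwrite
  } else if (store.get(address) === hash) {
    // Only clear if we still hold the exact message being deleted; a
    // newer merge may already have replaced it.
    store.delete(address);
  }
}
```

With this guard, replaying the problematic sequence (merge new, then stale delete of the old hash) leaves the newest verification intact.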
What is the bug? Basically, we are receiving delete events from Shuttle, but when we check the hub using getAllVerificationMessagesByFid, it says the message is in the added state. We are assuming that whenever MESSAGE_TYPE_VERIFICATION_ADD_ETH_ADDRESS appears with:

- merge state -> we add to our system
- delete state -> we remove from our system
- prune state -> we remove from our system
- revoke state -> we remove from our system

Correct us if we're wrong. For example, FID 284564: if you observe, we're adding a timestamp when we receive the event, called event_timestamp, and the delete event is coming without any remove event. There are thousands of FIDs affected by this issue. Please investigate; it's our top priority. Feel free to reach out to me if more details are needed on this issue. farcaster-dc: chetansk
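The mapping described above can be written as a tiny dispatcher. The state names follow the issue text ("merge", "delete", "prune", "revoke"); the actual enum names in hub events may differ, so treat this as a sketch of the assumed mapping rather than the hub's API.

```typescript
// Dispatch an incoming event state to the local action the issue
// describes: merge adds, everything else removes. State names are the
// ones used in this thread, not necessarily the hub's enum values.

type EventState = "merge" | "delete" | "prune" | "revoke";

function actionFor(state: EventState): "add" | "remove" {
  return state === "merge" ? "add" : "remove";
}
```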
We are live-streaming all the events to Kafka, which is where we noticed this; you can reproduce by streaming with Shuttle.
This is happening for a lot of FIDs: delete events are coming, but it seems no actual deletion happened.