Open kegsay opened 2 years ago
So the place where `InsertMembership` is called is in `membershipUpdaterTxn`, which is called just after `targetUserNID, err = d.assignStateKeyNID(ctx, txn, targetUserID)` when `NewMembershipUpdater` is made. However, `assignStateKeyNID` hits a cache first:
```go
func (d *Database) assignStateKeyNID(
	ctx context.Context, txn *sql.Tx, eventStateKey string,
) (types.EventStateKeyNID, error) {
	if eventStateKeyNID, ok := d.Cache.GetRoomServerStateKeyNID(eventStateKey); ok {
		return eventStateKeyNID, nil
	}
	// Check if we already have a numeric ID in the database.
	eventStateKeyNID, err := d.EventStateKeysTable.SelectEventStateKeyNID(ctx, txn, eventStateKey)
	if err == sql.ErrNoRows {
		// We don't have a numeric ID so insert one into the database.
		eventStateKeyNID, err = d.EventStateKeysTable.InsertEventStateKeyNID(ctx, txn, eventStateKey)
		if err == sql.ErrNoRows {
			// We raced with another insert so run the select again.
			eventStateKeyNID, err = d.EventStateKeysTable.SelectEventStateKeyNID(ctx, txn, eventStateKey)
		}
	}
	if err == nil {
		d.Cache.StoreRoomServerStateKeyNID(eventStateKey, eventStateKeyNID)
	}
	return eventStateKeyNID, err
}
```
If it were possible for a state key NID to be in the cache but not the database, that would cause this error. Given this is a rare condition to hit, I wouldn't be surprised if the race logic here doesn't work as intended. If that were the case then the flow could be:
- 8 bails out (possibly due to conflicts from 7 or some other reason)
- 7 continues to success.
- 8, which isn't in the database.
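
To make the suspected failure mode concrete, here is a minimal in-memory sketch in Go. It is not Dendrite code: `store`, `txn`, `assignNID` and `orphaned` are hypothetical names, and it assumes that "bails out" means the enclosing transaction is rolled back after the NID has already been written to the cache. Under that assumption, the cache keeps an NID that never reaches the table, and later callers get it back without touching the database.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a toy stand-in for the state keys table plus the process-wide cache.
// All names are hypothetical and only illustrate the suspected flow.
type store struct {
	committed map[string]int64 // rows visible after a commit
	cache     map[string]int64 // not transactional: survives rollbacks
	nextNID   int64
}

func newStore() *store {
	return &store{committed: map[string]int64{}, cache: map[string]int64{}, nextNID: 1}
}

// txn buffers inserts until commit is called.
type txn struct {
	s       *store
	pending map[string]int64
}

func (s *store) begin() *txn { return &txn{s: s, pending: map[string]int64{}} }

// assignNID follows the same shape as the function above:
// cache lookup, then select, then insert, then store in the cache right away.
func (t *txn) assignNID(key string) int64 {
	if nid, ok := t.s.cache[key]; ok {
		return nid // cache hit: the table is never consulted
	}
	nid, ok := t.s.committed[key]
	if !ok {
		nid, ok = t.pending[key]
	}
	if !ok {
		nid = t.s.nextNID
		t.s.nextNID++
		t.pending[key] = nid
	}
	t.s.cache[key] = nid // cached before the transaction has committed
	return nid
}

func (t *txn) commit() {
	for k, v := range t.pending {
		t.s.committed[k] = v
	}
}

func (t *txn) rollback() { t.pending = map[string]int64{} }

// orphaned lists keys that are cached but have no committed row.
func (s *store) orphaned() []string {
	var keys []string
	for k := range s.cache {
		if _, ok := s.committed[k]; !ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := newStore()

	t1 := s.begin()
	nid := t1.assignNID("@alice:example.org")
	t1.rollback() // the insert is discarded, the cache entry is not

	t2 := s.begin()
	again := t2.assignNID("@alice:example.org") // served from the cache
	t2.commit()                                 // nothing to commit for this key

	fmt.Println(nid == again, s.orphaned())
}
```

In this toy model the second transaction commits successfully while relying on an NID that has no row behind it, which is the kind of state the error message describes.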
`error="found 33449 users but only have state key nids for 33447 of them"`
Caused by `roomserver` code in storage: `JoinedUsersSetInRooms`.

It shouldn't be possible to have NIDs in the membership table but not in the state keys table, but apparently it is.
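
As a rough illustration of the mismatch behind that error, the sketch below finds NIDs that appear in a membership list but have no entry in a state keys map. It is a hypothetical helper over plain Go values, not the actual `JoinedUsersSetInRooms` implementation or its queries; `missingStateKeyNIDs` and the sample data are made up for the example.

```go
package main

import "fmt"

// missingStateKeyNIDs returns the membership NIDs that have no matching
// state key entry. Both inputs are stand-ins for the two tables.
func missingStateKeyNIDs(membershipNIDs []int64, stateKeys map[int64]string) []int64 {
	var missing []int64
	for _, nid := range membershipNIDs {
		if _, ok := stateKeys[nid]; !ok {
			missing = append(missing, nid)
		}
	}
	return missing
}

func main() {
	membership := []int64{1, 2, 3, 4}
	stateKeys := map[int64]string{1: "@a:example.org", 3: "@c:example.org"}

	missing := missingStateKeyNIDs(membership, stateKeys)
	fmt.Printf("found %d users but only have state key nids for %d of them: missing %v\n",
		len(membership), len(membership)-len(missing), missing)
}
```

Listing the missing NIDs, rather than only counting them, would show which membership rows point at state keys that were never persisted.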