LibXMTP is a shared library encapsulating the core functionality of the XMTP messaging protocol, such as cryptography, networking, and language bindings.
So it's unavoidable that we'll have multiple MLS groups under the hood for the same messaging pair (example: race condition where I message you and you message me at the same time). What we need to make sure of is that even if we have multiple MLS groups under the hood, app developers and users will only see one consistent DM. These are our requirements:
When fetching messages from any of the MLS groups associated with a DM conversation, the SDK will respond with messages from all of the groups.
When sending messages on a DM conversation, all installations in that DM will eventually converge to sending them on the same MLS group, even if they originally start off using different ones.
Suggested approach
My suggestion is to add a high-level concept of a dm_id that is separate from the existing low-level concept of a group_id. The dm_id is defined as "dm:{lower_inbox_id}:{higher_inbox_id}".to_lower_case(), and would be a nullable field added to the groups and messages tables here (will need to create a DB migration). Note:
There is a 1:N relationship between dm_id and group_id.
There is a 1:1 relationship between a dm_id and a messaging pair - it's impossible for two people to have different dm_id's when DMing each other.
When starting a 1:1 conversation, we should first check for existing MLS groups matching that dm_id (requirement (2)), and if none are found, we would create a new MLS group with the dm_id field set.
Note 1: There is already an existing field on the groups table called dm_inbox_id. I'm suggesting the dm_id field instead because it is globally consistent across all participants in the DM, and the code for computing it is a little simpler. I'm also suggesting writing it to both the groups and messages table. But using the existing field for this is fine also.
Note 2: The dm_id is not privacy-preserving - it's a local-only identifier stored in the user's database, and shouldn't be used on the server anywhere.
For requirement (1)
For 1:1 conversations, apps will never see the group_id for a DM, they will only ever see the dm_id. In other words, in the platform SDK's conversation_id would be defined as dm_id.as_bytes() for DM's and group_id for groups.
When fetching messages for a DM, libxmtp would query the messages table using the dm_id, which will give messages from all of the MLS groups associated with a DM.
For requirement (2)
When sending a message on a DM, the SDK should use the MLS group that is the most recently used. I suggest adding a last_message_ns field to the groups table that is updated on every message send. Then, when sending a DM, find the group_id with the latest last_message_ns for the given dm_id. This is the safest approach for new installations that do not have the message history, and ensures that all installations eventually end up on the same MLS group, letting the older groups fade into obscurity.
Problem to solve
So it's unavoidable that we'll have multiple MLS groups under the hood for the same messaging pair (example: race condition where I message you and you message me at the same time). What we need to make sure of is that even if we have multiple MLS groups under the hood, app developers and users will only see one consistent DM. These are our requirements:
Suggested approach
My suggestion is to add a high-level concept of a
dm_id
that is separate from the existing low-level concept of agroup_id
. The dm_id is defined as"dm:{lower_inbox_id}:{higher_inbox_id}".to_lower_case()
, and would be a nullable field added to the groups and messages tables here (will need to create a DB migration). Note:dm_id
andgroup_id
.dm_id
and a messaging pair - it's impossible for two people to have differentdm_id
's when DMing each other.When starting a 1:1 conversation, we should first check for existing MLS groups matching that
dm_id
(requirement (2)), and if none are found, we would create a new MLS group with thedm_id
field set.Note 1: There is already an existing field on the groups table called
dm_inbox_id
. I'm suggesting thedm_id
field instead because it is globally consistent across all participants in the DM, and the code for computing it is a little simpler. I'm also suggesting writing it to both the groups and messages table. But using the existing field for this is fine also.Note 2: The
dm_id
is not privacy-preserving - it's a local-only identifier stored in the user's database, and shouldn't be used on the server anywhere.For requirement (1)
group_id
for a DM, they will only ever see thedm_id
. In other words, in the platform SDK'sconversation_id
would be defined asdm_id.as_bytes()
for DM's andgroup_id
for groups.dm_id
, which will give messages from all of the MLS groups associated with a DM.For requirement (2)
When sending a message on a DM, the SDK should use the MLS group that is the most recently used. I suggest adding a
last_message_ns
field to thegroups
table that is updated on every message send. Then, when sending a DM, find thegroup_id
with the latestlast_message_ns
for the givendm_id
. This is the safest approach for new installations that do not have the message history, and ensures that all installations eventually end up on the same MLS group, letting the older groups fade into obscurity.Feedback welcome!