matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.82k stars 2.13k forks source link

User Directory leaks Per-room Nicknames and Avatars #5677

Open reivilibre opened 5 years ago

reivilibre commented 5 years ago

Update: October 2021

This issue has been resolved for a homeserver's local users.

We still need to address leaking per-room nicknames and avatars for remote users. This is complicated as we do not have an easy, obvious way to retrieve or keep up-to-date the public profile metadata for remote users.


Description

The User Directory leaks display names and avatars for a user that are sent in only one room. For example, by manually crafting a m.room.member state event – or recently using the /myroomnick command in Riot/Web, even if the state event is sent in a private room.

Steps to reproduce

Expected Behaviour

alice's original display name should be shown in the user search.

Implications

This has privacy implications – a nickname set in a private room with a close friend may be quite personal and perhaps embarrassing if seen by other users.

Version information

If not matrix.org:

not really relevant, I suspect:

reivilibre commented 5 years ago

Note that I have not tested this interaction over federation – I suspect that only servers receiving the m.room.member state event will update their user directories with the private nickname, but have not tested this.

Mikaela commented 3 years ago

Has this issue or it's implications been thought about in further detail than discussed in this issue? I recently blogged about inconsistencies with Element Web/iOS and Matrix/Synapse privacy issues and this has been the most surprising to many people who have commented on it, and these scenarios have been offered to me:

Matrix is so widely used that the possibilities for harm this issue may cause are nearly endless and I hope it's being worked on more than two deprecated labels from 2019 hint.

callahad commented 3 years ago

Hi @Mikaela, thank you for the thoughtful comment and thorough blog post.

This issue appears to have fallen through the cracks, in part because while the current behavior seems clearly undesirable, it's not entirely obvious what an objectively "right" behavior would be (in sufficiently rigorous detail to actually implement it).

We'll pick this up in triage and discuss later this week.

two deprecated labels from 2019

Element recently brought on @kittykat to help us figure out sustainable ways to manage our backlog and ensure that things like this don't slip through the cracks. For the moment, the "deprecated" labels are remnants of a previous, ad hoc system of triage. I didn't want to immediately delete those labels in case there was still some signal left in the noise, so they were all renamed to z-whatever and left in place.

Our current labeling / triage system is described in https://github.com/matrix-org/synapse/issues/9460, and we're slowly re-triaging our way through the backlog ordered by most recently updated. However, this system may yet change in response to Kat's research and recommendations.

reivilibre commented 3 years ago

In the user directory background process, we could ignore state events that occur in private rooms.

However, we still need some user-directory entry for users who are only in private rooms (because they will still show up for users who share a private room with them). In this case, we may need to fall back to querying their profile (over federation).

DMRobertson commented 3 years ago

Spent a lot of time reading through source code this afternoon. I think I've found the background processing that @reivilibre mentioned, and I think there are two code paths where synapse is going to update entries in user_directory_search without considering whether these rooms are public.

I need to understand the relationship between the user_directory_search and user_directory tables to make further progress.

More generally I'm a bit frazzled by the different combinations. Could do with some help working out what the desired behaviour should actually be.

I think there are three search terms to consider:

I say "current" because I don't think we want the search to take into account historical names.

The different circumstances that contribute to whether or not someone should be searchable:

reivilibre commented 3 years ago

@DMRobertson I think this might become a little bit clearer if you think of per-room nicknames as being an accident. The usual way to set nicknames/avatars is to call some endpoint:

and the server broadcasts the state events across all the users' rooms (as well as updating their 'profile'). If you think of it this way, it makes sense that the user directory updater will encounter any displayname update (state event) and believe that it's their profile displayname (remember, a big reason we have to do it this way is because we receive state events over federation too). (and finally, we can't just ignore state events in private rooms because we should show users that our users share a room with).

However, eventually people started sending state events manually to some (not all) rooms to have per-room nicknames.

I think I would advise you to focus on these two cases (search terms):

(the search logic can stay the same -- this information is cached in the tables you noted. You shouldn't have to get wound up in 'The different circumstances that contribute to whether or not someone should be searchable' since that already is correct as far as I know.)

and forget about this case:

I think it is fair enough for a user's 'main'/profile display name (as the one shown in the top-left of Element Web) to be the only one they can be found under in the search directory. At the very least, it would be better than the current situation today.

The strategy that I'm thinking of is that we should update the user directory when receiving a membership state event, but instead of trusting the information in the state event, we go and query the relevant profile.

extra thoughts/questions:

DMRobertson commented 3 years ago

Additional context on per-room names: matrix-org/matrix-doc#545 matrix-org/matrix-doc#3189

DMRobertson commented 3 years ago

Thanks @reivilibre, that's very helpful.

I think I would advise you to focus on [...] and forget about this case:

  • someone's current displayname specific to a room — maybe this would be nice, but it's difficult to do within the current structure and also perhaps odd UX-wise

Your note about difficulty is persuasive! I also note that there seems to be discussion onhow nicknames and profiles "properly".

I think it is fair enough for a user's 'main'/profile display name (as the one shown in the top-left of Element Web) to be the only one they can be found under in the search directory. At the very least, it would be better than the current situation today.

An observation: the name user_directory suggests a static, public list of users, but search suggests a personalised search. We end up with a bit of a hybrid where the results depend on the searcher (the rooms they're in) but return the public information only.

reivilibre commented 3 years ago

An observation: the name user_directory suggests a static, public list of users, but search suggests a personalised search.

That's true, though in the end, when you actually go and click invite on that user, the name you'll see in your room will be their profile/'public' one.

Perhaps it's worth mentioning that the user directory is what's accessed through the 'invite' screen in Element. I think it's kind of fair enough to treat this as a 'user directory' (static) search rather than it being personalised to a room you're in — the bit about only showing users you share a room with (unless they're in a public room) is really just for privacy.

I don't know how other software handles this situation (for comparison), but I'm fairly happy with the fact that this is already kind of edge-casey and that it's pretty sensible.

DMRobertson commented 3 years ago

Plan of attack here: https://gist.github.com/DMRobertson/b63d7caa1c7c1e39d88d98f27b233bc2

Edit: and part 2 for the server side here: https://gist.github.com/DMRobertson/2a6b2a0453daec29831d0e99156e1e3f

MadLittleMods commented 3 years ago

Same issue tracked in Element-web https://github.com/vector-im/element-web/issues/5123

DMRobertson commented 3 years ago

An update: I've committed several changes regarding the user directory: some refactors, others fixes.

I think it's now the case that local users' per-room profiles shouldn't be leaked to the directory. The particular changes here are:

Doing this for remote users is more involved, because we don't have easy access to remote users' public profiles. Doing this properly will involve further thinking. Synapse already has a remote profile cache system (though not one we seem to use in the directory?) so maybe we don't have to fetch these from scratch over federation.

callahad commented 3 years ago

I've edited the first comment to include a summary of remaining work (figuring out how to handle remote users), but as this turned out to be significantly messier than expected, we're going to unassign it and give @DMRobertson a bit of a breather. It will remain on our radar; if we've not revisited it by mid-January, please yell.

dngray commented 2 years ago

Search for 'Freddy' or 'alice' — @alice:example.org will be listed with the name 'Freddy'

  • Note: this assumes that alice is visible to bob in the user directory – i.e. alice is in a public room known to the homeserver AND/OR alice and bob share a private room together.

I've tried to reproduce this but cannot, didn't seem to matter of alice and bob were on the same home server or not.

I am confused, am I doing this wrong?

reivilibre commented 2 years ago

@dngray

I think you need to arrange your servers in this way:

In a DM between Alice and Bob, make Alice use /myroomnick. Charlie will now see that in the user directory / be able to search by that other name. (Of course, Alice would need to be in a public room visible to homeserver B first.)

(The issue is that server B doesn't know it's a 'private' name. Server A does though!)

ShadowJonathan commented 2 years ago

I've opened https://github.com/matrix-org/complement/issues/284 to create tests to counter this behaviour in the future, if anyone has any wishes, pointers, or specific behaviours they'd like to have tested, please comment so.

gpcf commented 2 years ago

What is the state of this? Honestly, leaking remote user names and avatars from private chats to the server directory is a massive violation of privacy, and keeping this issue open for three years is pretty much unacceptable for a chat protocol that brands itself around privacy. In the three years since this issue appeared, the very least that could have been done was to add a warning that this happens when you use the /roomnick command, since that command is often used to set an identity you do not want to be publicly known and expect to be kept private. IMHO if checking public profile data of remote users is "too hard", then maybe turning off that directory "feature" might be a better option.

Edit: So if I'm reading this right, the proposed solution for this is to validate private name changes with some public profile? How about not harvesting display names from private chats in the first place, as any /roomnick user would expect matrix to do? Also, the local "fix" of this issue makes it even worse, IMHO, since it makes it harder to see what is actually being shared.

reivilibre commented 1 year ago

the proposed solution for this is to validate private name changes with some public profile?

the proposed solution is to fetch name changes from the public profile over federation, only using private name changes as 'hints' that the profile may have changed.

How about not harvesting display names from private chats in the first place, as any /roomnick user would expect matrix to do?

this leads to a problem whereby the user doesn't have a name in the user directory and therefore you can't search for them. It's a bit unfortunate. Note that the proposed solution doesn't really 'harvest' the display names, it just uses the state change as a signal to refresh.

Also, the local "fix" of this issue makes it even worse, IMHO, since it makes it harder to see what is actually being shared.

I'm afraid this doesn't really follow. It was always hard to see what is actually being shared — unless you knew the mechanism inside-out and were extra careful. (Note well that the local view and a view on another server could have always differed before as well, for reasons which are fiddly to explain. At least the fix actually fixed part of the problem and got us closer to the solution!)

I don't know if /myroomnick is actually documented anywhere or whether it's widely-used — but as a kind of explanation, I'm pretty sure it was added as a bit of fun after people were messing around with m.room.member manually. The user directory just wasn't designed for that. (or so I gather … I wasn't around when all of this stuff was put together!)

It's not all bad news though; I've spent most of today knocking together a couple of PRs (#14755, #14756) which get most of the way to fixing this issue properly for remote users. The last remaining problem is doing something about the initial rebuild of the user directory. I don't think it's technically difficult but I didn't get enough time to do that part today.