element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0
11.02k stars 1.96k forks source link

Message previews should format newlines as whitespace #16169

Closed HarHarLinks closed 2 years ago

HarHarLinks commented 3 years ago

Description

Reply previews are not formatted correctly (missing whitespace).

Steps to reproduce

Logs being sent: no

This is the preview I noticed. preview This is the corresponding message: message

Note that the message itself also seems messed up.

Version information

For the desktop app:

t3chguy commented 3 years ago

Can you share the event source for that edit? I have a feeling it was sent from Element Android which has a bug with how it sends edits. (Actually, I'm fairly certain of it)

HarHarLinks commented 3 years ago

I linked the event in question, which is in the public TWIM room. Anyway here's the source:

{
  "content": {
    "body": "* Thank you. They have all just popped up in AntennaPod.",
    "m.new_content": {
      "body": "> <@benpa:bpulse.org> yes. on list for tomorrow\n\nThank you. They have all just popped up in AntennaPod.",
      "format": "org.matrix.custom.html",
      "formatted_body": "<mx-reply><blockquote><a href=\"https://matrix.to/#/!xYvNcQPhnkrdUmYczI:matrix.org/$jW8QqTQEBPeqyVjPEflZH6JJFozceaVBNM2evZiCrKY?via=elsmussols.net&via=matrix.org&via=privacytools.io\">In reply to</a> <a href=\"https://matrix.to/#/@benpa:bpulse.org\">benpa</a><br />yes. on list for tomorrow</blockquote></mx-reply>Thank you. They have all just popped up in AntennaPod.",
      "msgtype": "m.text",
      "m.relates_to": {
        "m.in_reply_to": {
          "event_id": "$jW8QqTQEBPeqyVjPEflZH6JJFozceaVBNM2evZiCrKY"
        }
      }
    },
    "m.relates_to": {
      "event_id": "$qMjpr-vrYgdW-5beQS6EMt9tud6FDFGRgPxX6aw1Q7M",
      "rel_type": "m.replace"
    },
    "msgtype": "m.text"
  },
  "origin_server_ts": 1610749814588,
  "room_id": "!xYvNcQPhnkrdUmYczI:matrix.org",
  "sender": "@elmussol:elsmussols.net",
  "type": "m.room.message",
  "unsigned": {
    "age": 428635036
  },
  "event_id": "$GOpDh2r2WZk9pevPxI92L25DWPfg6N9fgydMr-D76fc",
  "user_id": "@elmussol:elsmussols.net",
  "age": 428635036
}

I have seen the issue of the badly formatted reply before in relation with element android, however this is the first time I noticed the badly formatted message preview in the room list. You will notice the body contains a space before the yes, and the formatted_body uses a <br \> in the same place. The preview should either break the line in the same place or (in my opinion better) replace the line break by a space for the preview only.

t3chguy commented 3 years ago

The preview will be using the formatted_body which has a <br /> which the Previews don't show but should probably turn them into at least a whitespace

valkum commented 3 years ago

The preview in the roomlist should probably ignore the content of <mx-reply> and show the actual reply, as that is the more important information.

t3chguy commented 3 years ago

It would if the reply was spec conforming. The fallback must be stripped from edit

valkum commented 3 years ago

I am not sure what you mean. https://github.com/uhoreg/matrix-doc/blob/aggregations-edits/proposals/2676-message-editing.md states that the fallback body in the outer object is there for a reason.

If I understand correctly, what I described should happen: https://github.com/matrix-org/matrix-react-sdk/blob/develop/src/stores/room-list/previews/MessageEventPreview.ts#L50. My understanding is that this code path should be taken, but it isn't in the screenshot.

t3chguy commented 3 years ago

m.new_content's body and formatted_body should NOT contain Reply Fallbacks.

As per the link you sent:

What happens when we edit a reply?

We just send an m.replace which refers to the m.reference target; nothing special is needed. i.e. you cannot change who the event is replying to.

The edited reply should ditch the fallback representation of the reply itself however from m.new_content (specifically the tag in the HTML, and the chevron prefixed text in the plaintext which we don't know whether to parse as we don't know whether this is a reply or not), as we can assume that any client which can handle edits can also display replies natively.

That last paragraph is not being done correctly in the client which was used to edit the event, creating this mess.

valkum commented 3 years ago

Thanks for pointing me to the correct part of the proposal. As a result I ask myself why synapse accepts such msgs in the first place. But that does not answer why the codepath for the preview is not taken here.

ReplyThread.stripHTMLReply(body) should be called (all conditions are met) and this should strip the mx-reply tag.

t3chguy commented 3 years ago

Synapse can't govern everything because this could equally be an encrypted room and thus it can't read it.

ReplyThread.stripHTMLReply(body) should be called (all conditions are met) and this should strip the mx-reply tag.

That would work for formatted messages but if it was a plaintext message there's no way to know if the > is intentional or part of a broken client's edit.

The sending client should just be fixed.

all conditions are met

Not quite, it checks body and formatted_body at the root not the ones inside m.new_content

valkum commented 3 years ago

Yes the sending client should be fixed. BUT. You cannot enforce clients in an open spec that allow such malformed events in the DAG. And thus the display part in element should also be fixed. As you can see in the screenshots, the formatted_body is displayed in the preview.

The sending client should just be fixed.

all conditions are met

Not quite, it checks body and formatted_body at the root not the ones inside m.new_content

eventContent is set the new_content here https://github.com/matrix-org/matrix-react-sdk/blob/3115c4f3f3ed0b2f1d82c5882dcce007b188c206/src/stores/room-list/previews/MessageEventPreview.ts#L31

But I think this discussion drifts from the original issue mentioned by HarHarLinks

t3chguy commented 3 years ago

The thing you are missing is the m.in_reply_to is inside the m.new_content (INVALID)

The code you linked, lower down checks only whether the root contains m.in_reply_to: https://github.com/matrix-org/matrix-react-sdk/blob/3115c4f3f3ed0b2f1d82c5882dcce007b188c206/src/stores/room-list/previews/MessageEventPreview.ts#L46-L47

And thus the display part in element should also be fixed.

There is no way to fix every other clients mistakes, handling some just masks the problem with that other client, making it harder to get fixed and also just yields less maintainable code.

valkum commented 3 years ago

The thing you are missing is the m.in_reply_to is inside the m.new_content (INVALID)

The code you linked, lower down checks only whether the root contains m.in_reply_to: https://github.com/matrix-org/matrix-react-sdk/blob/3115c4f3f3ed0b2f1d82c5882dcce007b188c206/src/stores/room-list/previews/MessageEventPreview.ts#L46-L47

What are the issues with using event['m.relates_to'] instead of event.getWireContent()['m.relates_to']? Why not strip the mx-reply tag from the formatted body every time, when previewing.

And thus the display part in element should also be fixed.

There is no way to fix every other clients mistakes, handling some just masks the problem with that other client, making it harder to get fixed and also just yields less maintainable code.

A malicious actor can change codepaths taken for sanitization by injecting a slightly malformed event into the DAG. I would consider stripping mx-reply from previews in every case, the more maintainable code.

t3chguy commented 3 years ago

What are the issues with using event['m.relates_to'] instead of event.getWireContent()['m.relates_to']?

non spec conforming.

Why not strip the mx-reply tag from the formatted body every time, when previewing.

Inconsistency with the ability to do the same for plaintext events.

I would consider stripping mx-reply from previews in every case, the more maintainable code.

I would prefer removing the fallbacks altogether as they are horrendous, for more maintainable code.

This talk of camouflaging issues with other clients, in the context of bad actors, is moot. If a bad actor wants to make their message look like shit they have many avenues to do that, What is the point of tying up 1/1,000,000 of them.

The current behaviour (following the spec) leaves more flexibility in the future for the spec to create backwards-compatible extensions to the current stuff, whereas always stripping mx-reply which might gain different connotations under different contexts might then cause undesirable effects.

HarHarLinks commented 3 years ago

image image