matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Preview of a link is showing "Internet Explorer is not supported" #15598

Open yanetix opened 1 year ago

yanetix commented 1 year ago

Description

URL previews seem to be fetched with an outdated user-agent string claiming to be IE. This can cause the URL preview to show a browser-update warning instead of a helpful preview.

Steps to reproduce

Send a message with a link (example: https://delft.notubiz.nl/). The Matrix server will fetch a preview. If the link points to a website that warns about outdated browsers, the preview shows that warning: "Gemeente Delft Ga direct naar: Navigatie Let op: U gebruikt Internet Explorer. Vanaf 31 maart 2021 wordt Internet Explorer 11 niet meer ondersteund als browser door dit portaal. Wij adviseren u om gebruik te maken van Microsoft Edge, Google Chrome of Mozilla FireFox."

translation: "Municipality of Delft Go directly to: Navigation Please note: You are using Internet Explorer. From 31 March 2021, Internet Explorer 11 will no longer be supported as a browser by this portal. We advise you to use Microsoft Edge, Google Chrome or Mozilla FireFox."

What did you expect?

A preview of the site that the link points to.

What happened instead?

The preview shows the website's warning about an outdated browser, which does not apply here.

Probable cause

I suspect that the user-agent string used to fetch the preview is incorrect and outdated. It would be better if Element used a user-agent string that is modern and matches the OS on which it is running (Element on Linux, not IE on Windows). When testing with a user-agent switcher tool (https://add0n.com/useragent-switcher.html), the website https://delft.notubiz.nl/ gives this warning when it is sent an IE user-agent string, so the warning depends on the user-agent string.

Homeserver

matrix.org

Synapse Version

"server_version":"1.83.0 (b=matrix-org-hotfixes,106fb7005d)","python_version":"3.8.12"

Installation Method

I don't know

Database

unknown

Workers

I don't know

Platform

unknown.

Configuration

No response

Relevant log output

Preview of https://delft.notubiz.nl/
"Gemeente Delft Ga direct naar: Navigatie Let op: U gebruikt Internet Explorer. Vanaf 31 maart 2021 wordt Internet Explorer 11 niet meer ondersteund als browser door dit portaal. Wij adviseren u om gebruik te maken van Microsoft"

Anything else that would be useful to know?

Not my server. Bug first reported in Element desktop: #961

clokep commented 1 year ago

URL previews seem to be fetched with an outdated user-agent string claiming to be IE. This can cause the URL preview to show a browser-update warning instead of a helpful preview.

This is an incorrect assumption: the user agent used by Synapse is "Synapse (bot; +https://github.com/matrix-org/synapse)".

The HTML source includes this message on every page and then uses JavaScript to hide it if the browser is "not IE". URL previews don't support JavaScript though, unfortunately.
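One way to check this independently is to request the page with the Synapse user agent and look for the warning text in the raw HTML, before any JavaScript runs. The following is a minimal sketch using only the Python standard library; the function names and the default marker text are hypothetical, and this is not how Synapse itself fetches pages.

```python
import urllib.request

# User agent string quoted above.
SYNAPSE_USER_AGENT = "Synapse (bot; +https://github.com/matrix-org/synapse)"


def build_preview_request(url: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the Synapse user agent."""
    return urllib.request.Request(url, headers={"User-Agent": SYNAPSE_USER_AGENT})


def banner_in_static_html(url: str, marker: str = "Internet Explorer") -> bool:
    """Return True if `marker` appears in the HTML as served, with no JavaScript applied.

    This performs a real network request, so it needs connectivity to run.
    """
    with urllib.request.urlopen(build_preview_request(url), timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return marker in body
```

If `banner_in_static_html("https://delft.notubiz.nl/")` returns `True` with this user agent, the warning is part of the static markup rather than a response to an IE user-agent string.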

The main issue is that they include this text as the very first element on the page, which is what Synapse uses as the description if no Open Graph metadata is found. The message is just a div with an ID of ByeByeIE. We do have a list of elements we ignore when choosing the description:

https://github.com/matrix-org/synapse/blob/ba6b21c81e67583ac850eab5d96fe5666620d614/synapse/media/preview_html.py#L357-L372

🤷 Ignoring an element with an ID of ByeByeIE seems reasonable, but awfully specific... supporting JavaScript of course would be another option, but sounds wildly dangerous.
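To illustrate the fallback described above, here is a rough sketch of "use Open Graph if present, otherwise the first visible text, skipping ignored elements". It is not Synapse's code: it uses the standard-library `html.parser` for self-containment, the tag list is illustrative, `pick_description` and `ignored_ids` are hypothetical names, the sample page is made up, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser
from typing import FrozenSet, Optional

# Illustrative list of tags whose text is skipped; NOT Synapse's actual list.
IGNORED_TAGS = {"head", "script", "style", "noscript", "header", "nav", "footer"}
# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DescriptionParser(HTMLParser):
    def __init__(self, ignored_ids: FrozenSet[str] = frozenset()) -> None:
        super().__init__()
        self.ignored_ids = ignored_ids
        self.og_description: Optional[str] = None
        self.first_text: Optional[str] = None
        self._depth = 0                          # nesting depth of open elements
        self._skip_depth: Optional[int] = None   # depth where an ignored subtree began

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta":
            # Open Graph metadata is read even inside the ignored <head>.
            if attr.get("property") == "og:description" and attr.get("content"):
                self.og_description = attr["content"]
            return
        if tag in VOID_TAGS:
            return
        self._depth += 1
        ignored = tag in IGNORED_TAGS or attr.get("id") in self.ignored_ids
        if self._skip_depth is None and ignored:
            self._skip_depth = self._depth

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth is not None and self._depth == self._skip_depth:
            self._skip_depth = None  # leaving the ignored subtree
        self._depth = max(0, self._depth - 1)

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and self._skip_depth is None and self.first_text is None:
            self.first_text = text


def pick_description(html: str, ignored_ids: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Prefer og:description; otherwise fall back to the first non-ignored text."""
    parser = DescriptionParser(ignored_ids)
    parser.feed(html)
    parser.close()
    return parser.og_description or parser.first_text


# Made-up page that mimics the structure described in this thread.
SAMPLE_PAGE = (
    "<html><head><title>Gemeente Delft</title></head><body>"
    '<div id="ByeByeIE">You are using Internet Explorer.</div>'
    "<main><p>Council meetings and agendas</p></main>"
    "</body></html>"
)
```

On the sample page, `pick_description(SAMPLE_PAGE)` picks up the banner text, while `pick_description(SAMPLE_PAGE, frozenset({"ByeByeIE"}))` skips it and returns the first paragraph instead. That shows why an ID-based ignore rule would work here, and also why it is so specific to this one site.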

yanetix commented 1 year ago

Thanks for investigating. Seems like an issue to be fixed by the website builder then. And I agree, supporting JavaScript in previews is a bad idea.

clokep commented 1 year ago

Seems like an issue to be fixed by the website builder then.

I think so. I'm unsure whether to close this or not -- it seems like a thing that other sites might do, but it is such an edge case that I can't realistically see us trying to fix it unless a lot of URLs have the same issue.