Automattic / simplenote-electron

Simplenote for Web, Windows, and Linux
https://app.simplenote.com
GNU General Public License v2.0

Simple Note doesn't inline an image (that is publicly accessible and hosted on drive.google.com) #3176

Closed emacksnotes closed 5 months ago

emacksnotes commented 5 months ago

Expected

Simple Note should inline a publicly accessible image hosted on drive.google.com.

![(img) Simple Note-A](https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34)

![(img) Simple Note-B](https://drive.usercontent.google.com/download?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34&authuser=0)

Observed

Simplenote doesn't inline the image, either in my Electron app (see the screenshot below) or on the published URL (see http://simp.ly/p/mn7Hym).

A remotely hosted, publicly accessible image (typeset as a Markdown inline image) is not rendered.

Simple Note-Inline Images-Screenshot-2024-01-30_11-33

Hypothesis / Suggestions

Running wget on https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 reports that the content is of type image/png, and an anonymous download of the image works just fine. So I believe Simplenote merely has to review the heuristics it uses to decide what qualifies as an inline image.

I hypothesize that Simplenote could be matching the URL against a regexp for png, jpeg, etc. Such a heuristic wouldn't work for the drive.google.com URL.
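To make the hypothesis concrete, here is a minimal sketch of what an extension-based check might look like. It is purely illustrative: `IMAGE_EXT` and `looks_like_image_by_extension` are made-up names, and nothing here is taken from Simplenote's code.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic (an assumption, not Simplenote's actual logic):
# accept a URL as an image only if its path ends in a known image extension.
IMAGE_EXT = re.compile(r"\.(png|jpe?g|gif|webp|svg)$", re.IGNORECASE)


def looks_like_image_by_extension(url: str) -> bool:
    """Return True if the URL path ends with a common image extension."""
    return bool(IMAGE_EXT.search(urlparse(url).path))


# A conventional image URL passes the check.
print(looks_like_image_by_extension("https://example.com/cat.png"))  # True

# The Drive URL has the path "/uc" and carries the file ID in the query
# string, so a check like this would reject it.
print(looks_like_image_by_extension(
    "https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34"
))  # False
```

If a check like this were in place, it would explain the behaviour, which is why the suggestion is to trust the Markdown image syntax instead.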

I am no web expert, but if the original note is in Markdown format, and the note author uses the

![img](image url)

syntax, then Simplenote should assume that the image URL is indeed an image URL (without looking for png, jpeg, etc. in the pattern), and check with the server hosting the URL whether the content really has an image MIME type.

$ wget https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34

--2024-01-30 10:59:44--  https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34
Resolving drive.google.com (drive.google.com)... 2404:6800:4007:806::200e, 142.250.76.46
Connecting to drive.google.com (drive.google.com)|2404:6800:4007:806::200e|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://drive.usercontent.google.com/download?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 [following]
--2024-01-30 10:59:45--  https://drive.usercontent.google.com/download?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34
Resolving drive.usercontent.google.com (drive.usercontent.google.com)... 2404:6800:4007:823::2001, 142.250.195.129
Connecting to drive.usercontent.google.com (drive.usercontent.google.com)|2404:6800:4007:823::2001|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27373 (27K) [image/png]
Saving to: ‘uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34’

uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2- 100%[======================================================================>]  26.73K  --.-KB/s    in 0.07s   

2024-01-30 10:59:46 (369 KB/s) - ‘uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34’ saved [27373/27373]

$
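The MIME-type check suggested above could be sketched roughly as follows, using only the Python standard library. The function names are hypothetical, and whether an app should make such a request at render time is a separate question; this only shows the mechanics of reading the Content-Type that wget reported.

```python
import urllib.request


def is_image_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value names an image type."""
    # Drop parameters such as "; charset=..." and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("image/")


def url_serves_image(url: str, timeout: float = 10.0) -> bool:
    """Fetch the URL and inspect the Content-Type of the final response.

    urlopen follows HTTP redirects by default, so a 303 like the one in
    the wget transcript is followed before the header is read. The body
    is not read here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return is_image_content_type(response.headers.get("Content-Type", ""))


# The header value wget reported for the Drive file.
print(is_image_content_type("image/png"))  # True
print(is_image_content_type("text/html; charset=utf-8"))  # False
```

Calling `url_serves_image(...)` needs network access, and the result depends on how the remote server treats the request, so no output is shown for it.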


emacksnotes commented 5 months ago

As an additional note, I understand that Simplenote doesn't allow hosting of images on its servers.

So this bug is about the heuristics that you use to decide what counts as an image.

It is NOT about Simplenote reviewing or revising its policy on inline images, but about making it easier for people to host images on remote servers and have them rendered in their notes.

Just for added context: I would like to share notes with my friends and family, and I would like to include images such as screenshots in these notes. I would like to see inline images in the published (web-accessible) notes first, and then in the Electron app.

codebykat commented 5 months ago

Hi @emacksnotes! Thank you for this beautifully detailed report.

I agree that this should work, and we in fact recommend that folks do this to embed images in their notes (since we don't allow image attachments).

To your point about checking the headers for the file type, we aren't actually loading the file at any point in the process, but generating the HTML on the published page that renders the file from a third-party source.

Looking at the published note you shared, I can see that we are not in fact stripping the URL out. If you view the source, you can see that it has an <img> tag with the information you provided:

Screenshot 2024-02-01 at 20 29 54
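As a rough illustration of the point above (the page only emits an image tag, and the browser then fetches the file from the third-party host), here is a simplified sketch of turning Markdown image syntax into HTML. It is not Simplenote's renderer; `MD_IMAGE` and `render_images` are hypothetical names, and the regular expression handles only the simplest form of the syntax.

```python
import html
import re

# Simplified pattern for ![alt](url); real Markdown parsers handle many
# more cases (titles, nested brackets, escapes).
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def render_images(markdown: str) -> str:
    """Replace Markdown image syntax with <img> tags, without fetching anything."""
    def to_img(match: "re.Match[str]") -> str:
        alt, src = match.group(1), match.group(2)
        return '<img src="{}" alt="{}">'.format(
            html.escape(src, quote=True), html.escape(alt, quote=True)
        )

    return MD_IMAGE.sub(to_img, markdown)


print(render_images(
    "![(img)](https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34)"
))
# <img src="https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34" alt="(img)">
```

Nothing in this step contacts drive.google.com, so a 403 can only surface later, when the reader's browser requests the `src` URL.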

This link, however, is returning a 403 Forbidden status from Google (you can see this on the Network tab of the browser console), which tends to mean that Google's servers are configured to prevent "hotlinking" of files. I don't use Drive much myself, and from poking around a little, couldn't see a way to get a different URL out of it, but it's possible you can get it to provide a "share URL" or "embed URL" or suchlike which would work with a third-party service.

It looks like this might be a recent Google change, per this link: https://stackoverflow.com/questions/77803187/having-trouble-displaying-an-image-from-google-drive

For now it seems the best solution would be to find somewhere else to host those images! I'm closing this issue as I don't think there's anything we can do about it, sadly. If a site isn't configured to allow cross-site image loading, it's not possible to circumvent it.

emacksnotes commented 5 months ago

It looks like this might be a recent Google change, per this link: https://stackoverflow.com/questions/77803187/having-trouble-displaying-an-image-from-google-drive

Thanks for the link.

I agree that it is NOT a simplenote issue.

It seems related to

(I have no understanding of web technologies. What I say above is my own understanding of what the StackOverflow link above and the users on the Google issue below say.)

I confirm what https://stackoverflow.com/a/77805668 says

Though not published anywhere (that I've found), Google Drive's servers have begun rejecting requests where these two headers are attached:

sec-fetch-mode: no-cors
sec-fetch-site: cross-site

If you navigate your browser directly to the (direct link) URL of the file, (i.e. take your src URL and just paste it into the browser nav bar), the sec-fetch-mode header during that request will be set to navigate and it works just fine.

BUT, as with your <img> example, if the source of the request is a web page, and which is not the same origin as drive.google.com, hence the problem. (Your browser will automatically set those sec-fetch-mode and sec-fetch-site headers as part of the request.)

This appears to be an undocumented change by Google Drive that started on 2024-01-10 and I still can't find any mention of it anywhere, so it's unknown if this behavior will persist or if it reflects an accidental change or oversight.

Browsing directly to https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 results in 303 and works

303-response-2024-02-02_09-43

GET /uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 HTTP/2
Host: drive.google.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Pragma: no-cache
Cache-Control: no-cache
TE: trailers

Visiting a Simplenote page that inlines https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 results in 403

403-response-2024-02-02_10-37

GET /uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 HTTP/2
Host: drive.google.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0
Accept: image/avif,image/webp,*/*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://app.simplenote.com/
DNT: 1
Connection: keep-alive
Sec-Fetch-Dest: image
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: cross-site
Pragma: no-cache
Cache-Control: no-cache
TE: trailers
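The difference between the two captured requests can be checked mechanically. The sketch below parses abbreviated copies of the two header dumps above and tests for the header combination that the StackOverflow answer says Drive rejects. The helper names are hypothetical, and the rejection rule is an observation reported by users, not documented Google behaviour.

```python
def parse_headers(raw: str) -> dict:
    """Parse a raw request dump into a dict of lower-cased header names."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:  # skip the request line
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def is_cross_site_no_cors(headers: dict) -> bool:
    """True for the header pair reported to trigger the 403 (an assumption)."""
    return (
        headers.get("sec-fetch-mode") == "no-cors"
        and headers.get("sec-fetch-site") == "cross-site"
    )


# Abbreviated copies of the two captures above (Sec-Fetch-* headers only).
DIRECT = """GET /uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 HTTP/2
Host: drive.google.com
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
"""

EMBEDDED = """GET /uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34 HTTP/2
Host: drive.google.com
Referer: https://app.simplenote.com/
Sec-Fetch-Dest: image
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: cross-site
"""

print(is_cross_site_no_cors(parse_headers(DIRECT)))    # False (the 303 case)
print(is_cross_site_no_cors(parse_headers(EMBEDDED)))  # True (the 403 case)
```

The browser sets these Sec-Fetch-* headers itself, so neither the note author nor the page embedding the image can change them.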

A workaround for inlining an image hosted on drive.google.com in a Simplenote page

See http://simp.ly/p/5N17yV

To inline an image hosted on drive.google.com, don't do this

![(img)](https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34)

instead do this

![(img)](https://drive.google.com/thumbnail?sz=w5000&id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34)

or this

![(img)](https://lh3.googleusercontent.com/d/1wQISzfgY280eMmPUKqnbvhXGhRU2-w34?authuser=0)

This is based on the workaround suggested in "Having trouble displaying an image from Google Drive" on StackOverflow. See "403 Forbidden for https://drive.google.com/uc?export=download&id= [319531488]" in the Google Issue Tracker for more information.
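For anyone with many notes to update, the rewrite from the `uc?id=` form to the two workaround forms can be scripted. This is a small sketch with hypothetical helper names; the URL patterns are copied from the examples above, are not documented by Google, and may stop working.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def drive_file_id(url: str) -> Optional[str]:
    """Extract the file ID from a https://drive.google.com/uc?id=... URL."""
    parsed = urlparse(url)
    if parsed.netloc != "drive.google.com" or parsed.path != "/uc":
        return None
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def thumbnail_url(file_id: str, width: int = 5000) -> str:
    """Build the thumbnail-style URL used in the first workaround."""
    query = urlencode({"sz": "w{}".format(width), "id": file_id})
    return "https://drive.google.com/thumbnail?" + query


def lh3_url(file_id: str) -> str:
    """Build the lh3.googleusercontent.com URL used in the second workaround."""
    return "https://lh3.googleusercontent.com/d/{}?authuser=0".format(file_id)


file_id = drive_file_id(
    "https://drive.google.com/uc?id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34"
)
print(thumbnail_url(file_id))
# https://drive.google.com/thumbnail?sz=w5000&id=1wQISzfgY280eMmPUKqnbvhXGhRU2-w34
print(lh3_url(file_id))
# https://lh3.googleusercontent.com/d/1wQISzfgY280eMmPUKqnbvhXGhRU2-w34?authuser=0
```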