quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Include image which links that are 302 redirect #4640

Open cderv opened 1 year ago

cderv commented 1 year ago

From https://community.rstudio.com/t/images-from-cloud-storage-direct-download-links-are-not-shown-in-quarto-html-document/160750 by @lnnrtwttkhn

Problem

Quarto HTML document does not display images from external direct download links to a publicly accessible cloud share provider (in this case Keeper, based on Seafile). The problem does not occur for e.g., the GitHub README.md file.

Demonstration

A demonstration of the problem can be found here: https://lennartwittkuhn.com/keeper-image/.

Quarto version

On my local machine:

$ quarto --version
1.2.313

Any hints appreciated! Thank you!

cderv commented 1 year ago

First investigation, as answered in the community forum:

The direct download link you used, https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1, is a link that redirects.

It seems that Pandoc won't follow the redirect and will do the same as a browser. If I paste the link into the browser, I get this page: https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1.png

The SVG file is correctly downloaded, too. It seems the GitHub Markdown preview follows the redirect, which is why it works there.

$ curl -I https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1
HTTP/1.1 302 Found
server: nginx
date: Mon, 06 Mar 2023 10:45:28 GMT
content-type: text/html; charset=utf-8
content-length: 0
location: https://keeper.mpdl.mpg.de/seafhttp/files/2e75692d-e929-4380-93ba-83c01c307c3f/KeeperLogo.svg
vary: Cookie, Accept-Language
content-language: en
set-cookie: sfcsrftoken=CML25R0X8h6r8UasyuBfNcxZCWmKSI8Ty6VUzBLRDcXx7iD95qBciQaPGvYjysGj; expires=Mon, 04 Mar 2024 10:45:28 GMT; Max-Age=31449600; Path=/; SameSite=Lax
strict-transport-security: max-age=31536000; includeSubdomains
set-cookie: SERVERID=app09-keeper.mpdl.mpg.de; path=/

Now, if I follow the redirect from the URL provided, I correctly get the SVG image:

$ curl -I -L https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1
HTTP/1.1 302 Found
server: nginx
date: Mon, 06 Mar 2023 10:49:24 GMT
content-type: text/html; charset=utf-8
content-length: 0
location: https://keeper.mpdl.mpg.de/seafhttp/files/b7bbab02-a94e-4fe2-82a5-7d2cb1e7b24c/KeeperLogo.svg
vary: Cookie, Accept-Language
content-language: en
set-cookie: sfcsrftoken=T2Pg90JHul53lqcSHaoVG7ylphXcQqu3jNYqlrbfRMmO4dYwL50MpawXjqMsis1m; expires=Mon, 04 Mar 2024 10:49:24 GMT; Max-Age=31449600; Path=/; SameSite=Lax
strict-transport-security: max-age=31536000; includeSubdomains
set-cookie: SERVERID=app07-keeper.mpdl.mpg.de; path=/

HTTP/1.1 200 OK
server: nginx
date: Mon, 06 Mar 2023 10:49:24 GMT
content-type: image/svg+xml
content-length: 6202
last-modified: Mon, 06 Mar 2023 10:49:24 GMT
cache-control: max-age=3600
access-control-allow-origin: *
content-security-policy: sandbox
content-disposition: attachment;filename*="utf-8' 'KeeperLogo.svg"
x-content-type-options: nosniff
strict-transport-security: max-age=31536000; includeSubdomains
set-cookie: SERVERID=app09-keeper.mpdl.mpg.de; path=/
cache-control: private

I don't know if and where we could detect the 302 redirect and replace the URL on the user's behalf.

Maybe this is something that should be done upstream in Pandoc?
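
For illustration only, here is a minimal sketch of what "detect the redirect and replace the URL" could look like outside of Quarto or Pandoc. It is not how either tool works internally. The function name `resolve_final_url` is hypothetical, and it relies only on Python's standard `urllib.request`, which follows HTTP redirects such as 302 by default.

```python
import urllib.request


def resolve_final_url(url: str, timeout: float = 10.0) -> str:
    """Return the URL reached after following any HTTP redirects from `url`.

    Hypothetical helper: it is not part of Quarto or Pandoc.
    """
    # urlopen() follows redirects (for example the 302 shown above) on its own.
    # The response body is never read here; only the final URL is inspected.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()


# Example call (needs network access, so it is left commented out):
# resolve_final_url("https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1")
```

Note that in the two `curl` runs above the `location` header differs between requests, so a resolved URL of this kind may be short-lived; writing it into the document would probably not be a durable fix.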

lnnrtwttkhn commented 1 year ago

Thanks @cderv for opening the issue here and looking into the problem.

Just one more aspect that I noticed (also posted here): In my example, the direct download link works if used in the context of an About page, i.e., the image is displayed when I specify the direct download link in the image field of the document's YAML header.

I've just updated the example website accordingly (see this commit for the exact changes).

So it seems like different download methods are used for information in the YAML header vs. in the main body of a Quarto document. Is this correct?

Thank you very much for your help!

lnnrtwttkhn commented 1 year ago

Just FYI: In the meantime, I found a workaround using the pre-render option of Quarto.

In short: I use wget in a Makefile to download the relevant image and include this in _quarto.yml.

Again, I've updated the example website accordingly and you can find the exact changes in this commit.
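
As a rough sketch of the same idea without wget or a Makefile: the script below downloads the image to a local file before rendering, so the document can reference the local copy instead of the redirecting link. This is a swapped-in alternative, not the setup from the commit above. The file name `download_image.py`, the function `download_image`, and the target path are all hypothetical, and the sketch assumes the script is listed under the project's pre-render option in _quarto.yml.

```python
import shutil
import urllib.request
from pathlib import Path


def download_image(url: str, target: Path, timeout: float = 30.0) -> Path:
    """Download `url` to `target`, following redirects, and return `target`.

    Hypothetical helper for a pre-render step; not part of Quarto.
    """
    # Create the output directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlopen() follows the 302 redirect, so the body is the real image file.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    return target


# In an actual pre-render script, a call like this would go at the bottom
# (left commented out because it needs network access):
# download_image(
#     "https://keeper.mpdl.mpg.de/f/ec510a79d3ab495eaf67/?dl=1",
#     Path("images/KeeperLogo.svg"),
# )
```

The document body would then point at the local file (here the assumed path images/KeeperLogo.svg) rather than at the direct download link.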

MohandSadakah commented 1 month ago

[xss-image](https://github.com/quarto-dev/quarto-cli/assets/47220452/51f40344-f764-4717-a700-9a0daadeba7b)