deathau / markdownload

A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file.
Apache License 2.0

Images not rendering due to mismatched URIs #318

Open liujoshua opened 6 months ago

liujoshua commented 6 months ago

Issue

In reader mode, the message "could not be found" appears in place of the rendered image.

This is consistently reproducible with Substack pages, which MarkDownload previously handled correctly. I have included an example of the same page, downloaded on 2022-02-14 (images rendering) and on 2024-05-07 (images not rendering).

[Screenshot: 2024-05-07 at 12:10:20 PM]

Investigation

The issue seems to be that the Markdown links are URL-encoded while the saved file path no longer is. Previously, the file path was URL-encoded as well; now the path where the image is saved replaces each percent sign with an underscore, so the link no longer matches the file on disk.
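
The mismatch can be reproduced in isolation with a minimal TypeScript sketch. The `sanitizeForDisk` helper is hypothetical and is not taken from MarkDownload's source; it only mimics the observed behavior of swapping `%` for `_` in the saved file name. The last step shows one possible remedy, which is also an assumption: apply the same substitution to the link target.

```typescript
// Hypothetical sanitizer: NOT MarkDownload's actual code. It only mimics the
// observed behavior of replacing "%" with "_" in the saved file name.
function sanitizeForDisk(name: string): string {
  return name.replace(/%/g, "_");
}

// File name as it appears inside the Markdown embed (still percent-encoded).
const linkTarget: string =
  "https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png";

// File name as it ends up on disk in the 2024-05-07 download.
const savedName: string = sanitizeForDisk(linkTarget);

// The embed points at a name that does not exist on disk.
const linkResolves: boolean = linkTarget === savedName;

// Possible remedy (an assumption): sanitize the link target the same way.
const fixedLink: string = sanitizeForDisk(linkTarget);
const fixedResolves: boolean = fixedLink === savedName;

console.log(savedName);
console.log({ linkResolves, fixedResolves });
```

Whichever side is changed, the point is that the link target and the saved file name should pass through the same transformation.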

| Download Date | File Path | Markdown Link |
| --- | --- | --- |
| 2022-02-14T014141 | `ETH Scaling, ZK Projects and More/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508` | `[![[ETH Scaling, ZK Projects and More/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png]]](https://cdn.substack.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png)` |
| 2024-05-07T103932 | `images/ETH Scaling, ZK Projects and More/https_3A_2F_2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com_2Fpublic_2Fimages_2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png` | `[![[images/ETH Scaling, ZK Projects and More/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png]]](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6116b095-c612-4c7b-81b8-ec950ace310a_1152x508.png)` |

vvatikiotis commented 3 months ago

I can confirm that downloading images from Substack doesn't work.