zadam / trilium

Build your personal knowledge base with Trilium Notes
GNU Affero General Public License v3.0

(Bug report) Inconsistency import pictures with Trilium Web Clipper Addon #2621

Open Cyberklabauter opened 2 years ago

Cyberklabauter commented 2 years ago

Trilium Version

v0.50.1

What operating system are you using?

Windows

What is your setup?

Local (no sync)

Operating System Version

Win 10

Description

When I clip a web page with the Trilium Web Clipper add-on in Chrome, I get inconsistent results.

When I clip the main page https://www.spiegel.de (using "Save whole page"), it does not save the whole page, but pictures are imported inline (they show up in the note itself).

When I clip a single news page, https://www.spiegel.de/tests/haushalt/kaercher-philips-leifheit-vileda-saugwischer-im-test-a-62580356-74e4-4fc4-9182-8ba9278fb251 (using "Save whole page"), it saves the whole page, but each picture is imported as its own child note (the pictures do not show up in the note itself).

I cannot predict which method will be used to import pictures.

How I would like to see it fixed: I think most users will prefer the inline style, so that pictures show up in the note itself, in their original context.

Best of all would be to let users decide which import style they prefer, with an option to choose.

zadam commented 2 years ago

Hi, can you please post the note source of the clipped page with missing images?

image

I tried to "clip whole page" from https://www.spiegel.de/tests/haushalt/kaercher-philips-leifheit-vileda-saugwischer-im-test-a-62580356-74e4-4fc4-9182-8ba9278fb251 and it seemed correct - all the images were displayed inline in the note.

> It won't save the whole page.

The clipper uses an algorithm which attempts to simplify the page. This sometimes fails, especially if the page has complicated structure.

Cyberklabauter commented 2 years ago

Thanks for the quick response. The results seem unreliable to me: testing multiple times, I get different results. Here are my findings:

In general there are three cases with the Web Clipper Addon:

  1. Images whose src points to the internal API path, e.g. <img src="api/images/zbtJbpbWUFv0/strawberry64.png">, are displayed inline correctly.

  2. Images that refer to a web link are displayed with a little broken-link icon. The images themselves are added as child notes, and each child note displays the image (instead of an icon). I have imported the following article several times: https://www.spiegel.de/panorama/benedikt-xvi-mitarbeiter-von-joseph-ratzinger-praesentieren-faktencheck-a-604d9944-93f9-4655-8013-0723568fd679

     Sometimes it fails and the source code shows:

     <img class="image_resized" style="width:899.333px;" src="https://cdn.prod.www.spiegel.de/images/8045f224-4973-41f4-9f61-35eed25bff70_w948_r1.778_fpx49.05_fpy45.jpg" alt="Papst Benedikt XVI. mit seinem persönlichen Sekretär, Kurienerzbischof Gänswein" srcset="https://cdn.prod.www.spiegel.de/images/8045f224-4973-41f4-9f61-35eed25bff70_w520_r1.778_fpx49.05_fpy45.jpg 520w, https://cdn.prod.www.spiegel.de/images/8045f224-4973-41f4-9f61-35eed25bff70_w948_r1.778_fpx49.05_fpy45.jpg 948w" sizes="100vw" width="948">

     The same article was also imported successfully, with the correct API path in the image tag:

     <img src="api/images/aQT2n13FBpCM/8045f224-4973-41f4-9f61-35eed25bff70_w948_r1.778_fpx49.05_fpy45.jpg">

  3. No image is imported at all, and there is no image tag in the source code (maybe a result of the simplification process?). Example: https://www.spiegel.de/ausland/us-senat-bestaetigt-amy-gutmann-als-botschafterin-fuer-die-bundesrepublik-a-839730c6-cf11-4ced-b47b-54768bcd0ae1 (this page fails consistently).
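For anyone who wants to check which of the three cases a clipped note falls into, here is a minimal sketch in Python. It is not part of Trilium or the Web Clipper. The helper names `ImgCollector` and `classify_images` are made up for illustration, and the only assumption taken from the findings above is that correctly imported images use a src starting with api/images/. It reads the note source and sorts every img src into local, remote, or other.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # A missing src is recorded as an empty string.
            self.sources.append(dict(attrs).get("src") or "")


def classify_images(note_html):
    """Split <img> sources into local API paths, remote URLs, and anything else."""
    collector = ImgCollector()
    collector.feed(note_html)
    collector.close()
    result = {"local": [], "remote": [], "other": []}
    for src in collector.sources:
        if src.startswith("api/images/"):
            # Case 1: image stored in Trilium and referenced via the API path.
            result["local"].append(src)
        elif urlparse(src).scheme in ("http", "https"):
            # Case 2: image tag still points at the original web location.
            result["remote"].append(src)
        else:
            result["other"].append(src)
    return result


if __name__ == "__main__":
    sample = (
        '<p><img src="api/images/zbtJbpbWUFv0/strawberry64.png"></p>'
        '<p><img class="image_resized" src="https://cdn.prod.www.spiegel.de/images/'
        '8045f224-4973-41f4-9f61-35eed25bff70_w948_r1.778_fpx49.05_fpy45.jpg"></p>'
    )
    print(classify_images(sample))
```

A note whose source yields three empty lists would correspond to case 3, where no image tag survived the import at all.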

I could also upload the whole source code if necessary.

Thanks for looking into it.

zadam commented 2 years ago

Hi, so case 3) is the simplification algorithm.

2) I don't know yet, there are some things which confuse me.

You mention an icon, but the link on GitHub is broken. Is this what you mean?

image

You also mention that when the image is broken (shows icon instead of image), the HTML is like <img class="image_resized" style="width:899.333px;" src="https://cdn.prod.www.spiegel.de/images ... but if I paste that, it does show me the image (albeit downloaded online and not locally).

I would appreciate it if you could post example(s) of such broken clippings in the form of ZIP exports:

image

Thanks!

Cyberklabauter commented 2 years ago

> Hi, so case 3) is the simplification algorithm.

I thought so. Anyways, the simplification algorithm is amazing. 👍

> You mention icon, but the link on github is broken. Is this what you mean?

Yes, exactly. I deliberately posted a broken link to show this icon. It is exactly the same icon that is shown in Trilium Notes.

> You also mention that when the image is broken (shows icon instead of image), the HTML is like <img class="image_resized" style="width:899.333px;" src="https://cdn.prod.www.spiegel.de/images ... but if I paste that, it does show me the image (albeit downloaded online and not locally).

It does not for me. Only the broken-link icons are shown in the note. The images appear in the child notes (each image has its own child note).

> I would appreciate if you could post me example(s) of such broken clippings in the form of ZIP exports:
>
> image Thanks!

Here are the ZIP exports of a working and a non-working clipping of the same site. I also added screen captures so that you get an idea of how it looks on my system.