lwindolf / liferea

Liferea (Linux Feed Reader), a news reader for GTK/GNOME
https://lzone.de/liferea
GNU General Public License v2.0

Option to disable Readability.js #876

Closed by Nudin 3 years ago

Nudin commented 4 years ago

The new feature that applies Readability.js to items, so that feeds become more readable, is great! But sadly it breaks some feeds. In one of my feeds the content is removed entirely; in another (for which I pay, to get a clean, ad-free, full-text version) the images and preambles get removed. Can we please have an option to opt out of Readability.js?

lwindolf commented 4 years ago

I agree, there will be an option to disable Readability globally and per-feed.

Aside from this, I'd be interested in the broken feeds. Can you paste me the links? That would help verify whether it is really Readability.

Nudin commented 4 years ago

Some of the affected feeds are personalised feeds that I cannot share. Here is one more, which is created from HTML emails by kill-the-newsletter.com: https://kill-the-newsletter.com/ -link-broken-to-protect-from-url-scrapers- feeds/uto8dfztofix16cvc7b9.xml

Nudin commented 4 years ago

PS FYI: Arch is already shipping 1.13.2, so more people might report bugs soon.

Nudin commented 3 years ago

The longer I use it, the more I doubt that Readability.js should be enabled automatically instead of by a button, as in Firefox. Sometimes the change takes pretty long. Often I'm already reading an article when suddenly everything jumps to another position and I lose my place. Here's a video recording: Peek 2020-10-15 21-40

lwindolf commented 3 years ago

@Nudin What you are seeing is the delayed HTML5 extraction. Readability.js was already working before (and is after). I agree that the race condition between starting to read the article and the HTML5 extraction finishing is bothersome, but it is not caused by the new Readability.js feature, but was there for the entire 1.12 release.

Of course we might want to improve on this too by maybe merging items only after HTML5 extraction finishes. That would require a rework of the current item set queuing though.
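
To illustrate the idea of merging items only after extraction finishes, here is a minimal sketch in Python. It is not Liferea's actual item set queuing (which is written in C); the names `Item`, `ItemSetQueue`, `submit` and `extraction_finished` are all hypothetical and only show the ordering: an item whose extraction is still pending is held back, so the view never swaps content under the reader.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: items are compared by identity, not by value
class Item:
    title: str
    content: str
    extraction_pending: bool = False  # hypothetical flag, set when extraction was requested


@dataclass
class ItemSetQueue:
    """Hypothetical queue: items become visible only once their content is final."""
    visible: list = field(default_factory=list)
    held: list = field(default_factory=list)

    def submit(self, item: Item) -> None:
        # Items that still wait for extraction are held back instead of merged.
        if item.extraction_pending:
            self.held.append(item)
        else:
            self.visible.append(item)

    def extraction_finished(self, item: Item, extracted_content: str) -> None:
        # Swap in the extracted content first, then merge the item into the view.
        item.content = extracted_content
        item.extraction_pending = False
        self.held.remove(item)
        self.visible.append(item)


queue = ItemSetQueue()
plain = Item("Plain item", "full text from the feed")
slow = Item("Slow item", "one-sentence teaser", extraction_pending=True)
queue.submit(plain)
queue.submit(slow)            # not shown yet, so nothing can jump while reading
queue.extraction_finished(slow, "extracted full text")
```

The trade-off in this sketch is latency: a held item appears later, but its content does not change after it is shown.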

lwindolf commented 3 years ago

On the other hand, the feedback gathered so far leans more toward an opt-in feature than the opt-out planned so far. If feedback continues in this direction, rest assured we will switch to an opt-in "Reader Mode". Currently I'm thinking of having a browser-style document icon button to persistently enable/disable it per feed.

Nudin commented 3 years ago

> @Nudin What you are seeing is the delayed HTML5 extraction. Readability.js was already working before (and is after). I agree that the race condition between starting to read the article and the HTML5 extraction finishing is bothersome, but it is not caused by the new Readability.js feature, but was there for the entire 1.12 release.

This effect is definitely new in 1.13. I never saw it before the update, and I just downgraded to 1.12.8: with that older version it does not occur on the same feed and article, while on 1.13.2 it is reproducible. Also, in the feed preferences the option "extract from HTML5 and Google Amp" is disabled, and the feed is a proper full-text RSS feed.

lwindolf commented 3 years ago

@Nudin Hmm... so it might be related. Of course if you have HTML5 extraction disabled then it has to be as you say.

In your screenshot the headline title changes once, is this because you clicked another headline during the shot?

Nudin commented 3 years ago

> In your screenshot the headline title changes once, is this because you clicked another headline during the shot?

Yes, that is the start of the animation. Before the recording started, the "Internet: …" news item was selected. I started recording, then clicked on a new headline ("Artemis Accords") to demonstrate the delay.

lwindolf commented 3 years ago

@Nudin Hmmm... just checked the Golem feed. All the ones I've tested (RSS and Atom) only have 1 sentence content per-headline. So without HTML5 extraction enabled you never get full content... are you sure it is off?

Nudin commented 3 years ago

> @Nudin Hmmm... just checked the Golem feed. All the ones I've tested (RSS and Atom) only have 1 sentence content per-headline. So without HTML5 extraction enabled you never get full content... are you sure it is off?

Yes, it is off. They offer a full-text RSS feed for their paying customers.

lwindolf commented 3 years ago

@Nudin In that case could you provide a single item snippet from the feed for testing purposes?

Nudin commented 3 years ago

> @Nudin In that case could you provide a single item snippet from the feed for testing purposes?

The issue disappeared with the update from 1.13.2 to 1.13.3. I'll send you a copy of an example anyway.

lwindolf commented 3 years ago

I've just added a toggle button for reader mode in the item view (upper right corner).

[image]

Global preference for reader mode default is still to be done.

lwindolf commented 3 years ago

A preference option has now been added in the 'Privacy' tab. The default is enabled, so you have to disable it once.
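
How a global default and a per-feed toggle could combine can be sketched as follows. This is an assumption for illustration only, not Liferea's code: the function name `reader_mode_enabled` and both parameters are hypothetical, and it assumes the toggle state is stored per feed, as was proposed earlier in this thread.

```python
from typing import Optional


def reader_mode_enabled(global_default: bool,
                        feed_override: Optional[bool] = None) -> bool:
    """Return whether reader mode applies to a feed (hypothetical helper).

    feed_override is None while the user has never toggled reader mode for
    this feed; in that case the global preference decides.
    """
    if feed_override is None:
        return global_default
    return feed_override


# Global default enabled, feed never toggled: reader mode is on.
untouched_feed = reader_mode_enabled(global_default=True)
# Global default enabled, but the user switched it off for this feed.
opted_out_feed = reader_mode_enabled(global_default=True, feed_override=False)
```

Keeping "never toggled" distinct from "explicitly off" means a later change to the global default would still reach feeds the user has not touched.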