miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0
6.49k stars, 704 forks

Miniflux doesn't display text in links #1369

Open shishkin opened 2 years ago

shishkin commented 2 years ago

When fetching original content from decrypt.co (e.g. this article), the link text is not displayed. Here is an example of such a link from the page source:

<a href="/?post_type=post&amp;p=5736" target="blank" rel="noreferrer" class="sc-adb616fe-0 kVliVL"><span class="sc-3b5fdf4f-4 dvBzGX">Bitcoin</span></a>

Is this a bug or some feature I misconfigured? I don't have any custom rules.

fguillot commented 1 year ago

The content of this website seems to be rendered mostly via JavaScript, which Miniflux does not interpret.

shishkin commented 1 year ago

That page seems to use Next.js and ships pre-rendered HTML content in addition to JavaScript. I just tested, and it renders perfectly with JavaScript completely disabled. All the content is in the div.post-content element, and the link snippet I posted looks the same there.

Could it be that Miniflux strips away some content?

shishkin commented 1 year ago

Looking at the code of the readability module, I suspect that it removes elements that have -ad in their class. The removed links have the class sc-adb616fe-0, a generated identifier that just happens to contain -ad. Maybe the regex could match -ad only at a word boundary?
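
To illustrate the suspected false positive, here is a minimal sketch in Python. Both patterns and the `looks_like_ad` helper are hypothetical and written only for this example; they are not copied from Miniflux's readability code, which is written in Go and may use a different expression.

```python
import re

# Hypothetical patterns for illustration only (not Miniflux's actual regex).
LOOSE_AD = re.compile(r"-ad", re.IGNORECASE)      # matches "-ad" anywhere
BOUNDED_AD = re.compile(r"-ad\b", re.IGNORECASE)  # "-ad" must end at a word boundary


def looks_like_ad(class_attr: str, pattern: re.Pattern) -> bool:
    """Return True if the class attribute matches the given ad pattern."""
    return pattern.search(class_attr) is not None


if __name__ == "__main__":
    samples = ("sc-adb616fe-0 kVliVL", "sidebar-ad", "top-ad banner")
    for cls in samples:
        # Print the verdict of each pattern side by side.
        print(cls, looks_like_ad(cls, LOOSE_AD), looks_like_ad(cls, BOUNDED_AD))
```

With the unanchored pattern, the generated class sc-adb616fe-0 is flagged because it contains -ad followed by more characters. With the boundary version it is not, while classes such as sidebar-ad still match. A generated class that happened to contain -ad followed by a hyphen or space would still be flagged, so this narrows the problem rather than eliminating it.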