mozilla / readability

A standalone version of the readability lib

Reader Mode only shows comments on my website, not the main content #391

Closed labarandiaran closed 6 years ago

labarandiaran commented 7 years ago

Hello there

When I try to use Reader Mode in Firefox (on either Windows or iOS), certain pages from my site don't show the main content; only the comments section appears.

Example 1: https://www.universidadperu.com/empresas/banco-credito-peru.php

Screenshot: screenshot_1

Example 2: https://www.universidadperu.com/empresas/banco-internacional-peruinterbank.php

Screenshot: screenshot_2

I've tried searching for a developer guide that could give me pointers on what needs to appear in my code for Reader Mode to work properly, but couldn't find one.

I tried posting this on the Firefox support forum, but someone there pointed me here. https://support.mozilla.org/es/questions/1173652

Thanks for the review and advice!

Best regards.

Luis Alberto from Lima, Peru!

alex-mayorga commented 7 years ago

Hello!

I'm the someone from SuMo =)

Subscribing here so I can learn and help others with the same question in the future.

Thanks!

gijsk commented 6 years ago

Reader mode tries to isolate content based on scores. The scores are determined by the class names, ids, and elements used in the content. https://github.com/mozilla/readability/blob/master/Readability.js#L113-L129 has most of that logic, encoded in regular expressions. From looking at your pages, there's no real article text there at all. The content you would like to see as the main body is all encoded in lists, rather than paragraph text. Wrapping that content in a <p> with the class main-content might help, as would adding classes to the comments section that get evaluated negatively (like comment, extra, etc.).
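For anyone reading later, here is a rough sketch of how class- and id-based scoring with regular expressions might look. It is not the code from Readability.js: the function name `classWeight`, the abbreviated patterns, and the weight of 25 are all assumptions chosen for illustration, so check the linked source for the real expressions.

```javascript
// Illustrative sketch only. The patterns and the weight below are assumptions,
// not copied from Readability.js.
const POSITIVE = /article|body|content|entry|main|post|text/i;
const NEGATIVE = /comment|extra|footer|sidebar|related|share|widget/i;
const WEIGHT = 25; // assumed step size for this example

// Hypothetical helper: derive a score from an element's class and id strings.
function classWeight(className, id) {
  let weight = 0;
  for (const value of [className, id]) {
    if (!value) continue; // skip empty or missing attributes
    if (NEGATIVE.test(value)) weight -= WEIGHT;
    if (POSITIVE.test(value)) weight += WEIGHT;
  }
  return weight;
}

console.log(classWeight("", "main-content"));  // 25
console.log(classWeight("comments-list", "")); // -25
```

Under these assumed patterns, a container with the id main-content scores higher than one whose class mentions comment, which is the effect the suggestion above is aiming for.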

On the whole, I don't think this is an issue in readability, so I will close it, but if you need more information feel free to ask.

labarandiaran commented 6 years ago

Hello Gijsk. Per your suggestion, I've added a <div id="main-content">, and Reader Mode now works as expected on some of the pages, but not all.

I'll look more into this.

Thanks for your help!

labarandiaran commented 6 years ago

OK, in case it helps others, here's what I did:

Reader Mode now works as expected! Here's the screenshot: screenshot_12

Thanks again for the guidance! Best regards!