mozilla / readability

A standalone version of the readability lib
Other
8.34k stars 579 forks source link

Fix missing header from GitLab blog page #885

Closed revolter closed 5 days ago

fchasen commented 5 days ago

Thanks for submitting the test case and fix, the simpler fix for this seem to be to just remove tool from the regex as that doesn't appear to break any other test cases.

revolter commented 5 days ago

Wasn't it added there with a reason? Maybe there is a test case missing for that part of the regex? 🤔

revolter commented 5 days ago

Oh, it's there since the initial commit.

So, should I remove tool completely?

fchasen commented 5 days ago

Yep, seems to be fine to remove completely.

fchasen commented 5 days ago

Thanks again for the text case that has most of the text of the header in the id, that pattern doesn't work great with regex matching (for instance if the header was about Meta). We may look into this more soon depending on how common that is.

revolter commented 5 days ago

Good job with the contribution instructions 👏🏻 Was thinking about creating an issue at first 😅