postlight / parser

📜 Extract meaningful content from the chaos of a web page
https://reader.postlight.com
Apache License 2.0
5.37k stars 443 forks

fixed and improved extraction for latest layout of politico.com #701

Closed: zhemaituk closed 1 year ago

zhemaituk commented 1 year ago

Fixed content extraction (previously a couple of paragraphs of text were missing) and updated author/date/dek extraction for the latest layout of the website.
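For readers unfamiliar with how site-specific extraction is configured, here is a rough sketch of the general shape of a custom extractor object in this parser: a `domain` plus per-field `selectors`. The CSS selectors, the `clean` entry, and the `definedFields` helper are hypothetical placeholders for illustration. They are not the selectors from this PR and are not taken from politico.com's markup.

```javascript
// Illustrative sketch only: every CSS selector here is a made-up placeholder,
// NOT the selectors changed in this PR.
const WwwPoliticoComExtractor = {
  domain: 'www.politico.com',

  // Each field lists candidate selectors, tried in order.
  title: {
    selectors: ['h1.headline'], // hypothetical
  },

  author: {
    selectors: ['.story-meta__authors a'], // hypothetical
  },

  // A [selector, attribute] pair reads an attribute instead of the node text.
  date_published: {
    selectors: [['time[datetime]', 'datetime']], // hypothetical
  },

  dek: {
    selectors: ['p.dek'], // hypothetical
  },

  content: {
    // Choosing a wrapper that contains every paragraph is the usual way to
    // avoid dropped text.
    selectors: ['.story-text'], // hypothetical
    transforms: {},
    // Elements to strip from the extracted content.
    clean: ['figcaption'], // hypothetical
  },
};

// Hypothetical helper: list the fields this extractor customizes.
const definedFields = Object.keys(WwwPoliticoComExtractor).filter(
  (key) => key !== 'domain'
);
console.log(definedFields.join(', '));
```

In the repository, an object like this would be exported from its own module and registered alongside the other custom extractors. That wiring is omitted here to keep the sketch self-contained.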