postlight / parser

📜 Extract meaningful content from the chaos of a web page
https://reader.postlight.com
Apache License 2.0

fix: make wired parser work with new format #511

Closed admbtlr closed 2 years ago

admbtlr commented 4 years ago

It looks like wired.com has changed its default format since the parser was written, e.g.

old: https://www.wired.com/2016/09/ode-rosetta-spacecraft-going-die-comet/

new: https://www.wired.com/story/chris-evans-rian-johnson-knives-out-wired25/

So I've updated the parser to handle this new format.
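
For readers unfamiliar with how a site-specific extractor can cover two layouts, here is a minimal sketch, not the actual change in this PR. It assumes Parser's custom extractor shape (`domain`, plus `title` and `content` objects with a `selectors` array tried in order). Every CSS selector below is a hypothetical placeholder rather than wired.com's real markup, and `firstMatchingSelector` is an illustrative helper that only mimics the fallback order.

```javascript
// Hypothetical sketch: all selectors are placeholders, not wired.com's real markup.
const WiredExtractorSketch = {
  domain: 'www.wired.com',
  title: {
    // Newer layout listed first, older layout kept as a fallback.
    selectors: ['h1[data-testid="headline"]', 'h1.post-title'],
  },
  content: {
    selectors: ['article.article-body', 'article.content'],
    // Elements to strip from the extracted content (also placeholders).
    clean: ['.visually-hidden', 'aside'],
  },
};

// Illustrative helper (not part of Parser): return the first selector for
// which the `matches` predicate reports a hit, or null if none match.
function firstMatchingSelector(selectors, matches) {
  for (const selector of selectors) {
    if (matches(selector)) return selector;
  }
  return null;
}

// Usage: simulate an old-format page where only the older title selector hits.
const oldPageHits = new Set(['h1.post-title']);
const chosen = firstMatchingSelector(
  WiredExtractorSketch.title.selectors,
  (selector) => oldPageHits.has(selector)
);
console.log(chosen);
```

Listing the new selector ahead of the old one lets both article formats resolve through the same extractor, which is the general idea behind handling a site redesign without dropping support for older pages.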

Thanks for maintaining this awesome project!

kour1er commented 4 years ago

I'm having trouble with this Wired article being truncated: https://www.wired.com/story/confessions-marcus-hutchins-hacker-who-saved-the-internet/

johnholdun commented 2 years ago

Closing this, as it appears that #604 solves the same problems and I happened to merge it before this one. Thanks for your contribution!