Open Grienauer opened 1 year ago
There are multiple mentions in the issues section about header content being removed erroneously. I think this falls into the same problem.
I came here to report the same thing happening on Hackaday.com/blog.
And on https://www.thetimes.co.uk/ across multiple articles, it clips the first one or two paragraphs on every page I've tried. Kind of useless in this state.
Currently the parser seems to get lost on the following pages. I don't see any markup problems. Maybe the newspapers detect and block the scraper?
https://www.derstandard.at/story/2000145508819/franzoesischer-verfassungsrat-stimmt-umstrittener-pensionsreform-zu (a notice is added to the text there, saying that some "software" is blocking stuff and that it should be removed)
https://kurier.at/wirtschaft/atomausstieg-wie-die-abschaltung-eines-kernkraftwerks-funktioniert/402412829 (only one line of text is extracted)
Thanks for the info. Happy to help.