Issues to be fixed in htmlparse in the Telegraaf and Metro RSS scrapers:
Metro: the htmlparser for the text also catches some 'invisible' HTML that is not part of the main article text. (These elements are likely hidden with CSS display: none?)
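If the guess about display: none is right, one way to filter such elements could look like the sketch below. It is not the scraper's current parser: it uses only the standard-library `html.parser`, the names `VisibleTextParser` and `visible_text` are hypothetical, and it only catches elements hidden through an inline `style` attribute or the `hidden` attribute. Elements hidden by a class-based stylesheet rule would need the actual class names from the Metro pages. It also assumes reasonably well-formed HTML with explicit closing tags.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class VisibleTextParser(HTMLParser):
    """Collect text, skipping subtrees hidden via inline style or `hidden`."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.hidden_depth = 0  # > 0 while inside a hidden subtree

    @staticmethod
    def _is_hidden(attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        return "display:none" in style or "hidden" in attrs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a hidden element, every nested element stays hidden.
        if self.hidden_depth or self._is_hidden(attrs):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Return the text of `html` without inline-hidden subtrees."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

In practice the input would be narrowed to the article container first, so that navigation and footer text are not collected.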
Telegraaf: the htmlparser is unable to parse some texts, because they are not included in the HTML itself but only load after a script has run on the website. Possible solution: htmlsource is a string that does contain the text, embedded in the script: "articleBody": "HERE IS THE TEXT.","author":
```python
if text.strip() == "":
    logger.warning("Trying alternative method....")
    # parse the text from htmlsource
```
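The "parse the text from htmlsource" step could be sketched as follows. This is a minimal sketch rather than a tested fix: it assumes the script embeds the text as a JSON string literal directly after the `"articleBody"` key, as in the example above, and the names `ARTICLE_BODY_RE` and `extract_article_body` are hypothetical.

```python
import json
import re

# Matches the JSON string literal that follows "articleBody": in the page source,
# allowing escaped characters (such as \") inside the string.
ARTICLE_BODY_RE = re.compile(r'"articleBody"\s*:\s*("(?:[^"\\]|\\.)*")')


def extract_article_body(htmlsource):
    """Return the articleBody text embedded in htmlsource, or "" if absent."""
    match = ARTICLE_BODY_RE.search(htmlsource)
    if match is None:
        return ""
    try:
        # json.loads decodes escapes such as \" and \u00e9 in the literal.
        return json.loads(match.group(1)).strip()
    except json.JSONDecodeError:
        # The literal was not valid JSON; leave the text empty.
        return ""
```

Inside the fallback branch this would be used as `text = extract_article_body(htmlsource)`, so an article without an embedded body still ends up with an empty text instead of raising.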