Open AndyTheFactory opened 12 months ago
Comment by codelucas Thu Feb 22 05:13:08 2018
Thanks for reminding us of this, will leave this task open for anyone to take it on.
Comment by torbenbrodt Fri Jul 13 12:16:59 2018
BBC uses schema.org. I have tested it including #385, and it works (tested with https://www.bbc.com/news/business-43133853).
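For anyone picking this up, here is a rough sketch of what reading a schema.org publish date could look like. It is not the code from #385 and not newspaper's own extractor. It assumes the page embeds its schema.org data as JSON-LD in a `<script type="application/ld+json">` block; microdata (`itemprop="datePublished"`) would need a different lookup. The names `JsonLdCollector`, `find_date_published` and `extract_schema_org_date` are made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_date_published(node):
    """Depth-first search for the first string "datePublished" in parsed JSON."""
    if isinstance(node, dict):
        if isinstance(node.get("datePublished"), str):
            return node["datePublished"]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_date_published(child)
        if found is not None:
            return found
    return None


def extract_schema_org_date(html):
    """Return the first datePublished string found in any JSON-LD block, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except ValueError:
            # Skip blocks that are not valid JSON instead of failing the page.
            continue
        found = find_date_published(data)
        if found is not None:
            return found
    return None
```

The function returns the raw string (typically ISO 8601) and leaves parsing to the caller, so it can be tried before or after the URL and meta tag strategies without changing what they return.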
Issue by awiebe Wed Feb 21 07:38:28 2018 Originally opened as https://github.com/codelucas/newspaper/issues/521
https://github.com/codelucas/newspaper/blob/c0eed1a571ab91382ca7e86767b2c26e7e59bbd2/newspaper/extractors.py#L179
Strategy 3 (search body for date) is not implemented, so currently BBC articles, e.g. http://www.bbc.com/news/business-43133853,
which do not have any date in the URL or meta tags, do not get a publish date even though the date and time are clearly in the HTML as