Open GoogleCodeExporter opened 8 years ago
I think the problem is that the library does not send a User-Agent header when
requesting the HTML, so some websites respond with a 403 error. You could try
downloading the HTML yourself and passing it to
ArticleExtractor.INSTANCE.getText(String text), but I am not sure that will work.
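
A minimal sketch of that workaround, using only the JDK, might look like the following. The class name HtmlFetcher, its method names, the User-Agent string, the timeouts, and the UTF-8 decoding are all assumptions made for illustration; none of them come from the library or the original report.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: downloads a page while sending an explicit User-Agent.
public class HtmlFetcher {

    // Assumed value; any browser-like string may be enough to avoid the 403.
    static final String DEFAULT_USER_AGENT =
            "Mozilla/5.0 (compatible; HtmlFetcher/1.0)";

    // Prepares an HTTP(S) connection with the User-Agent header set.
    // No network traffic happens until the response is actually read.
    static HttpURLConnection open(String url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(10_000); // milliseconds, assumed
        conn.setReadTimeout(10_000);    // milliseconds, assumed
        return conn;
    }

    // Reads the whole response body and decodes it as UTF-8 (an assumption;
    // a page served in another charset would need different handling).
    static String fetchHtml(String url) throws IOException {
        HttpURLConnection conn = open(url, DEFAULT_USER_AGENT);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

The string returned by HtmlFetcher.fetchHtml(url) could then be passed to ArticleExtractor.INSTANCE.getText(html) in place of the URL. Whether this avoids the 403 depends on the site, so treat it as something to try rather than a confirmed fix.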
Original comment by jorgec...@gmail.com on 17 Aug 2013 at 12:35
Original issue reported on code.google.com by lopiccol...@gmail.com on 28 Mar 2013 at 4:37