With boilerpipe-1.2.0.jar, the call
ArticleExtractor.INSTANCE.getText(new java.net.URL("http://t.co/3RplOLjc"))
fails with:
ERROR java.lang.IllegalArgumentException: protocol = http host = null
    at de.l3s.boilerpipe.sax.HTMLFetcher.fetch (HTMLFetcher.java:33)
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText (ExtractorBase.java:87)
The same error occurs for many other URLs, e.g. http://t.co/5vuYimwn, http://t.co/Dy5yolLs,
http://t.co/ShWhtFjP, http://nyti.ms/lQrWwp ...
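
The report does not identify a cause. All failing URLs are shortener links, so one thing worth checking (an assumption, not a diagnosis) is whether a redirect target ends up without a host component. The sketch below uses only java.net.URI. The class name RedirectTargetCheck, both helper methods, and the sample Location values are hypothetical and are not part of boilerpipe.

```java
import java.net.URI;

// Hypothetical diagnostic helper, not part of boilerpipe.
final class RedirectTargetCheck {

    // Resolve a redirect "Location" value against the URL that returned it.
    // Relative values are resolved against requestUrl; absolute values are returned as given.
    static URI resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location);
    }

    // True when the URI carries a non-empty host component.
    static boolean hasHost(URI uri) {
        String host = uri.getHost();
        return host != null && !host.isEmpty();
    }

    public static void main(String[] args) {
        // The Location values below are made up for illustration only.
        URI relative = resolveLocation("http://t.co/3RplOLjc", "/some/path");
        URI noHost = resolveLocation("http://t.co/3RplOLjc", "http:/some/path");
        System.out.println(relative + " hasHost=" + hasHost(relative));
        System.out.println(noHost + " hasHost=" + hasHost(noHost));
    }
}
```

If hasHost returns false for an actual redirect target of one of the URLs above, that would point at the redirect handling rather than at the extraction itself.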
Original issue reported on code.google.com by johann.petrak@gmail.com on 22 Aug 2014 at 3:23