Open GoogleCodeExporter opened 8 years ago
Here's an update from 2011-12-08 on the above URLs, using the web version of
boilerpipe:
* http://www.dn.se/nyheter/vetenskap/annu-godare-choklad-med-hjalp-av-dna-teknik - Misses the header altogether (dn.se has had a new design since then...)
* http://www.sydsvenskan.se/malmo/article1346121/I-natt-bargas-det---forhoppningsvis.html - picks up the comment section
* http://www.dn.se/sthlm/tva-raddade-ur-malarvak - picks up some teasers instead of main text.
* http://www.expressen.se/nyheter/1.2280178/smhi-utfardar-klass-2-varning - One teaser, and various text from popups
Minor artifacts:
* http://hd.se/skane/2011/01/06/mangder-med-sno-over-skane/ - "Skriv ut" is
a link to print the article. "Bildmaterial" is a header from the sidebar.
"Dela" at the bottom is from the sharing feature.
* http://www.dn.se/sthlm/misstankt-brott-bakom-ung-mans-dod - This one no
longer has any artifacts, well done!
* http://www.expressen.se/noje/1.2280351/lotta-engberg-lamnar-bingolotto -
Misses main header and teaser
I don't know what magic Readability uses, but all of the above URLs work
perfectly with Readability.
Original comment by EmilStenstrom
on 8 Dec 2011 at 9:08
http://www.anspress.com/index.php?a=2&cid=48&lng=az&nid=270848
Original comment by eyusi...@gmail.com
on 13 May 2014 at 1:44
Original issue reported on code.google.com by
EmilStenstrom
on 6 Jan 2011 at 2:43