jhy / jsoup

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
https://jsoup.org
MIT License

Doesn't parse this feed properly 'feeds.bbci.co.uk/news/technology/rss.xml' #376

Status: Closed (ravindranathakila closed this issue 10 years ago)

ravindranathakila commented 10 years ago

URL: http://feeds.bbci.co.uk/news/technology/rss.xml

Try this code (output below). At the very bottom is the original source as viewed in Google Chrome.

_Note what happens to the link tags_


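    import java.io.IOException;
    import java.net.URL;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    // Class name is arbitrary; only the body of main matters for the report.
    public class FeedRepro {
        public static void main(String[] args) throws IOException {
            final String feedUrl = "http://feeds.bbci.co.uk/news/technology/rss.xml";

            // Jsoup.parse(InputStream, charset, baseUri) uses jsoup's HTML parser
            final Document document = Jsoup.parse(new URL(feedUrl).openStream(), "UTF-8", feedUrl);
            System.out.println("Document:" + document.toString());

            // Print the title, link and description of every <item>
            final Elements itemElements = document.getElementsByTag("item");
            for (final Element item : itemElements) {
                final String title = item.getElementsByTag("title").first().text();
                System.out.println("title:" + title);

                final String link = item.getElementsByTag("link").first().text();
                System.out.println("link:" + link);

                final String description = item.getElementsByTag("description").first().text();
                System.out.println("description:" + description);
            }
        }
    }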
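
In the dump that follows, every `<link>` comes out as an empty `<link />` with the URL left behind as loose text. That is what one would expect if the feed is read under HTML rules, where `link` is a void element that cannot hold content, and `Jsoup.parse(InputStream, String, String)` does use jsoup's HTML parser. For comparison, here is a minimal sketch that reads the same `<item>`/`<link>` shape under XML rules using only the JDK's built-in DOM parser. The class name `RssLinkSketch`, the method `itemLinks`, and the inline sample feed are made up for illustration and are not part of the report. jsoup also has an XML mode of its own (`Parser.xmlParser()`), which is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical class name; the sample feed below is invented for illustration.
class RssLinkSketch {

    // A tiny stand-in for the BBC feed, with the same <item>/<link> shape.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<rss version=\"2.0\"><channel>"
        + "<title>Sample channel</title>"
        + "<link>http://example.com/</link>"
        + "<item><title>First</title><link>http://example.com/first</link></item>"
        + "<item><title>Second</title><link>http://example.com/second</link></item>"
        + "</channel></rss>";

    // Returns the text of the first <link> inside every <item>.
    static List<String> itemLinks(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> links = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                NodeList linkNodes = item.getElementsByTagName("link");
                if (linkNodes.getLength() > 0) {
                    // Under XML rules the URL is the element's own text content.
                    links.add(linkNodes.item(0).getTextContent().trim());
                }
            }
            return links;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample feed", e);
        }
    }

    public static void main(String[] args) {
        for (String link : itemLinks(SAMPLE)) {
            System.out.println("link:" + link);
        }
    }
}
```

Run as is, `main` prints one `link:` line per item with the URL filled in, because an XML parser treats the URL as the element's text content. The output listed next comes from the jsoup snippet in the report, not from this sketch.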

Output:


Document:<!--?xml version="1.0" encoding="UTF-8"?-->
<!--?xml-stylesheet title="XSL_formatting" type="text/xsl" href="/shared/bsp/xsl/rss/nolsol.xsl"?-->
<html>
 <head></head>
 <body>
  <rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"> 
   <channel> 
    <title>BBC News - Technology</title> 
    <link />http://www.bbc.co.uk/news/technology/#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
    <description>
     The latest stories from the Technology section of the BBC News web site.
    </description> 
    <language>
     en-gb
    </language> 
    <lastbuilddate>
     Fri, 22 Nov 2013 16:32:12 GMT
    </lastbuilddate> 
    <copyright>
     Copyright: (C) British Broadcasting Corporation, see http://news.bbc.co.uk/2/hi/help/rss/4498287.stm for terms and conditions of reuse.
    </copyright> 
    <img /> 
    <url>
     http://news.bbcimg.co.uk/nol/shared/img/bbc_news_120x60.gif
    </url> 
    <title>BBC News - Technology</title> 
    <link />http://www.bbc.co.uk/news/technology/#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
    <width>
     120
    </width> 
    <height>
     60
    </height>  
    <ttl>
     15
    </ttl> 
    <atom:link href="http://feeds.bbci.co.uk/news/technology/rss.xml" rel="self" type="application/rss+xml" /> 
    <item> 
     <title>Global launch for Microsoft Xbox One</title> 
     <description>
      The global launch of Microsoft's the Xbox One game console took place early on Friday, 22 November.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25037562#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25037562
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 00:22:32 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71263000/jpg/_71263203_71257876.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71263000/jpg/_71263204_71257876.jpg" /> 
    </item> 
    <item> 
     <title>Opposition to mobile chat on planes</title> 
     <description>
      Plans to allow mobile phone calls on commercial flights have met opposition from passengers who value peace and quiet in the skies.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25058358#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25058358
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 16:25:40 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71290000/jpg/_71290367_53293387.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71290000/jpg/_71290498_53293387.jpg" /> 
    </item> 
    <item> 
     <title>Google patents social media helper</title> 
     <description>
      If maintaining a presence on lots of social networks is a burden, Google may be able to help with software that pretends to be you.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25033172#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25033172
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 12:08:50 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71279000/jpg/_71279456_164332479.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71279000/jpg/_71279457_164332479.jpg" /> 
    </item> 
    <item> 
     <title>US prepares for more online gambling</title> 
     <description>
      Online gambling is launched in the state of New Jersey, a sign that the US may slowly be opening up to the multibillion-dollar industry.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25051312#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25051312
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 14:39:17 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71285000/jpg/_71285305_53406367.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71285000/jpg/_71285306_53406367.jpg" /> 
    </item> 
    <item> 
     <title>Web inventor in surveillance warning</title> 
     <description>
      The &quot;growing tide of surveillance&quot; threatens the democratic nature of the internet, warns the creator of the worldwide web, Sir Tim Berners-Lee.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25033577#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25033577
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 00:24:23 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71275000/jpg/_71275765_143671334.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71275000/jpg/_71275997_143671334.jpg" /> 
    </item> 
    <item> 
     <title>Danish trial for Pirate Bay founder</title> 
     <description>
      Pirate Bay founder Gottfrid Warg will be deported next week to Denmark to face charges of stealing confidential data.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25054054#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25054054
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 13:01:53 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71279000/jpg/_71279467_71279465.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71279000/jpg/_71279468_71279465.jpg" /> 
    </item> 
    <item> 
     <title>Samsung in $290m payout to Apple</title> 
     <description>
      A US jury rules that Samsung must pay $290m (&pound;180m) to Apple for copying iPhone and iPad features in its devices.
     </description> 
     <link />http://www.bbc.co.uk/news/business-25041852#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/business-25041852
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 02:47:51 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71269000/jpg/_71269953_181281527.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71269000/jpg/_71269954_181281527.jpg" /> 
    </item> 
    <item> 
     <title>LG promises fix for 'spying' TVs</title> 
     <description>
      TV maker LG admits collecting viewing information, even after users have disabled the function, and promises an immediate fix.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25042563#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25042563
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 17:49:03 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71267000/jpg/_71267591_lggg.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71267000/jpg/_71267592_lggg.jpg" /> 
    </item> 
    <item> 
     <title>Facebook sues over sex tape spam</title> 
     <description>
      Facebook launches legal action against an alleged spammer suspected of posting fake links to a supposed sex tape of Justin Bieber and Selena Gomez.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25033166#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25033166
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 11:42:46 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71252000/jpg/_71252718_gomez2.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71252000/jpg/_71252719_gomez2.jpg" /> 
    </item> 
    <item> 
     <title>Banks 'hit by net traffic hijacks'</title> 
     <description>
      Repeated attacks on the way the net routes data have resulted in huge amounts of traffic being hijacked, a net monitoring company says.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25033170#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25033170
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 16:27:34 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71257000/jpg/_71257862_184028997.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71257000/jpg/_71257863_184028997.jpg" /> 
    </item> 
    <item> 
     <title>UK 'let NSA store email addresses'</title> 
     <description>
      The UK allowed the US's National Security Agency to keep mobile phone numbers and email addresses of ordinary Britons from 2007, reports say.
     </description> 
     <link />http://www.bbc.co.uk/news/uk-25028495#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/uk-25028495
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 00:56:06 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71246000/jpg/_71246079_70825826.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71246000/jpg/_71246080_70825826.jpg" /> 
    </item> 
    <item> 
     <title>PM followed escort agency on Twitter</title> 
     <description>
      David Cameron's official Twitter account followed a high-class escort agency, it has emerged.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25015034#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25015034
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 17:48:47 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71240000/jpg/_71240483_187685259.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71240000/jpg/_71240484_187685259.jpg" /> 
    </item> 
    <item> 
     <title>Australia sites hacked amid spy row</title> 
     <description>
      Hackers attack Australian government websites amid an ongoing row over reports that Australia spied on the phone calls of the Indonesian prime minister.
     </description> 
     <link />http://www.bbc.co.uk/news/world-asia-25029261#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/world-asia-25029261
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 05:25:40 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71246000/jpg/_71246509_82300847.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71246000/jpg/_71246662_82300847.jpg" /> 
    </item> 
    <item> 
     <title>How UK banks contain cyber-threats</title> 
     <description>
      Staff at the UK's big banks are regularly being caught out by malware, BBC research suggests.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24568134#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24568134
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 00:16:55 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71211000/jpg/_71211630_piggy.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71211000/jpg/_71211631_piggy.jpg" /> 
    </item> 
    <item> 
     <title>Inflatable 1km solar chimney planned</title> 
     <description>
      Plans for a 1km (3,280ft) inflatable solar chimney are outlined by a leading balloon specialist.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25015030#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25015030
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 13:36:24 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71230000/jpg/_71230708_77244258.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71230000/jpg/_71230709_77244258.jpg" /> 
    </item> 
    <item> 
     <title>E-Sports settles Bitcoin hijack case</title> 
     <description>
      A games company agrees to a $1m (&pound;620.000) settlement to resolve allegations it broke the law by installing Bitcoin-generating code on its users' PCs.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25014477#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25014477
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 11:59:26 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71231000/jpg/_71231342_guin.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71231000/jpg/_71231343_guin.jpg" /> 
    </item> 
    <item> 
     <title>Shorter .uk net domain plan revived</title> 
     <description>
      A plan to allow people to run shorter &quot;name.uk&quot; websites rather than &quot;name.co.uk&quot; or other variants has been resurrected and will begin next year.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25006066#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25006066
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 11:11:17 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71227000/jpg/_71227265_dog.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71227000/jpg/_71227266_dog.jpg" /> 
    </item> 
    <item> 
     <title>China retains supercomputer crown</title> 
     <description>
      China's Tianhe-2 retains its status at the peak of the Top500 list of supercomputers, but IBM believes there are problems with the way it is calculated.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24984320#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24984320
     </guid> 
     <pubdate>
      Mon, 18 Nov 2013 14:32:45 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71182000/png/_71182905_chinasss.png" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71182000/png/_71182906_chinasss.png" /> 
    </item> 
    <item> 
     <title>VIDEO: Taxi drivers facing sat-nav ban</title> 
     <description>
      Taxi drivers in Bath look set to become the first in the country to be banned from using sat-navs for local journeys.
     </description> 
     <link />http://www.bbc.co.uk/news/uk-england-somerset-25023209#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/uk-england-somerset-25023209
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 15:48:00 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71236000/jpg/_71236266_71234513.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71236000/jpg/_71236267_71234513.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Is this the world's smartest cab?</title> 
     <description>
      BBC Click's Spencer Kelly looks at the Tokyo taxi that alerts you if you have left anything on the back seat.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24999364#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24999364
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 08:41:45 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71213000/jpg/_71213609_smarttaxi.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71213000/jpg/_71213610_smarttaxi.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Webscape: Peer-to-peer dining</title> 
     <description>
      Kate Russell reviews a site which allows travellers to eat at locals' houses plus other sites and apps.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24925090#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24925090
     </guid> 
     <pubdate>
      Tue, 19 Nov 2013 08:35:15 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71095000/jpg/_71095380_madewith1024.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71095000/jpg/_71095381_madewith1024.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Bono and Apple join forces for charity</title> 
     <description>
      Bono and Apple's Senior Vice President of Design Sir Jony Ive join forces to create one-of-a-kind pieces for a charity auction.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25062395#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25062395
     </guid> 
     <pubdate>
      Sat, 23 Nov 2013 00:29:19 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71298000/jpg/_71298695_71298693.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71298000/jpg/_71298696_71298693.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Making Google's Doctor Who doodle</title> 
     <description>
      BBC North America Technology Correspondent Richard Taylor meets the people behind Google's Doctor Who doodle.
     </description> 
     <link />http://www.bbc.co.uk/newsround/25039706#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/newsround/25039706
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 18:03:34 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71293000/jpg/_71293495_91e0daba-5ec1-47a6-99c2-b03d9ebab86b.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71293000/jpg/_71293496_91e0daba-5ec1-47a6-99c2-b03d9ebab86b.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Thousands attend Xbox One UK launch</title> 
     <description>
      Thousands of people attended the London launch event for Microsoft's Xbox One on Thursday night.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25048759#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25048759
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 10:28:02 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71276000/jpg/_71276774_71272847.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71276000/jpg/_71276775_71272847.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: High-tech solution to eliminate acne</title> 
     <description>
      Scan Z say their smartphone accessory and app can predict and prevent acne. Technology reporter Dave Lee tried it for himself.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24994003#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24994003
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 00:32:20 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71188000/jpg/_71188799_71188346.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71188000/jpg/_71188800_71188346.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Rival games consoles battle it out</title> 
     <description>
      The biggest names in gaming technology, PlayStation 4 and Xbox One, are going head-to-head in a battle to secure the top spot with games fans.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25043126#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25043126
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 19:45:15 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71268000/jpg/_71268531_71268529.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71268000/jpg/_71268532_71268529.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Does the internet make us forgetful?</title> 
     <description>
      Researchers in America have suggested that people are relying on the internet as an extension of our own brains, and in turn causing us to neglect our memories.
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25030392#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25030392
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 10:40:54 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71249000/jpg/_71249202_71247333.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71249000/jpg/_71249203_71247333.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: New Orleans becomes 'Silicon Bayou'</title> 
     <description>
      The BBC visits the Big Easy to see how New Orleans is reinventing itself as a hub for technology start-ups.
     </description> 
     <link />http://www.bbc.co.uk/news/magazine-25006461#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/magazine-25006461
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 02:37:28 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71219000/jpg/_71219419_neworleansphoto.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71219000/jpg/_71219420_neworleansphoto.jpg" /> 
    </item> 
    <item> 
     <title>VIDEO: Can Twitter save you in a tornado?</title> 
     <description>
      #BBCtrending looks at how digital volunteers are sifting through thousands of tweets to help emergency relief efforts after natural disasters.
     </description> 
     <link />http://www.bbc.co.uk/news/magazine-24993953#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/magazine-24993953
     </guid> 
     <pubdate>
      Tue, 19 Nov 2013 12:01:49 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71206000/jpg/_71206603_tornado.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71206000/jpg/_71206604_tornado.jpg" /> 
    </item> 
    <item> 
     <title>The new meaning of spyware</title> 
     <description>
      What happens when spooks use hacker tools?
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24931374#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24931374
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 00:25:39 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71206000/gif/_71206912_spywareindex.gif" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71206000/gif/_71206913_spywareindex.gif" /> 
    </item> 
    <item> 
     <title>How do game companies share massive files?</title> 
     <description>
      How do game companies share massive files?
     </description> 
     <link />http://www.bbc.co.uk/news/business-25037653#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/business-25037653
     </guid> 
     <pubdate>
      Fri, 22 Nov 2013 00:10:19 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71267000/jpg/_71267771_battlefieldscreenshot.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71267000/jpg/_71267772_battlefieldscreenshot.jpg" /> 
    </item> 
    <item> 
     <title>Windows Phone 8 gains momentum</title> 
     <description>
      Microsoft's mobile chief on signs of a breakthrough
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25031051#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25031051
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 12:43:07 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71255000/jpg/_71255372_belf.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71255000/jpg/_71255474_belf.jpg" /> 
    </item> 
    <item> 
     <title>Tokyo Motor Show: New cars unveiled</title> 
     <description>
      World's leading carmakers show off their latest designs
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25024904#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25024904
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 18:22:34 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71242000/jpg/_71242357_covers.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71242000/jpg/_71242358_covers.jpg" /> 
    </item> 
    <item> 
     <title>Sexting: An open letter from parents to teenagers</title> 
     <description>
      An open letter from parents to teenagers
     </description> 
     <link />http://www.bbc.co.uk/news/magazine-25000800#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/magazine-25000800
     </guid> 
     <pubdate>
      Thu, 21 Nov 2013 10:03:48 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71241000/jpg/_71241851_sexting-promo-pic.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71241000/jpg/_71241852_sexting-promo-pic.jpg" /> 
    </item> 
    <item> 
     <title>UK-built cameras heading for space station</title> 
     <description>
      UK-built space station camera to video Earth
     </description> 
     <link />http://www.bbc.co.uk/news/science-environment-25005726#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/science-environment-25005726
     </guid> 
     <pubdate>
      Wed, 20 Nov 2013 10:00:16 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71208000/jpg/_71208708_hrc.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71208000/jpg/_71208709_hrc.jpg" /> 
    </item> 
    <item> 
     <title>Flowers grow out of computer code</title> 
     <description>
      Creating flowers out of computer code
     </description> 
     <link />http://www.bbc.co.uk/news/technology-25001635#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-25001635
     </guid> 
     <pubdate>
      Tue, 19 Nov 2013 12:24:31 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71203000/jpg/_71203099_arg2.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71203000/jpg/_71203100_arg2.jpg" /> 
    </item> 
    <item> 
     <title>PS4 and Xbox One ready for battle</title> 
     <description>
      Consoles ready for next-gen battle
     </description> 
     <link />http://www.bbc.co.uk/news/technology-24899400#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa 
     <guid ispermalink="false">
      http://www.bbc.co.uk/news/technology-24899400
     </guid> 
     <pubdate>
      Fri, 15 Nov 2013 00:04:01 GMT
     </pubdate> 
     <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71125000/jpg/_71125183_games.jpg" /> 
     <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71125000/jpg/_71125184_games.jpg" /> 
    </item> 
   </channel> 
  </rss> 
 </body>
</html>
title:Global launch for Microsoft Xbox One
link:
description:The global launch of Microsoft's the Xbox One game console took place early on Friday, 22 November.
title:Opposition to mobile chat on planes
link:
description:Plans to allow mobile phone calls on commercial flights have met opposition from passengers who value peace and quiet in the skies.
title:Google patents social media helper
link:
description:If maintaining a presence on lots of social networks is a burden, Google may be able to help with software that pretends to be you.
title:US prepares for more online gambling
link:
description:Online gambling is launched in the state of New Jersey, a sign that the US may slowly be opening up to the multibillion-dollar industry.
title:Web inventor in surveillance warning
link:
description:The "growing tide of surveillance" threatens the democratic nature of the internet, warns the creator of the worldwide web, Sir Tim Berners-Lee.
title:Danish trial for Pirate Bay founder
link:
description:Pirate Bay founder Gottfrid Warg will be deported next week to Denmark to face charges of stealing confidential data.
title:Samsung in $290m payout to Apple
link:
description:A US jury rules that Samsung must pay $290m (£180m) to Apple for copying iPhone and iPad features in its devices.
title:LG promises fix for 'spying' TVs
link:
description:TV maker LG admits collecting viewing information, even after users have disabled the function, and promises an immediate fix.
title:Facebook sues over sex tape spam
link:
description:Facebook launches legal action against an alleged spammer suspected of posting fake links to a supposed sex tape of Justin Bieber and Selena Gomez.
title:Banks 'hit by net traffic hijacks'
link:
description:Repeated attacks on the way the net routes data have resulted in huge amounts of traffic being hijacked, a net monitoring company says.
title:UK 'let NSA store email addresses'
link:
description:The UK allowed the US's National Security Agency to keep mobile phone numbers and email addresses of ordinary Britons from 2007, reports say.
title:PM followed escort agency on Twitter
link:
description:David Cameron's official Twitter account followed a high-class escort agency, it has emerged.
title:Australia sites hacked amid spy row
link:
description:Hackers attack Australian government websites amid an ongoing row over reports that Australia spied on the phone calls of the Indonesian prime minister.
title:How UK banks contain cyber-threats
link:
description:Staff at the UK's big banks are regularly being caught out by malware, BBC research suggests.
title:Inflatable 1km solar chimney planned
link:
description:Plans for a 1km (3,280ft) inflatable solar chimney are outlined by a leading balloon specialist.
title:E-Sports settles Bitcoin hijack case
link:
description:A games company agrees to a $1m (£620.000) settlement to resolve allegations it broke the law by installing Bitcoin-generating code on its users' PCs.
title:Shorter .uk net domain plan revived
link:
description:A plan to allow people to run shorter "name.uk" websites rather than "name.co.uk" or other variants has been resurrected and will begin next year.
title:China retains supercomputer crown
link:
description:China's Tianhe-2 retains its status at the peak of the Top500 list of supercomputers, but IBM believes there are problems with the way it is calculated.
title:VIDEO: Taxi drivers facing sat-nav ban
link:
description:Taxi drivers in Bath look set to become the first in the country to be banned from using sat-navs for local journeys.
title:VIDEO: Is this the world's smartest cab?
link:
description:BBC Click's Spencer Kelly looks at the Tokyo taxi that alerts you if you have left anything on the back seat.
title:VIDEO: Webscape: Peer-to-peer dining
link:
description:Kate Russell reviews a site which allows travellers to eat at locals' houses plus other sites and apps.
title:VIDEO: Bono and Apple join forces for charity
link:
description:Bono and Apple's Senior Vice President of Design Sir Jony Ive join forces to create one-of-a-kind pieces for a charity auction.
title:VIDEO: Making Google's Doctor Who doodle
link:
description:BBC North America Technology Correspondent Richard Taylor meets the people behind Google's Doctor Who doodle.
title:VIDEO: Thousands attend Xbox One UK launch
link:
description:Thousands of people attended the London launch event for Microsoft's Xbox One on Thursday night.
title:VIDEO: High-tech solution to eliminate acne
link:
description:Scan Z say their smartphone accessory and app can predict and prevent acne. Technology reporter Dave Lee tried it for himself.
title:VIDEO: Rival games consoles battle it out
link:
description:The biggest names in gaming technology, PlayStation 4 and Xbox One, are going head-to-head in a battle to secure the top spot with games fans.
title:VIDEO: Does the internet make us forgetful?
link:
description:Researchers in America have suggested that people are relying on the internet as an extension of our own brains, and in turn causing us to neglect our memories.
title:VIDEO: New Orleans becomes 'Silicon Bayou'
link:
description:The BBC visits the Big Easy to see how New Orleans is reinventing itself as a hub for technology start-ups.
title:VIDEO: Can Twitter save you in a tornado?
link:
description:#BBCtrending looks at how digital volunteers are sifting through thousands of tweets to help emergency relief efforts after natural disasters.
title:The new meaning of spyware
link:
description:What happens when spooks use hacker tools?
title:How do game companies share massive files?
link:
description:How do game companies share massive files?
title:Windows Phone 8 gains momentum
link:
description:Microsoft's mobile chief on signs of a breakthrough
title:Tokyo Motor Show: New cars unveiled
link:
description:World's leading carmakers show off their latest designs
title:Sexting: An open letter from parents to teenagers
link:
description:An open letter from parents to teenagers
title:UK-built cameras heading for space station
link:
description:UK-built space station camera to video Earth
title:Flowers grow out of computer code
link:
description:Creating flowers out of computer code
title:PS4 and Xbox One ready for battle
link:
description:Consoles ready for next-gen battle

Process finished with exit code 0

Google Chrome View Source:


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="XSL_formatting" type="text/xsl" href="/shared/bsp/xsl/rss/nolsol.xsl"?>

<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">  
  <channel> 
    <title>BBC News - Technology</title>  
    <link>http://www.bbc.co.uk/news/technology/#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
    <description>The latest stories from the Technology section of the BBC News web site.</description>  
    <language>en-gb</language>  
    <lastBuildDate>Fri, 22 Nov 2013 16:32:12 GMT</lastBuildDate>  
    <copyright>Copyright: (C) British Broadcasting Corporation, see http://news.bbc.co.uk/2/hi/help/rss/4498287.stm for terms and conditions of reuse.</copyright>  
    <image> 
      <url>http://news.bbcimg.co.uk/nol/shared/img/bbc_news_120x60.gif</url>  
      <title>BBC News - Technology</title>  
      <link>http://www.bbc.co.uk/news/technology/#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <width>120</width>  
      <height>60</height> 
    </image>  
    <ttl>15</ttl>  
    <atom:link href="http://feeds.bbci.co.uk/news/technology/rss.xml" rel="self" type="application/rss+xml"/>  
    <item> 
      <title>Global launch for Microsoft Xbox One</title>  
      <description>The global launch of Microsoft's the Xbox One game console took place early on Friday, 22 November.</description>  
      <link>http://www.bbc.co.uk/news/technology-25037562#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/technology-25037562</guid>  
      <pubDate>Fri, 22 Nov 2013 00:22:32 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71263000/jpg/_71263203_71257876.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71263000/jpg/_71263204_71257876.jpg"/> 
    </item>  
    <item> 
      <title>Inflatable 1km solar chimney planned</title>  
      <description>Plans for a 1km (3,280ft) inflatable solar chimney are outlined by a leading balloon specialist.</description>  
      <link>http://www.bbc.co.uk/news/technology-25015030#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/technology-25015030</guid>  
      <pubDate>Wed, 20 Nov 2013 13:36:24 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71230000/jpg/_71230708_77244258.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71230000/jpg/_71230709_77244258.jpg"/> 
    </item>  
    <item> 
      <title>E-Sports settles Bitcoin hijack case</title>  
      <description>A games company agrees to a $1m (£620.000) settlement to resolve allegations it broke the law by installing Bitcoin-generating code on its users' PCs.</description>  
      <link>http://www.bbc.co.uk/news/technology-25014477#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/technology-25014477</guid>  
      <pubDate>Wed, 20 Nov 2013 11:59:26 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/71231000/jpg/_71231342_guin.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/71231000/jpg/_71231343_guin.jpg"/> 
    </item>  
  </channel> 
</rss>
ravindranathakila commented 10 years ago

Sorry, I had to truncate the results to fit within GitHub's length limit for issues.

ravindranathakila commented 10 years ago

Tested on this feed too. The results are the same.

<link></link> becomes <link />

http://feeds.rssboard.org/rssboard

ravindranathakila commented 10 years ago

Not a bug per se. There's a proper way to parse XML: use _Parser.xmlParser()_ as the parser argument.

final Document document = Jsoup.parse(new URL(feedUrl).openStream(), "UTF-8", feedUrl, Parser.xmlParser());
jhy commented 10 years ago

Yes, you need to use the XML parser for XML. Otherwise, the HTML parser applies HTML rules. See the correct parse at http://feeds.bbci.co.uk/news/technology/rss.xml
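
To make the difference concrete, here is a minimal sketch that parses the same string with both parsers. The one-item feed is a shortened, hypothetical stand-in for the BBC feed above, and the class and method names (`FeedLinkDemo`, `firstItemLink`) are made up for illustration. It assumes jsoup is on the classpath. Under HTML rules `<link>` is a void element, so its text content ends up outside the element, which matches the empty `link:` lines in the output above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class FeedLinkDemo {
    // Hypothetical one-item feed, shortened from the BBC feed in this issue.
    static final String FEED =
        "<rss version=\"2.0\"><channel><item>"
        + "<title>PS4 and Xbox One ready for battle</title>"
        + "<link>http://www.bbc.co.uk/news/technology-24899400</link>"
        + "</item></channel></rss>";

    // Returns the text of the first <link> inside the first <item>.
    static String firstItemLink(Document doc) {
        Element item = doc.getElementsByTag("item").first();
        return item.getElementsByTag("link").first().text();
    }

    public static void main(String[] args) {
        // HTML rules: <link> is void, so the URL becomes a sibling text node.
        Document asHtml = Jsoup.parse(FEED, "", Parser.htmlParser());
        // XML rules: <link> is an ordinary element and keeps its text.
        Document asXml = Jsoup.parse(FEED, "", Parser.xmlParser());

        System.out.println("HTML parser link: [" + firstItemLink(asHtml) + "]");
        System.out.println("XML parser link:  [" + firstItemLink(asXml) + "]");
    }
}
```

The HTML parse should print an empty link and the XML parse should print the URL. The same `Parser.xmlParser()` argument works with the stream overload used in the comment above.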