davorg-cpan / xml-feed

The CPAN module XML::Feed

Relative Link attributes not being processed properly [rt.cpan.org #53661] #45

Open atoomic opened 5 years ago

atoomic commented 5 years ago

Migrated from rt.cpan.org#53661 (status was 'open')

Requestors:

From jwieland@gmail.com on 2010-01-13 18:00:01 :

The link attribute does not get processed correctly from:  
http://earthquake.usgs.gov/earthquakes/catalogs/7day-M5.xml

Notice the links are relative and should be resolved against the base 
attribute.  To test, here is a subset of the XML file.  Just copy it into a 
file named 7day-M5.xml

<?xml version="1.0"?>
<feed xml:base="http://earthquake.usgs.gov/" 
xmlns="http://www.w3.org/2005/Atom" 
xmlns:georss="http://www.georss.org/georss">
  <updated>2010-01-13T17:24:37Z</updated>
  <title>USGS M5+ Earthquakes</title>
  <subtitle>Real-time, worldwide earthquake list for the past 7 
days</subtitle>
  <link rel="self" href="/earthquakes/catalogs/7day-M5.xml"/>
  <link href="http://earthquake.usgs.gov/earthquakes/"/>
  <author><name>U.S. Geological Survey</name></author>
  <id>http://earthquake.usgs.gov/</id>
  <icon>/favicon.ico</icon>
  <entry><id>urn:earthquake-usgs-gov:us:2010rkb8</id><title>M 5.3, 
Tonga</title><updated>2010-01-13T16:21:24Z</updated><link 
rel="alternate" type="text/html" 
href="/earthquakes/recenteqsww/Quakes/us2010rkb8.php"/><summary 
type="html"><![CDATA[<img 
src="http://earthquake.usgs.gov/images/globes/-15_-175.jpg" 
alt="15.741&#176;S 174.695&#176;W" align="left" hspace="20" 
/><p>Wednesday, January 13, 2010 16:21:24 UTC<br>Thursday, January 14, 
2010 06:21:24 AM at epicenter</p><p><strong>Depth</strong>: 10.00 km 
(6.21 mi)</p>]]></summary><georss:point>-15.7409 
-174.6951</georss:point><georss:elev>-10000</georss:elev><category 
label="Age" term="Past day"/></entry>
</feed>

Then run this command:
 perl -MXML::Feed -e 'my $feed = XML::Feed->parse("7day-M5.xml"); foreach my $e ($feed->entries) { print $e->link, "\n"; }'

I could not figure out if this bug lies in XML::Feed or XML::Atom.  I 
ran the code through the debugger but for the life of me could not tell 
how it works.

Thanks for your time.

Jason

From davecross@cpan.org on 2016-02-13 09:59:48 :

It's not clear to me that this is a bug. Is there some standard which says that we should be returning absolute links if the feed contains relative links?

But whether or not the current method does the right thing, we are simply passing on the value that we get from XML::Atom. Adapting your example, we get:

$ perl -MXML::Atom::Feed -e 'my $feed = XML::Atom::Feed->new("7day-M5.xml"); foreach my $e ($feed->entries) { print $e->link->href , "\n"; } '
/earthquakes/recenteqsww/Quakes/us2010rkb8.php

So if there is a bug, it is a bug in XML::Atom and should be reported there.
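A caller who wants absolute URLs regardless of where the behaviour belongs can resolve the returned value against the feed's xml:base on their own side. What follows is only a minimal sketch, assuming the URI module from CPAN is installed. The helper name `resolve_link` is hypothetical, and the base and href are copied by hand from the sample feed above rather than read from the parsed feed.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not part of XML::Feed or XML::Atom):
# resolve a possibly relative href against a base URL.
# An href that is already absolute comes back unchanged.
sub resolve_link {
    my ($base, $href) = @_;
    return URI->new_abs($href, $base)->as_string;
}

# Values copied from the sample feed in this ticket.
my $base = 'http://earthquake.usgs.gov/';
my $href = '/earthquakes/recenteqsww/Quakes/us2010rkb8.php';

# Prints http://earthquake.usgs.gov/earthquakes/recenteqsww/Quakes/us2010rkb8.php
print resolve_link($base, $href), "\n";
```

Doing the resolution in the caller leaves the value returned by XML::Feed untouched, which fits the point that XML::Feed only passes on what XML::Atom gives it.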

Dave...

From davecross@cpan.org on 2016-02-13 10:05:28 :

I have just checked and, given a feed containing relative links, XML::RSS has exactly the same behaviour (the relative links are not converted to absolute links). I'm therefore becoming more convinced that our current behaviour is correct.

Dave...

atoomic commented 5 years ago

I do not think there is a bug here; it was more a question about relative links. I think they are processed correctly.

I've added a unit test via 2b3178e, based on the provided one-liner.
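The contents of that commit are not reproduced here. As an illustration only, a test that pins down the pass-through behaviour discussed in this ticket might look roughly like the sketch below. It assumes XML::Feed and Test::More are installed, and the trimmed feed, variable names, and test description are made up for this example rather than taken from the commit.

```perl
use strict;
use warnings;
use Test::More;
use XML::Feed;

# Trimmed copy of the sample feed from this ticket: one entry, relative link.
my $xml = <<'XML';
<?xml version="1.0"?>
<feed xml:base="http://earthquake.usgs.gov/" xmlns="http://www.w3.org/2005/Atom">
  <title>USGS M5+ Earthquakes</title>
  <id>http://earthquake.usgs.gov/</id>
  <updated>2010-01-13T17:24:37Z</updated>
  <entry>
    <id>urn:earthquake-usgs-gov:us:2010rkb8</id>
    <title>M 5.3, Tonga</title>
    <updated>2010-01-13T16:21:24Z</updated>
    <link rel="alternate" type="text/html"
          href="/earthquakes/recenteqsww/Quakes/us2010rkb8.php"/>
  </entry>
</feed>
XML

# A scalar reference makes XML::Feed->parse treat the argument as XML text
# instead of a file name.
my $feed = XML::Feed->parse(\$xml) or die XML::Feed->errstr;

my @links = map { $_->link } $feed->entries;

# Assumption taken from this thread: the relative href is passed through as-is.
is_deeply(
    \@links,
    ['/earthquakes/recenteqsww/Quakes/us2010rkb8.php'],
    'relative entry link is returned unchanged',
);

done_testing;
```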

We should close this case.