Closed GoogleCodeExporter closed 9 years ago
Attached is minimal example of the problem. It contains two items. The one
without
GUID gets image url of enclosure as id.
Original comment by floridan...@gmail.com
on 4 Jun 2010 at 3:00
Attachments:
This behavior is expected due to the advent of iTunes podcast feeds,
apparently. There are many unit tests in feedparser's itunes test directories
that specifically check to make sure that enclosure URLs are mapped to item IDs
if no ID is present.
I recommend closing this bug.
Original comment by kurtmckee
on 4 Dec 2010 at 3:37
Marked as WontFix since it's actually a feature not a bug. See the following
for background information:
http://techbrew.net/articles/200709/rss-the-guid-problem/
http://www.xml.com/pub/a/2004/08/18/pilgrim.html
http://diveintomark.org/archives/2002/09/26/rss_20_template
Original comment by adewale
on 4 Dec 2010 at 10:49
Is there then currently some other way to check whether an actual guid is given
in the feed? If not, then I'm not sure if this should be closed WontFix. As per
the link you give, people may want to implement a specific strategies of how to
identify items in case a guid is missing; those strategies may not be
compatible with using the enclosure as the first fallback.
In fact, I am writing a podcatcher-like application, and this is precisely such
a case; since the enclosure to me is very much the piece of attached *data*, I
specifically want to give publishers the ability to change the enclosure url
without detecting this as a new item. This is one of the reasons why I am
currently running a patched version of feedparser.
In addition, considering that Atom feeds allow more than one enclosure, and we
arbitrarily need to choose the first to use as a guid, this policy of using the
enclosure as a guid by default seems a bit like a kludge to me.
While I personally believe that feedparser itself should not select the
enclosure as a fallback at all, and instead require the user to deal with a
missing guid (really only the case for people writing an actual aggregator in
the first place), if we do chose to do this, I would at least advocate an
alternative way to get to the raw guid/atom:id.
Original comment by elsdoer...@gmail.com
on 5 Dec 2010 at 6:28
+1 on elsdoerfer's comment above. It should at least be configurable with an
argument to parse(), since this clearly isn't desirable behaviour in some cases.
Original comment by nickmurd...@gmail.com
on 5 Dec 2010 at 8:22
@elsdoerfer: Well said. You're correct, feedparser has included
application-specific logic that should be left to developers. It was
implemented to mimic how iTunes (the application) handles missing guids [1],
but as you've pointed out, that decision should be left to developers to decide
for their particular use cases.
I've attached a patch, and included a list of test cases that are no longer
necessary after the patch is applied. The changes are also available in a git
branch at:
https://github.com/kurtmckee/feedparser/tree/issue131
@Adewale: I think elsdoerfer has made an excellent point, and I strongly
recommend reconsidering the decision to close this report.
[1]: http://www.apple.com/itunes/podcasts/specs.html
"If you omit the guid for an episode, the episode url will be used instead."
Tests to remove:
/illformed/itunes/itunes_enclosure_url_maps_id.xml
/illformed/itunes/itunes_enclosure_url_maps_id_2.xml
/wellformed/itunes/itunes_enclosure_url_maps_id.xml
/wellformed/itunes/itunes_enclosure_url_maps_id_2.xml
/wellformed/itunes/itunes_link_enclosure_maps_id.xml
/wellformed/itunes/itunes_link_enclosure_maps_id_2.xml
Original comment by kurtmckee
on 6 Dec 2010 at 10:34
Attachments:
This is minor, but for fwiw, line 17 in the patch can also be removed, since
the href variable is no longer referenced.
Original comment by elsdoer...@gmail.com
on 7 Dec 2010 at 2:27
Good catch! Git branch updated, new patch file attached. Tested in Python 2.4 -
2.7.
Original comment by kurtmckee
on 7 Dec 2010 at 8:01
Attachments:
Apple makes little distinction between iTunes the product, iTunes the store and
the iTunes extensions for podcasting. They even go so far as to tell people
that the way to verify their feeds are correct is to point iTunes (the product)
at the feed.
On the one hand this means that the only correct way to implement the iTunes
extensions is to rigidly follow whatever iTunes (the product) does. On the
other hand this means attempting to mimic a closed source piece of software
rather a specification.
I've applied the patch (in revision 332) because I don't think we should be in
the business of trying to reverse engineer application functionality.
Although if the spec changes in a way that clears this up, I'm open to
revisiting this issue.
Original comment by adewale
on 12 Dec 2010 at 11:49
Original issue reported on code.google.com by
nickmurd...@gmail.com
on 5 Aug 2008 at 10:30Attachments: