kurtmckee / feedparser

Parse feeds in Python
https://feedparser.readthedocs.io

Parsing multiple <dc:identifier> #145

Open herrlich10 opened 6 years ago

herrlich10 commented 6 years ago

Hi,

Thank you for providing this wonderful package. I notice that an RSS feed item may sometimes contain multiple <dc:identifier> elements, but feedparser.parse() only returns one of them, presumably the last one. Is it possible to retrieve all of them as a list?

One example of such a multi-identifier feed is http://science.sciencemag.org/rss/current.xml. It provides both a doi and a resource-id, but only the resource-id is returned.

Thanks!
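In the meantime, a workaround sketch (not feedparser itself): since feedparser keeps only one value per element name, you can re-parse the raw feed XML with the standard library and collect every <dc:identifier> yourself. The sample feed and identifier values below are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI for Dublin Core, as used by <dc:identifier>.
DC = {"dc": "http://purl.org/dc/elements/1.1/"}

def all_identifiers(xml_text):
    """Return, for each <item>, a list of every <dc:identifier> value in document order."""
    root = ET.fromstring(xml_text)
    return [
        [el.text for el in item.findall("dc:identifier", DC)]
        for item in root.iter("item")
    ]

# Minimal stand-in for a feed like the one above (identifier values invented):
sample = """\
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Example article</title>
      <dc:identifier>doi:10.0000/example</dc:identifier>
      <dc:identifier>resource-id:example-resource</dc:identifier>
    </item>
  </channel>
</rss>"""

print(all_identifiers(sample))
# [['doi:10.0000/example', 'resource-id:example-resource']]
```

For a live feed you would fetch the bytes yourself (e.g. with urllib) and pass them to `all_identifiers`, possibly alongside a normal `feedparser.parse()` call for everything else.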

CodeTheInternet commented 5 years ago

Looks like identifier needs start and end handler methods, similar to contributor's, in the namespace module namespaces/dc.py.

priyam-maheshwari commented 5 years ago

Has this been fixed? I'm facing the same issue with the feed https://bmjopenrespres.bmj.com/rss/current.xml

loic-bellinger commented 5 years ago

I'm also interested to know whether this kind of parsing can be done with feedparser.

lukasschwab commented 3 years ago

I think I'm encountering this issue with namespaced subelements of author elements in my package arxiv.py.

Here's an entry of the feed in question:

<entry>
  <id>http://arxiv.org/abs/2104.07569v1</id>
  <updated>2021-04-15T16:24:56Z</updated>
  <published>2021-04-15T16:24:56Z</published>
  <title>AffectiveNet: Affective-Motion Feature Learningfor Micro Expression Recognition</title>
  <summary>  Micro-expressions are hard to spot due to fleeting and involuntary moments of facial muscles. Interpretation of micro emotions from video clips is a challenging task. In this paper we propose an affective-motion imaging that cumulates rapid and short-lived variational information of micro expressions into a single response. Moreover, we have proposed an AffectiveNet:affective-motion feature learning network that can perceive subtle changes and learns the most discriminative dynamic features to describe the emotion classes. The AffectiveNet holds two blocks: MICRoFeat and MFL block. MICRoFeat block conserves the scale-invariant features, which allows network to capture both coarse and tiny edge variations. While MFL block learns micro-level dynamic variations from two different intermediate convolutional layers. Effectiveness of the proposed network is tested over four datasets by using two experimental setups: person independent (PI) and cross dataset (CD) validation. The experimental results of the proposed network outperforms the state-of-the-art approaches with significant margin for MER approaches.</summary>
  <author>
    <name>Monu Verma</name>
    <arxiv:affiliation xmlns:arxiv="http://arxiv.org/schemas/atom">Student, Member, IEEE</arxiv:affiliation>
  </author>
  <author>
    <name>Santosh Kumar Vipparthi</name>
    <arxiv:affiliation xmlns:arxiv="http://arxiv.org/schemas/atom">Member, IEEE</arxiv:affiliation>
  </author>
  <author>
    <name>Girdhari Singh</name>
  </author>
  <link href="http://arxiv.org/abs/2104.07569v1" rel="alternate" type="text/html"/>
  <link title="pdf" href="http://arxiv.org/pdf/2104.07569v1" rel="related" type="application/pdf"/>
  <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom" term="cs.MM" scheme="http://arxiv.org/schemas/atom"/>
  <category term="cs.MM" scheme="http://arxiv.org/schemas/atom"/>
</entry>

The issue is with how the arxiv:affiliation subelements are parsed: only one of them appears in the resulting FeedParserDict, and it appears on the top-level dict instead of being associated with its author. As a result, some author affiliations are lost entirely, and it's unclear which author the surviving affiliation belongs to.

The full FeedParserDict for the entry produced by feedparser.parse():

{
  "id": "http://arxiv.org/abs/2104.07569v1",
  "guidislink": true,
  "link": "http://arxiv.org/abs/2104.07569v1",
  "updated": "2021-04-15T16:24:56Z",
  "updated_parsed": [2021, 4, 15, 16, 24, 56, 3, 105, 0],
  "published": "2021-04-15T16:24:56Z",
  "published_parsed": [2021, 4, 15, 16, 24, 56, 3, 105, 0],
  "title": "AffectiveNet: Affective-Motion Feature Learningfor Micro Expression\n  Recognition",
  "title_detail": {
    "type": "text/plain",
    "language": null,
    "base": "http://export.arxiv.org/api/query?search_query=&id_list=2104.07569v1&sortBy=relevance&sortOrder=descending&start=0&max_results=100",
    "value": "AffectiveNet: Affective-Motion Feature Learningfor Micro Expression\n  Recognition"
  },
  "summary": "Micro-expressions are hard to spot due to fleeting and involuntary moments of\nfacial muscles. Interpretation of micro emotions from video clips is a\nchallenging task. In this paper we propose an affective-motion imaging that\ncumulates rapid and short-lived variational information of micro expressions\ninto a single response. Moreover, we have proposed an\nAffectiveNet:affective-motion feature learning network that can perceive subtle\nchanges and learns the most discriminative dynamic features to describe the\nemotion classes. The AffectiveNet holds two blocks: MICRoFeat and MFL block.\nMICRoFeat block conserves the scale-invariant features, which allows network to\ncapture both coarse and tiny edge variations. While MFL block learns\nmicro-level dynamic variations from two different intermediate convolutional\nlayers. Effectiveness of the proposed network is tested over four datasets by\nusing two experimental setups: person independent (PI) and cross dataset (CD)\nvalidation. The experimental results of the proposed network outperforms the\nstate-of-the-art approaches with significant margin for MER approaches.",
  "summary_detail": {
    "type": "text/plain",
    "language": null,
    "base": "http://export.arxiv.org/api/query?search_query=&id_list=2104.07569v1&sortBy=relevance&sortOrder=descending&start=0&max_results=100",
    "value": "Micro-expressions are hard to spot due to fleeting and involuntary moments of\nfacial muscles. Interpretation of micro emotions from video clips is a\nchallenging task. In this paper we propose an affective-motion imaging that\ncumulates rapid and short-lived variational information of micro expressions\ninto a single response. Moreover, we have proposed an\nAffectiveNet:affective-motion feature learning network that can perceive subtle\nchanges and learns the most discriminative dynamic features to describe the\nemotion classes. The AffectiveNet holds two blocks: MICRoFeat and MFL block.\nMICRoFeat block conserves the scale-invariant features, which allows network to\ncapture both coarse and tiny edge variations. While MFL block learns\nmicro-level dynamic variations from two different intermediate convolutional\nlayers. Effectiveness of the proposed network is tested over four datasets by\nusing two experimental setups: person independent (PI) and cross dataset (CD)\nvalidation. The experimental results of the proposed network outperforms the\nstate-of-the-art approaches with significant margin for MER approaches."
  },
  "authors": [
    {"name": "Monu Verma"},
    {"name": "Santosh Kumar Vipparthi"},
    {"name": "Girdhari Singh"}
  ],
  "author_detail": {
    "name": "Girdhari Singh"
  },
  "arxiv_affiliation": "Member, IEEE",
  "author": "Girdhari Singh",
  "links": [
    {"href": "http://arxiv.org/abs/2104.07569v1", "rel": "alternate", "type": "text/html"},
    {"title": "pdf", "href": "http://arxiv.org/pdf/2104.07569v1", "rel": "related", "type": "application/pdf"}
  ],
  "arxiv_primary_category": {"term": "cs.MM", "scheme": "http://arxiv.org/schemas/atom"},
  "tags": [{"term": "cs.MM", "scheme": "http://arxiv.org/schemas/atom", "label": null}]
}
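As a stopgap, the author/affiliation pairing can be recovered from the raw entry XML with the standard library instead of relying on feedparser's flattened dict. A hedged sketch, using the entry above trimmed to its author elements (with the default Atom xmlns added explicitly, since the snippet is a fragment of a feed that declares it on <feed>):

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

def authors_with_affiliations(entry_xml):
    """Pair each <author>'s name with its <arxiv:affiliation>, or None if absent."""
    entry = ET.fromstring(entry_xml)
    return [
        (
            author.findtext("atom:name", namespaces=NS),
            author.findtext("arxiv:affiliation", namespaces=NS),
        )
        for author in entry.findall("atom:author", NS)
    ]

entry = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <author>
    <name>Monu Verma</name>
    <arxiv:affiliation>Student, Member, IEEE</arxiv:affiliation>
  </author>
  <author>
    <name>Santosh Kumar Vipparthi</name>
    <arxiv:affiliation>Member, IEEE</arxiv:affiliation>
  </author>
  <author>
    <name>Girdhari Singh</name>
  </author>
</entry>"""

print(authors_with_affiliations(entry))
# [('Monu Verma', 'Student, Member, IEEE'),
#  ('Santosh Kumar Vipparthi', 'Member, IEEE'),
#  ('Girdhari Singh', None)]
```

This keeps each affiliation attached to the right author, which the flattened FeedParserDict above cannot express.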

Edit: I think this issue may be a duplicate of #39; the discussion there includes some work-arounds which I haven't tested.