lukasschwab / arxiv.py

Python wrapper for the arXiv API
MIT License
1.11k stars 123 forks

Author affiliations missing from `Result.Author`s #62

Open lukasschwab opened 3 years ago

lukasschwab commented 3 years ago

Description

Author affiliations are available in raw arXiv API feeds, but are not exposed by this package's Result objects.
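To illustrate what "available in raw arXiv API feeds" means, here is a minimal sketch that reads per-author affiliations directly from an Atom entry using only the standard library. The sample entry is made up for illustration (the names and institutions are placeholders), and `affiliations_by_author` is a hypothetical helper, not part of this package. It assumes affiliations appear as `arxiv:affiliation` children of each `author` element under the `http://arxiv.org/schemas/atom` namespace.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for arXiv API Atom feeds.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Hypothetical entry, shaped like an arXiv API result; all values are placeholders.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <title>Placeholder title</title>
  <author>
    <name>Author One</name>
    <arxiv:affiliation>Example University</arxiv:affiliation>
  </author>
  <author>
    <name>Author Two</name>
    <arxiv:affiliation>Example Institute</arxiv:affiliation>
    <arxiv:affiliation>Example Lab</arxiv:affiliation>
  </author>
  <author>
    <name>Author Three</name>
  </author>
</entry>
"""


def affiliations_by_author(entry_xml):
    """Return a list of (author name, [affiliations]) pairs for one Atom entry."""
    entry = ET.fromstring(entry_xml)
    pairs = []
    for author in entry.findall("atom:author", NS):
        name = author.findtext("atom:name", default="", namespaces=NS)
        # An author may carry zero, one, or several affiliation elements.
        affiliations = [
            el.text or "" for el in author.findall("arxiv:affiliation", NS)
        ]
        pairs.append((name, affiliations))
    return pairs


for name, affiliations in affiliations_by_author(SAMPLE_ENTRY):
    print(name, affiliations)
```

Parsing the XML directly keeps every affiliation attached to its author, which is the information the `Result` objects currently do not expose.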

Steps to reproduce

Apparent for any result set.

Expected behavior

Author affiliations should be exposed by the Result.Author class.

Versions

Additional context

This is a long-open issue in feedparser, perhaps open since 2015: https://github.com/kurtmckee/feedparser/issues/24. There's a detailed breakdown of the interaction with arXiv results here: https://github.com/kurtmckee/feedparser/issues/145#issuecomment-821762233. I suspect arXiv will release their JSON API, and this client library will be rewritten to use it, before this feedparser bug is resolved.

This client library could expose the single author affiliation extracted by feedparser, but this has negative impacts:

If the single author affiliation is useful in your application, despite the noted downsides, access it with `(Result)._raw.get('arxiv_affiliation')`.
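As a rough sketch of that workaround, the helper below wraps the `_raw` lookup so a missing key or missing attribute yields `None` instead of an exception. `single_affiliation` is a hypothetical name, and the `SimpleNamespace` object is only a stand-in for a real `Result` so the snippet runs without network access. Since `_raw` is a private attribute, it may change between releases.

```python
from types import SimpleNamespace


def single_affiliation(result):
    """Return the one affiliation feedparser extracted, or None if absent.

    `result` is expected to be a Result-like object whose `_raw` attribute
    behaves like a mapping (an assumption based on the access pattern above).
    """
    raw = getattr(result, "_raw", None)
    if raw is None:
        return None
    return raw.get("arxiv_affiliation")


# Stand-in for a real Result; the affiliation value is a placeholder.
fake_result = SimpleNamespace(_raw={"arxiv_affiliation": "Example University"})
print(single_affiliation(fake_result))
```

With a real result set, the same call would be applied to each `Result`; the value is a single string for the whole entry, not a per-author mapping.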