Description
If id_list consists of a single nonexistent (but valid) ID, arXiv returns an empty feed, which is interpreted to mean "no results."
If id_list consists of both existent and nonexistent valid IDs (["0000.0000", "1707.08567"]), the feed is non-empty (it contains a single item), but it has feed.feed.opensearch_totalresults == 2. The client takes this to be a partial page and requests a page with offset 1, which lists paper 1707.08567 again. This is an API bug.
Notably, this behavior differs depending on the nonexistent ID. Nonexistent ID 1507.58567 yields an entry with missing fields (covered in #80, fixed by #82), whereas 1407.58567 yields no entries at all (covered here).
Example: https://export.arxiv.org/api/query?id_list=1407.58567,1707.08567
Steps to reproduce
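A minimal reproduction sketch, using only the Python standard library rather than arxiv.py itself. The helper names summarize_feed and fetch_feed are hypothetical and exist only for this illustration. It assumes the feed is Atom with the OpenSearch 1.1 totalResults element and that the API accepts a start offset. The idea is to compare the reported total against the number of entries actually returned.

```python
import urllib.request
import xml.etree.ElementTree as ET

# XML namespaces assumed for an Atom feed with OpenSearch extensions.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
}


def summarize_feed(xml_data):
    """Return (total_results, entry_ids) for an Atom feed given as str or bytes."""
    root = ET.fromstring(xml_data)
    total_text = root.findtext("opensearch:totalResults", default="0", namespaces=NS)
    entry_ids = [
        entry.findtext("atom:id", default="", namespaces=NS)
        for entry in root.findall("atom:entry", NS)
    ]
    return int(total_text), entry_ids


def fetch_feed(id_list, start=0):
    """Fetch one page of results for the given IDs (hypothetical helper; needs network)."""
    url = "https://export.arxiv.org/api/query?id_list={}&start={}".format(
        ",".join(id_list), start
    )
    with urllib.request.urlopen(url) as response:
        return response.read()


# Example usage (requires network access):
#   total, ids = summarize_feed(fetch_feed(["1407.58567", "1707.08567"]))
#   if total > len(ids):
#       print("totalResults is", total, "but only", len(ids), "entries were returned")
```

If the behavior described above still holds, the reported total would exceed the number of entries on the first page, and a follow-up request with start=1 would return 1707.08567 again.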
Expected behavior
Results should not be duplicated.
Searching for ["0000.0000", "1707.08567"] should yield a single result.
Versions
python version: 3.7.9
arxiv.py version: 1.4.1