adsabs / adsabs-dev-api

Developer API service description and example client code

Query of library content returns limited number of items #42

Open dlakaplan opened 6 years ago

dlakaplan commented 6 years ago

I am trying to follow the logic at https://github.com/adsabs/adsabs-dev-api/blob/master/Libraries_API.ipynb (or via python at https://github.com/adsabs/adsabs-dev-api/blob/master/Converting_curl_to_python.ipynb) to get all of the bibcodes associated with a library. However, while it returns the correct number of bibcodes in the metadata, the number of actual bibcodes returned is far smaller. e.g.,

{u'documents': [u'2018MNRAS.478.1784B', u'2018MNRAS.478.2835L', u'2018MNRAS.478.1763L', u'2018MNRAS.477.5167Z', u'2018ApJ...859...93L', u'2018arXiv180603136C', u'2018arXiv180606076J', u'2018ApJ...859...47A', u'2018ApJ...858L..15D', u'2018ApJ...857..131K', u'2018ApJS..235...37A', u'2018ApJ...857...11T', u'2018ApJ...855..122V', u'2018ApJ...855...14K', u'2018ApJ...854L..13H', u'2018Natur.554..207M', u'2017Sci...358.1579H', u'2017PASA...34...70X', u'2017Sci...358.1559K', u'2017PASA...34...69A'], u'solr': {u'responseHeader': {u'status': 0, u'QTime': 10, u'params': {u'sort': u'date desc', u'fq': u'{!bitset}', u'rows': u'20', u'q': u':', u'start': u'0', u'wt': u'json', u'fl': u'bibcode,alternate_bibcode'}}, u'response': {u'start': 0, u'numFound': 198, u'docs': [{u'alternate_bibcode': [u'2018arXiv180411048B', u'2018MNRAS.tmp.1098B'], u'bibcode': u'2018MNRAS.478.1784B'}, {u'alternate_bibcode': [u'2018arXiv180505482L', u'2018MNRAS.tmp.1255L'], u'bibcode': u'2018MNRAS.478.2835L'}, {u'alternate_bibcode': [u'2018arXiv180411006L', u'2018MNRAS.tmp.1077L'], u'bibcode': u'2018MNRAS.478.1763L'}, {u'alternate_bibcode': [u'2018arXiv180407060Z', u'2018MNRAS.tmp..905Z'], u'bibcode': u'2018MNRAS.477.5167Z'}, {u'alternate_bibcode': [u'2018arXiv180504951L'], u'bibcode': u'2018ApJ...859...93L'}, {u'bibcode': u'2018arXiv180603136C'}, {u'bibcode': u'2018arXiv180606076J'}, {u'alternate_bibcode': [u'2018arXiv180102617A'], u'bibcode': u'2018ApJ...859...47A'}, {u'alternate_bibcode': [u'2018arXiv180306853D'], u'bibcode': u'2018ApJ...858L..15D'}, {u'alternate_bibcode': [u'2018arXiv180303587K'], u'bibcode': u'2018ApJ...857..131K'}, {u'alternate_bibcode': [u'2018arXiv180101837A'], u'bibcode': u'2018ApJS..235...37A'}, {u'alternate_bibcode': [u'2018arXiv180209276T'], u'bibcode': u'2018ApJ...857...11T'}, {u'alternate_bibcode': [u'2017arXiv171111063V'], u'bibcode': u'2018ApJ...855..122V'}, {u'alternate_bibcode': [u'2018arXiv180109598K'], u'bibcode': u'2018ApJ...855...14K'}, {u'alternate_bibcode': 
[u'2017arXiv171200949H'], u'bibcode': u'2018ApJ...854L..13H'}, {u'alternate_bibcode': [u'2017arXiv171111573M'], u'bibcode': u'2018Natur.554..207M'}, {u'alternate_bibcode': [u'2017arXiv171005435H'], u'bibcode': u'2017Sci...358.1579H'}, {u'alternate_bibcode': [u'2017arXiv171108933X'], u'bibcode': u'2017PASA...34...70X'}, {u'alternate_bibcode': [u'2017arXiv171005436K'], u'bibcode': u'2017Sci...358.1559K'}, {u'alternate_bibcode': [u'2017arXiv171005846A'], u'bibcode': u'2017PASA...34...69A'}]}}, u'updates': {u'update_list': [], u'num_updated': 0, u'duplicates_removed': 0}, u'metadata': {u'num_documents': 198, u'description': u'My ADS library', u'name': u'Kaplan', u'permission': u'owner', u'id': u'AtijQpcVQomL3joNFBVn2A', u'num_users': 1, u'owner': u'kaplan', u'date_created': u'2018-07-05T18:16:21.669709', u'public': True, u'date_last_modified': u'2018-07-05T18:16:26.738949'}}

It claims that there are 198 documents, but only 20 are returned. I tried specifying the number of rows, either by appending &rows=30 to the URL or by passing params={"rows": 30} in Python, but neither works: the former gives me a server error, while the latter is ignored.

Is there any way around this limit?

When will a native Python library for library access be available?

David

ghost commented 6 years ago

Hi David

We have some example code on Github that should be helpful: https://github.com/adsabs/ads-examples/tree/master/library_csv

It shows you a way to retrieve all contents of a library. I'm sure we'll have a more elegant programmatic way for interacting with libraries, at some point.

I hope this helps.

Edwin

--

Edwin Henneken, ehenneken@cfa.harvard.edu, IT Specialist, NASA Astrophysics Data System, Harvard-Smithsonian Center for Astrophysics (http://adslabs.org, http://ads.harvard.edu), 60 Garden St. MS 83, Cambridge, MA 02138, Room P-129, ORCID 0000-0003-4264-2450
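
For readers who want a feel for what "retrieving all contents" could look like without opening the linked example, here is a minimal sketch of paging through a library 20 documents at a time. It is not the code from the ads-examples repository. It assumes that the library endpoint honors `start` and `rows` query parameters (the Solr header in the response above reports both, but this thread does not confirm that they can be set by the client), that the base URL below is correct, and that authentication uses a bearer token. The names `make_fetcher` and `collect_bibcodes` are hypothetical helpers invented for this illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the libraries endpoint; verify against the API docs.
API_BASE = "https://api.adsabs.harvard.edu/v1/biblib/libraries"


def make_fetcher(library_id, token):
    """Return a function that fetches one page of a library as a dict."""
    def fetch_page(start, rows):
        # Assumption: the endpoint accepts `start` and `rows` parameters.
        query = urllib.parse.urlencode({"start": start, "rows": rows})
        request = urllib.request.Request(
            f"{API_BASE}/{library_id}?{query}",
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page


def collect_bibcodes(fetch_page, rows=20):
    """Request successive pages until all bibcodes have been collected."""
    bibcodes = []
    seen = set()
    while True:
        page = fetch_page(len(bibcodes), rows)
        # Keep only bibcodes not seen before, so a server that ignores
        # `start` and repeats the first page cannot cause an endless loop.
        new = [b for b in page.get("documents", []) if b not in seen]
        bibcodes.extend(new)
        seen.update(new)
        total = page.get("metadata", {}).get("num_documents", 0)
        if not new or len(bibcodes) >= total:
            return bibcodes
```

With a valid token, `collect_bibcodes(make_fetcher("AtijQpcVQomL3joNFBVn2A", token))` would be the intended call for the library shown above. Passing the page fetcher in as an argument keeps the paging loop independent of the HTTP details, so it can be exercised with a stub before touching the real service.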


Mulan-94 commented 2 years ago

Hi @dlakaplan, did you ever find a way around this? I'm getting the same result as you, and I can't find any docs detailing it (or I'm not looking hard enough). I'm following @ehenneken's link, but I'm curious whether the "more elegant programmatic way for interacting with libraries" he mentioned has been found yet :)