cambialens / lens-api-doc


Pulling Data from API #38

Closed Seokminoh-am closed 3 years ago

Seokminoh-am commented 3 years ago

Is it possible for me to get the data from the API even if the variable defined is missing for some articles? For instance, when I run my code, it seems that it would run and export the dataset in a csv file for variables such as lens_id which is defined for all research papers. However, once I start including things like author.affiliation.name.exact or author.first.name, the code runs into an error. This is huge for me as there are many variables that I would need from the server. I can also attach the code for your reference. Thank you.

AaronBallagh commented 3 years ago

Hello Seokminoh-am, It is possible to retrieve results that do not have specified metadata fields, but it really depends on your query. Please do attach the code to help us better understand your use case.

Seokminoh-am commented 3 years ago

Here is my code. Thank you so much.


AaronBallagh commented 3 years ago

Hi Seokminoh-am, it doesn't look like the code was attached to your reply. I believe you will need to attach files to the comment on GitHub rather than replying by email with the attachment.

Seokminoh-am commented 3 years ago

Sorry for that. I will do that right now.


Seokminoh-am commented 3 years ago

I hope that this works. Please let me know! Lens-Update-2.pdf

AaronBallagh commented 3 years ago

Thanks Seokminoh-am, that worked. It looks like you are using request (search) field names in the include list of your query rather than response field names. If you change the included fields to use the response field names, the API will return the results whether or not the records have values for those fields.
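
A minimal sketch of this suggestion follows. It assumes response field names such as `lens_id` and `authors` (with nested `first_name` and `affiliations` entries); check the response schema in the docs for the exact names. The helpers `build_request_body`, `flatten_record` and `records_to_csv`, and the sample records, are hypothetical. The point is that `include` lists response fields, and the export step leaves a blank cell when a record lacks a field instead of raising an error.

```python
import csv
import io


def build_request_body(query, include, size=100):
    """Build a search request body that projects response fields via `include`."""
    return {"query": query, "include": include, "size": size}


def flatten_record(record):
    """Turn one record into a flat CSV row, leaving blanks for missing fields."""
    authors = record.get("authors") or []
    first_names = [a.get("first_name", "") for a in authors]
    affiliations = [
        aff.get("name", "")
        for a in authors
        for aff in (a.get("affiliations") or [])
    ]
    return {
        "lens_id": record.get("lens_id", ""),
        "author_first_names": "; ".join(first_names),
        "affiliation_names": "; ".join(affiliations),
    }


def records_to_csv(records):
    """Write the flattened records to CSV text with a fixed header."""
    fieldnames = ["lens_id", "author_first_names", "affiliation_names"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(flatten_record(record))
    return buffer.getvalue()


# Hypothetical records: the second one has no author metadata at all.
sample_records = [
    {"lens_id": "A", "authors": [{"first_name": "Ada", "affiliations": [{"name": "Univ X"}]}]},
    {"lens_id": "B"},
]
csv_text = records_to_csv(sample_records)
```

Reading every field with `.get()` keeps the export running for records that omit author or affiliation data, which is the case described at the top of this thread.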

Seokminoh-am commented 3 years ago

I see, thank you so much! I was also wondering whether there is a way to specify which country I mean (i.e., the variable name for the funding country and the publisher's country appears to be the same)?

Let me make those fixes and see if the code runs.


Seokminoh-am commented 3 years ago

By the way, is the API down? I have been unable to access it today.


rosharma9 commented 3 years ago

@Seokminoh-am, the API is up. Can you share your endpoint and request, if possible?

Seokminoh-am commented 3 years ago

As of now, I am getting error 429. Is there a way around that? Thank you


rosharma9 commented 3 years ago

A 429 is the rate-limiting status code (Too Many Requests). Once you get rate-limited, you will need to wait (sleep the code execution) for some time before retrying. You can find details about it here: https://docs.api.lens.org/getting-started.html#rate-limiting Also, we have added code samples that show how to handle this scenario: https://docs.api.lens.org/samples-scholar.html#python---cursor-based-pagination
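
As a rough sketch of the "wait and retry" approach (not the official sample linked above), the helper below retries a request while it keeps returning 429. The name `post_with_retry`, its parameters, and the use of an `x-rate-limit-retry-after-seconds` response header are assumptions made for illustration; confirm the actual header name in the rate-limiting docs.

```python
import time


def post_with_retry(send, max_attempts=5, default_wait=10, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or attempts run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers`, for example a lambda wrapping requests.post.
    """
    for _ in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Assumption: the server reports the wait time in this header;
        # fall back to a fixed delay when it is absent.
        wait = int(response.headers.get("x-rate-limit-retry-after-seconds", default_wait))
        sleep(wait)
    raise RuntimeError("still rate-limited after %d attempts" % max_attempts)
```

With the `requests` library this could be called as `post_with_retry(lambda: requests.post(url, data=body, headers=headers))`, where `url`, `body` and `headers` are whatever your script already sends.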

Regarding the countries: funding countries are different from the affiliation country codes. We will be standardising the countries to codes in the future; until then, here is a list of funding countries: https://github.com/cambialens/lens-api-doc/files/6153116/funding_country.txt

Seokminoh-am commented 3 years ago

Okay, sounds good. Just to confirm: for response fields, the variable name for the source country would be source.country and the variable name for the funding country would be funding.country? https://docs.api.lens.org/response-scholar.html#funding

As of now, it seems that even the sample code that uses cursor-based pagination is not running. Could that be because I got rate-limited? I have attached my code so you can see whether the rate limit or my code is what prevents me from pulling data from the API. Thank you so much.


rosharma9 commented 3 years ago

Yes, true about the countries.

The scroll example works fine as written.

However, your projection fields are causing a 400 (Bad Request) response, and your code is missing a check for that. To debug it, you can print the response whenever the status code is not OK:

elif response.status_code != requests.codes.ok:
    # Print the error body returned by the API (Python 3 print function)
    print(response.json())

One suggestion: to stop the scroll when there is no more data to extract, you can check the results field of the response.

if json['results'] != 0:
    scroll(scroll_id)
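
Putting the two suggestions together, a loop-based variant might look like the sketch below. It is an illustration, not the sample from the docs: `scroll_all` and the caller-supplied `fetch_page` function are hypothetical names, and the `data` key holding the records is an assumption, while `results` and `scroll_id` are the fields mentioned above.

```python
def scroll_all(fetch_page, max_pages=1000):
    """Collect records page by page until a page reports zero results.

    `fetch_page(scroll_id)` is a caller-supplied function returning
    (status_code, payload); it receives None for the first request.
    """
    records = []
    scroll_id = None
    for _ in range(max_pages):
        status_code, payload = fetch_page(scroll_id)
        if status_code != 200:
            # Surface the error body (for example a 400 caused by bad
            # projection fields) instead of silently continuing.
            raise RuntimeError("request failed with %s: %s" % (status_code, payload))
        if payload.get("results", 0) == 0:
            # No more data to extract, so stop scrolling.
            break
        records.extend(payload.get("data", []))
        scroll_id = payload.get("scroll_id")
    return records
```

Using a loop rather than recursion avoids hitting Python's recursion limit on long scrolls, and `max_pages` is only a safety cap for this sketch.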