robtlx closed this 2 years ago
I managed to solve this by restructuring the loop like this: for i in crossref_results['message']['items']: doi = i['DOI']
But now I'm running into a different issue. With the maximum of 1000 results everything is fine, but obviously I want more than 1000. If I set cursor="*", it runs for quite a while and then I get a "TypeError: list indices must be integers or slices, not str" on the first line (for i in crossref_results).
I tried printing the iterated element ("i" or "doi" in my case) but nothing gets printed - I just get this error.
Is there any way around this?
Thanks for your question.
What Python version are you using? And what habanero version? You can get the habanero version like
import habanero
habanero.__version__
So if you run the below example, you get a key error?
from habanero import Crossref
cr = Crossref()
x = cr.works(filter = {'has_full_text': True})
[z['DOI'] for z in x['message']['items']]
If the above works for you, please share the full example so I can see why you are getting the error.
Yes, using cursor pagination will take a while if you are not filtering the query in any way since there are a lot of records to page through.
The docs you linked to have an example of how to work through the results when using cursor; see the example under the heading "# Deep paging, using the cursor parameter".
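For context on the TypeError mentioned earlier: as far as I can tell, a plain works() call gives back a single dict, while a call with cursor gives back a list of such dicts, one per page, so indexing the result with ['message'] fails. Below is a minimal sketch of a helper that handles both shapes. The name extract_dois and the sample data are hypothetical, made up for illustration, and the sample dicts are hand-written stand-ins rather than real Crossref output.

```python
def extract_dois(result):
    """Collect DOIs from a habanero works() result.

    Assumption: a plain query returns one dict, while a cursor query
    returns a list of such dicts (one per page of results).
    """
    # Treat a single response as a one-page list so both cases share a loop.
    pages = result if isinstance(result, list) else [result]
    dois = []
    for page in pages:
        for item in page["message"]["items"]:
            # .get() avoids a KeyError for records that carry no DOI field.
            doi = item.get("DOI")
            if doi is not None:
                dois.append(doi)
    return dois


# Hand-made stand-ins shaped like Crossref responses (not real data).
single_page = {"message": {"items": [{"DOI": "10.1000/a"}, {"title": ["no doi"]}]}}
paged = [single_page, {"message": {"items": [{"DOI": "10.1000/b"}]}}]

print(extract_dois(single_page))
print(extract_dois(paged))
```

With a live query this would be called on the return value of cr.works(...), for example one made with cursor="*" plus a cursor_max limit so the paging stops at a known number of records.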
Thank you for the reply and sorry to bother!
I'm on Habanero 1.2.2 and tried it on two different machines running Python 3.6 and 3.8.
I managed to work around the first problem. I ran the code snippet you posted and it no longer gives any errors, but PyCharm CE flags the second statement as having no effect, and nothing is returned either. Also, I am not interested in browsing through all full-text publications, but rather in searching for DOIs I already have. I got around that by figuring out a different approach to the checking and am now looking at a more general level - specifically ISSNs - and I succeeded by looping through my ISSNs and querying cr.journals(ids='
Thank you again for the help!
Great, nice work figuring it out
Hello!
How would I go about extracting the DOI from a query result? I tried a variant from here but I get a KeyError on 'DOI' in [ z['DOI'] for z in x['message']['items'] ] and I don't really know how to proceed.
I tried converting the query results to a dataframe but that gives me most of the results under one single parameter instead of splitting them more tidily.
I'm still a beginner in Python so please keep in mind some terms might be confusing.
My endgame is to get a column of DOIs which I can then compare to another column I've already generated - seeing what relevant journals I haven't collected already.
Thank you!
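For the end goal described above (a list of DOIs from Crossref compared against a list already collected), one possible approach is a set difference. This is only a sketch: the function names normalize_doi and missing_dois and the sample DOIs are hypothetical. It assumes both columns have already been read into plain Python lists, and it lower-cases the values because DOIs are case-insensitive.

```python
def normalize_doi(doi):
    """Lower-case and strip a DOI, dropping a leading resolver prefix if present."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def missing_dois(crossref_dois, collected_dois):
    """Return DOIs present in crossref_dois but absent from collected_dois, sorted."""
    have = {normalize_doi(d) for d in collected_dois}
    found = {normalize_doi(d) for d in crossref_dois}
    return sorted(found - have)


# Made-up example values, not real records.
crossref_dois = ["10.1000/ABC", "10.1000/def", "10.1000/ghi"]
collected_dois = ["https://doi.org/10.1000/abc", "10.1000/GHI "]

print(missing_dois(crossref_dois, collected_dois))
```

Using sets keeps the comparison fast even for many thousands of DOIs, at the cost of losing the original order, which is why the result is sorted before being returned.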