Closed felippemed closed 4 years ago
It would be of great help if I could store the results from query into a variable.
Hello, just assign the result of the query to a variable like this:
authors = scholarly.search_author(name)
and you can iterate over the variable like this:
for author in authors:
    # do some stuff here with the author object
    print(author)
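Since the search returns a generator, it can only be walked once. If the results need to be kept around, one option is to copy some of them into a list first. A minimal sketch, assuming a hypothetical fake_search generator as a stand-in for the real search call (it is not part of scholarly and does no network access):

```python
from itertools import islice


def fake_search(query):
    """Hypothetical stand-in for a search call; yields results lazily."""
    for i in range(5):
        yield {"title": f"{query} result {i}"}


results = fake_search("stability")       # a generator: nothing is produced yet
first_three = list(islice(results, 3))   # pull at most three items into a list

# The list can be indexed and iterated as often as needed.
for item in first_three:
    print(item["title"])
```

islice stops after three items, so the remaining results stay in the generator and can still be fetched later with next(results).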
Seeing around, I realized that the result regards a "not JSON serializable" object.
For this, I've modified the source code so that the Author and Publication classes inherit from dict instead of object, and added a dict.__init__ call in the __init__ of both Author and Publication.
class Publication(dict):
    """Returns an object for a single publication"""
    def __init__(self, __data, pubtype=None):
        dict.__init__(self)
        ....
and to get a dict that json can serialize, just use:
publication_object.__dict__
or you can use the different approach like here: https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable
Hope it helps you!
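An alternative that leaves the library source untouched is the default= hook of json.dumps, which is one of the approaches discussed in that Stack Overflow thread. A rough sketch, assuming a hypothetical Publication stand-in class (not the real scholarly class) whose attributes are all plain Python types:

```python
import json


class Publication:
    """Hypothetical stand-in, not the scholarly class."""

    def __init__(self, title, citedby):
        self.bib = {"title": title}
        self.citedby = citedby


pub = Publication("Perception of physical stability", 15)

# default= is called for every object json cannot serialize by itself;
# vars(obj) returns the object's attribute dict (obj.__dict__).
text = json.dumps(pub, default=vars)
print(text)
```

This only works when the object has a __dict__ and its attributes are themselves serializable (or are objects that default= can handle in turn).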
Hello Trioputrap,
I amended the code as you suggested:
class Publication(dict):
    """Returns an object for a single publication"""
    def __init__(self, __data, pubtype=None):
        self.bib = dict()
        self.source = pubtype
        dict.__init__(self)
        ....
Then I ran the example case:
publication=scholarly.search_pubs_query('Perception of physical stability and center of mass of 3D objects')
and it returned the fields correctly:
print(next(publication))
###output
{'_filled': False,
'bib': {'abstract': 'Humans can judge from vision alone whether an object is '
'physically stable or not. Such judgments allow observers '
'to predict the physical behavior of objects, and hence '
'to guide their motor actions. We investigated the visual '
'estimation of physical stability of 3-D objects (shown '
'in stereoscopically viewed rendered scenes) and how it '
'relates to visual estimates of their center of mass '
'(COM). In Experiment 1, observers viewed an object near '
'the edge of a table and adjusted its tilt to the '
'perceived critical angle, ie, the tilt angle at which '
'the object …',
'author': 'SA Cholewiak and RW Fleming and M Singh',
'eprint': 'https://jov.arvojournals.org/article.aspx?articleID=2213254',
'title': 'Perception of physical stability and center of mass of 3-D '
'objects',
'url': 'https://jov.arvojournals.org/article.aspx?articleID=2213254'},
'citedby': 15,
'id_scholarcitedby': '15736880631888070187',
'source': 'scholar',
'url_scholarbib': 'https://scholar.googleusercontent.com/scholar.bib?q=info:K8ZpoI6hZNoJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAXIjCFpwk1u0XEARPUufLltWIPwQg4_P_&scisf=4&ct=citation&cd=0&hl=en'}
However, when it comes to JSON, it still doesn't work:
json.load(publication)
Traceback (most recent call last):
File "<ipython-input-43-a51cc3f613b0>", line 1, in <module>
json.load(publication)
File "C:\ProgramData\Anaconda3\lib\json\__init__.py", line 293, in load
return loads(fp.read(),
AttributeError: 'generator' object has no attribute 'read'
I tried the methods you suggested from https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable# but all of them returned
AttributeError: 'generator' object has no attribute '__dict__'
Don't know what else to do.
Thanks for your help
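Both errors above point at the same thing: the variable holds the generator, not a result taken from it. A small sketch of the difference, assuming hypothetical Result and fake_search stand-ins rather than the real scholarly objects:

```python
import json


class Result:
    """Hypothetical stand-in for a single search result."""

    def __init__(self, title):
        self.title = title


def fake_search():
    """Hypothetical stand-in for the search call."""
    yield Result("first")
    yield Result("second")


gen = fake_search()
print(hasattr(gen, "read"))       # False: json.load wants a file-like object
print(hasattr(gen, "__dict__"))   # False: a generator has no attribute dict

item = next(gen)                  # take one result out of the generator
as_text = json.dumps(vars(item))  # serialize that result's attributes
print(as_text)
```

So the pattern is to call next() (or loop) first, and only then serialize the individual item.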
I can see an object generated in there
scholarly.search_pubs_query(title)
Out[5]: <generator object _search_scholar_soup at 0x000001B5BDD2FE58>
But after the amendments you suggested, I tested again and print(next()) stopped working
print(next(search_query))
Traceback (most recent call last):
File "<ipython-input-7-ac5f97a46ac4>", line 1, in <module>
print(next(search_query))
StopIteration
...and JSON still cannot load
json.load(publication_object.__dict__)
Traceback (most recent call last):
File "<ipython-input-8-a2d96e60c825>", line 1, in <module>
json.load(publication_object.__dict__)
NameError: name 'json' is not defined
I tried your example step-by-step, but it keeps crashing both for Author and Publication.
json.dumps(cls=publication_object.__dict__)
Traceback (most recent call last):
File "<ipython-input-18-7f1c80f8d4e3>", line 1, in <module>
json.dumps(cls=publication_object.__dict__)
NameError: name 'publication_object' is not defined
What have I done wrong?
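On the StopIteration: one common cause is calling next() on a generator that has nothing left to yield, either because it was already consumed or because the search produced no results. Whether that is what happened here is an assumption. A minimal sketch with a hypothetical one-item generator:

```python
def fake_search():
    """Hypothetical stand-in that yields a single result."""
    yield "only result"


gen = fake_search()
first = next(gen)          # consumes the only item
second = next(gen, None)   # a default value avoids StopIteration

try:
    next(gen)              # no default: raises once the generator is exhausted
    raised = False
except StopIteration:
    raised = True

print(first, second, raised)
```

The two NameError tracebacks are a separate matter: they suggest that import json had not been run in that session and that publication_object was never assigned.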
@felippemed
What have I done wrong?
A couple of things that I noticed. First, json.load (without the trailing s) is for reading from files, while json.loads is for reading from strings. In this case, we want json.dumps.
See here for more details.
json.loads takes a string as input and returns a Python object as output (a dictionary, for a JSON object). json.dumps takes a Python object such as a dictionary as input and returns a string as output.
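To make the four functions concrete, here is a short sketch using only the standard library. The io.StringIO buffer is just an in-memory stand-in for an open file, and the sample data is made up:

```python
import io
import json

data = {"title": "example", "citedby": 15}

text = json.dumps(data)         # Python object -> str
back = json.loads(text)         # str -> Python object

buffer = io.StringIO()          # in-memory stand-in for an open file
json.dump(data, buffer)         # Python object -> file
buffer.seek(0)                  # rewind before reading
from_file = json.load(buffer)   # file -> Python object

print(text)
print(back == data, from_file == data)
```

The rule of thumb: the versions ending in s work with strings, the ones without it work with file objects.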
Here's some code that will write a json file for individual query results. (python==3.7.3, scholarly==0.2.4)
import scholarly
import json
# standard scholarly stuff
search_query = scholarly.search_pubs_query('my search query')
d = next(search_query)
# dump dict as string, load from string to json object
j = json.loads(json.dumps(d.__dict__))
# write json object as file named data.json
with open('data.json', 'w') as outfile:
    json.dump(j, outfile)
If I look at the output of the code, it's a file called data.json that looks like this:
{
"bib": {
"title": "\\u201cYour Word is my Command\\u201d: google search by voice: A case study",
"url": "https://link.springer.com/chapter/10.1007/978-1-4419-5951-5_4",
"author": "J Schalkwyk and D Beeferman and F Beaufays and B Byrne\\u2026",
"abstract": "\\u2026 types of up-to-the-minute information (\\u201cwhere's the closest parking spot?\\u201d) or communications\n(eg, \\u201cupdate my facebook status \\u2026 The maturing of powerful search engines provides a very effective\nway to give users what they want if we can recognize the words of their query \\u2026 \n",
"eprint": "https://ai.google/research/pubs/pub36340.pdf"
},
"source": "scholar",
"citedby": 272,
"id_scholarcitedby": "12354430935285135518",
"url_scholarbib": "https://scholar.googleusercontent.com/scholar.bib?q=info:nnjEo3rBc6sJ:scholar.google.com/&output=citation&scisdr=CgXVgDvOGAA:AAGBfm0AAAAAXPTZXd-J2lG3fgUgaqNWc3JsRL9dwl57&scisig=AAGBfm0AAAAAXPTZXTLvO5REmdaJgtI-6e6nEJShubdb&scisf=4&ct=citation&cd=0&hl=en",
"_filled": false
}
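If the \u escapes in a file like this are unwanted, json can be asked to keep non-ASCII characters as they are. A small sketch, assuming the escapes come from json's default ASCII-only output; the record below is made-up sample data:

```python
import json

record = {"title": "\u201cYour Word is my Command\u201d"}

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# ensure_ascii=False keeps the characters; indent=2 pretty-prints.
readable = json.dumps(record, ensure_ascii=False, indent=2)

print(escaped)
print(readable)
```

Both strings load back to the same dictionary, so the choice only affects how the file reads to a human. When writing with ensure_ascii=False, it is safer to open the file with encoding='utf-8'.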
Edited: Sorry if this is a naive question.
The module runs amazingly well. It would be of great help if I could store the results from query into a variable.
print(next(search_query))
Seeing around, I realized that the result regards a "not JSON serializable" object.
Any help for that?