Closed wvranken closed 4 years ago
Thanks for your effort on this! Can you elaborate on what problems you had without the two newly added if's? Showing an example that produces the error would help, since I've not had any problems yet.
Hi Zhiya,
I am downloading reference info using:
    main_pi_scopus_id = '6602685472'
    sdf = scopus.search_author_publication(main_pi_scopus_id)
and then got the authors for the pubs from:
    for authorScopusId in publication.authors:
        authorInfo = scopus.retrieve_author(authorScopusId)
This last bit of code crashes on some authors for me, and works with the two if's. I think it has to do with the info returned by the SCOPUS API, which is not fully consistent, for example when there is only one affiliation.
Best,
Wim
thanks for the clarification. assuming i understand your code correctly, you:
- first search for a person's author id (please do not use scopus id for individual authors because it may be confused with publication scopus ids)
- sdf then is a data frame containing publication information
- now you want to search for all the co-authors given the papers in sdf written by main_pi_scopus_id
however, the authors column in the returned publication info data is a list of authors instead of one. therefore, i would write the code as follows to use retrieve_author:
    coauthors_of_main_pi = list(set([a for l in publication.authors for a in l]))
    for author_id in coauthors_of_main_pi:
        author_info = scopus.retrieve_author(author_id)
        # todo
since the variable names are somewhat confusing (e.g., i do not know what publication is and assume it is just sdf), let me know if i misunderstood your problem.
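A small sketch of the flattening step suggested above, runnable without an API key. The nested list stands in for the authors column of sdf (assumed to hold one list of author ids per publication, as described above); unique_coauthors and its exclude parameter are hypothetical names, and the ids are just the ones mentioned in this thread, not real query output.

```python
# Stand-in for the `authors` column: one list of author ids per publication.
# Assumption: each cell is a list of id strings, as described in the thread.
authors_column = [
    ["6602685472", "57200697581"],
    ["6602685472"],
    ["6602685472", "36635367700", "57200697581"],
]


def unique_coauthors(authors_column, exclude=None):
    """Flatten per-publication author lists into a sorted list of unique ids."""
    unique = {author_id for authors in authors_column for author_id in authors}
    # Drop the main author so only co-authors remain (no-op if exclude is None).
    unique.discard(exclude)
    return sorted(unique)


coauthors = unique_coauthors(authors_column, exclude="6602685472")
print(coauthors)
```

With a real data frame, something like unique_coauthors(sdf.authors, exclude=main_pi_scopus_id) should behave the same way, provided each cell really is a list.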
It’s pseudo-code taken from a full script, but yes the publications are the rows in the sdf data frame. In any case, when calling retrieve_author() with an ID it can fail. Try with 57200697581, didn’t work for me.
I just tested it and it failed. can you do a pull request on this so that it’s easier for me to review?
thanks for finding this bug!
closing leftover issue.
the 57200697581 works well in the latest release.
I need your help rather urgently. I am collecting all our faculty publications from Scopus using an API key. Though I was able to collect each author's publications, when I tried to get the co-author id list using scopus.retrieve_author, I got the following error:
        print(scopus.retrieve_author(author_id))
      File "/usr/local/lib/python3.8/dist-packages/pyscopus/scopus.py", line 144, in retrieve_author
        raise ValueError('Author %s not found!' %author_id)
    ValueError: Author 36635367700 not found!
My code is (partial):
    from pyscopus import Scopus
    MY_API_KEY = 'xxxxxxxxxxx'
    scopus = Scopus(MY_API_KEY)
    author_id = '36635367700'
    print(scopus.retrieve_author(author_id))
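When looping over many co-author ids, one way to keep a single ValueError like the one above from stopping the whole run is to catch it per id and record the failure. This is only a sketch and does not address why the lookup fails: retrieve_authors_safely and fake_retrieve are hypothetical names, and fake_retrieve merely imitates the error message from the traceback so the example runs without an API key.

```python
def retrieve_authors_safely(retrieve, author_ids):
    """Call `retrieve` for each id; collect failures instead of stopping."""
    found, failed = {}, {}
    for author_id in author_ids:
        try:
            found[author_id] = retrieve(author_id)
        except ValueError as err:
            # Keep the message so failing ids can be reported or retried later.
            failed[author_id] = str(err)
    return found, failed


def fake_retrieve(author_id):
    """Hypothetical stand-in that fails for one id, like the traceback above."""
    if author_id == "36635367700":
        raise ValueError("Author %s not found!" % author_id)
    return {"author-id": author_id}


found, failed = retrieve_authors_safely(
    fake_retrieve, ["6602685472", "36635367700"]
)
print(sorted(found))
print(failed)
```

In a real script the first argument would be scopus.retrieve_author instead of fake_retrieve; the failed dictionary then gives a list of ids to report back here.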
Thanks for providing this library!
I had some problems with author retrieval, I think when an author has only one associated publication and/or a single affiliation. In any case, I've added a check for the affiliation information dictionary being passed as a string (lines 100-101), and a type check to make sure the pandas dataframe gets a list (lines 247-249); see the attached file.
utils.py.gz
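The attached patch itself is not reproduced here, so the following is only an illustration of the general idea described above: normalising a field that may arrive as a single item or as a list before it is processed further. The helper names and the "name" key are hypothetical and are not taken from pyscopus or from the Scopus API response.

```python
def as_list(value):
    """Return `value` as a list: [] for None, unchanged for lists, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def affiliation_names(affiliations):
    """Extract names whether given one dict, a list of dicts, or a bare string."""
    names = []
    for entry in as_list(affiliations):
        if isinstance(entry, dict):
            # "name" is a made-up key used only for this illustration.
            names.append(entry.get("name"))
        else:
            names.append(str(entry))
    return names


print(affiliation_names({"name": "Univ A"}))                      # single dict
print(affiliation_names([{"name": "Univ A"}, {"name": "Univ B"}]))  # list of dicts
print(affiliation_names("Univ C"))                                # bare string
```

Normalising at the point where the response is parsed means the rest of the code can always iterate over a list, whatever shape the response had.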