muschellij2 / rscopus

Scopus Database API Interface to R

Paper ID for merging #6

Closed: agbarnett closed this issue 6 years ago

agbarnett commented 6 years ago

Would it be possible to add the Scopus paper ID to the output data.frame so that the information can be easily merged with other sources?

muschellij2 commented 6 years ago

Can you say which functions do not have the paper ID? Does the API output this?

agbarnett commented 6 years ago

Sorry, the author_df function. I think the field is scopus_id, or perhaps UT.

muschellij2 commented 6 years ago

I've changed things around a bit, and you should now be able to get the data you need. After re-installing rscopus, the following should give you the scopus_id:

res = author_df(last_name = "Muschelli", 
    first_name = "John")
res$`dc:identifier`

Note, however, that some of the earlier column names are gone. This is due to how I'm parsing the author_data output. If you want the old behavior (which still doesn't have the data you want), use author_df_orig. If you want all the data converted from the entries, use:

res = author_data(last_name = "Muschelli", 
    first_name = "John")
names(res$full_data)
head(res$full_data$df)
head(res$full_data$affiliation)

which can be merged using entry_number.
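
As a rough sketch of that merge, assuming res$full_data$df and res$full_data$affiliation both carry an entry_number column as described above. The toy data frames below stand in for real API output, and the affilname column and its values are hypothetical placeholders:

```r
# Toy stand-ins for res$full_data$df and res$full_data$affiliation.
# Only entry_number and dc:identifier come from the thread; affilname is assumed.
df <- data.frame(
  entry_number = c(1, 2, 3),
  `dc:identifier` = c("SCOPUS_ID:85028874240",
                      "SCOPUS_ID:85009266881",
                      "SCOPUS_ID:85013664185"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
affiliation <- data.frame(
  entry_number = c(1, 1, 2, 3),
  affilname = c("Affil A", "Affil B", "Affil A", "Affil C"),
  stringsAsFactors = FALSE
)

# One row per (entry, affiliation) pair; all.x = TRUE keeps entries
# that have no affiliation row.
merged <- merge(df, affiliation, by = "entry_number", all.x = TRUE)
print(merged)
```

With all.x = TRUE, entries without a matching affiliation row are kept, with NA in the affiliation columns.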

agbarnett commented 6 years ago

Thanks, but unfortunately scopus.id is the same as author.id, so it's the author's Scopus identifier, not the paper's identifier. For example, running

rawb = bibliometrix::retrievalByAuthorID(id='40462056100', api_key = my.api.key)$M
str(rawb[1,])

gives the UT column with the paper's ID:

'data.frame':   1 obs. of  17 variables:
 $ AU   : chr "HAAKONSEN SMITH C.;TURBITT E.;MUSCHELLI J.;LEONARD L.;LEWIS K.;FREEDMAN B.;MURATORI M.;BIESECKER B."
 $ TI   : chr "FEASIBILITY OF COPING EFFECTIVENESS TRAINING FOR CAREGIVERS OF CHILDREN WITH AUTISM SPECTRUM DISORDER: A GENETI"| __truncated__
...
 $ UT   : chr "SCOPUS_ID:85028874240"
...

muschellij2 commented 6 years ago

The Scopus ID is not the same as the author ID when I run it:

res = author_df(au_id = "40462056100")
  @_fa                                                       prism:url
1 true https://api.elsevier.com/content/abstract/scopus_id/85028874240
2 true https://api.elsevier.com/content/abstract/scopus_id/85009266881
3 true https://api.elsevier.com/content/abstract/scopus_id/85013664185
4 true https://api.elsevier.com/content/abstract/scopus_id/84979950799
5 true https://api.elsevier.com/content/abstract/scopus_id/85006320610
6 true https://api.elsevier.com/content/abstract/scopus_id/84992665692
          dc:identifier                eid
1 SCOPUS_ID:85028874240 2-s2.0-85028874240
2 SCOPUS_ID:85009266881 2-s2.0-85009266881
3 SCOPUS_ID:85013664185 2-s2.0-85013664185
4 SCOPUS_ID:84979950799 2-s2.0-84979950799
5 SCOPUS_ID:85006320610 2-s2.0-85006320610
6 SCOPUS_ID:84992665692 2-s2.0-84992665692
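
Since dc:identifier here and the UT column in the bibliometrix output above use the same "SCOPUS_ID:<number>" format, the two sources could probably be merged on those columns directly. A minimal sketch with toy data frames standing in for the two results (the TI value is a placeholder, not real output):

```r
# Toy stand-in for the author_df() result; IDs are taken from the output above.
rscopus_df <- data.frame(
  `dc:identifier` = c("SCOPUS_ID:85028874240", "SCOPUS_ID:85009266881"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
# Toy stand-in for the bibliometrix result; the title is a placeholder.
biblio_df <- data.frame(
  UT = "SCOPUS_ID:85028874240",
  TI = "placeholder title",
  stringsAsFactors = FALSE
)

# Merge on the paper identifier, which has different column names in each source.
merged <- merge(rscopus_df, biblio_df, by.x = "dc:identifier", by.y = "UT")

# If another source stores the bare numeric ID, strip the prefix first.
paper_id <- sub("^SCOPUS_ID:", "", rscopus_df[["dc:identifier"]])
print(merged)
print(paper_id)
```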

agbarnett commented 6 years ago

Many thanks. I was still running the old version.