ropensci-archive / rorcid

:warning: ARCHIVED :warning: A programmatic interface to the Orcid.org API

`colnames` are not the same as the variable names - `data.frame`s in `data.frame`s #9

Closed SimonGoring closed 9 years ago

SimonGoring commented 9 years ago

I assume this is because of a mismatch in the data types (what I'm assuming they are vs. what they actually are), but this:

find.simon <- orcid(query='simon+goring')
simon.record <- orcid_id(orcid = find.simon$data$'orcid-identifier.path'[1], profile="works")
simon.papers <- simon.record[[1]][[5]][[2]][[1]] # this is where the papers are.

works okay, although it was a lot of digging to get to the papers. A helper function might be of use here. The problem comes when I want to pull out my paper titles.
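A minimal sketch of the kind of helper suggested here, assuming the record is an ordinary nested R list. `find_element` is a hypothetical name, not a function in rorcid, and the toy record below uses made-up element names rather than the real ORCID structure. It looks an element up by name instead of by position, so the chain of `[[1]][[5]][[2]][[1]]` indices is not needed.

```r
# Hypothetical helper (not part of rorcid): depth-first search of a nested
# list for the first element with the given name.
find_element <- function(x, name) {
  if (!is.list(x)) return(NULL)               # atomic leaf: nothing to search
  if (name %in% names(x)) return(x[[name]])   # found at this level
  for (child in x) {
    found <- find_element(child, name)
    if (!is.null(found)) return(found)
  }
  NULL                                        # not found anywhere below x
}

# Toy record; the element names are illustrative assumptions only.
record <- list(
  "0000-0000-0000-0000" = list(
    type = "USER",
    activities = list(
      works = list(
        papers = data.frame(`put-code` = c("1", "2"), check.names = FALSE)
      )
    )
  )
)

papers <- find_element(record, "papers")
colnames(papers)
```

The search returns the first match it encounters, so it is only safe when the name is unique within the record.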

> colnames(simon.papers)
 [1] "put-code"                  "work-title"                "work-citation"            
 [4] "work-type"                 "publication-date"          "work-external-identifiers"
 [7] "url"                       "work-contributors"         "work-source"              
[10] "visibility"               

works, but a bunch of these are actually data.frames, so I can't really pull out the work title directly (I just get the journal name if I use simon.papers$'work-title'; I have to use simon.papers$'work-title'[,1]).
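The behaviour can be reproduced with a toy table in base R. This is only an illustration with made-up values, not the actual ORCID output: a column of a data.frame can itself be a data.frame, and `colnames()` reports only the outer names.

```r
# Toy stand-in for the works table; all values are made up.
papers <- data.frame(`put-code` = c("1", "2"),
                     check.names = FALSE, stringsAsFactors = FALSE)

# Assigning a data.frame to a column nests it inside the outer data.frame.
papers$`work-title` <- data.frame(
  title    = c("Paper A", "Paper B"),
  subtitle = c("Journal X", "Journal Y"),
  stringsAsFactors = FALSE
)

colnames(papers)             # only the outer names: "put-code" "work-title"
class(papers$`work-title`)   # "data.frame", not a character vector

# Two ways to reach the nested title column.
titles         <- papers$`work-title`[, 1]    # by position
titles_by_name <- papers$`work-title`$title   # by name
```

Selecting by name is less fragile than `[, 1]`, since it does not depend on the order of the nested columns.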

I think it's okay, but it's certainly not intuitive to muck around in these data structures. Anyway, I'll mess around a bit more. It's a fun toy to play with!

sckott commented 9 years ago

@SimonGoring Sorry about the messiness. Their API returns deeply nested data, and I've tried to clean it up. I've tried to make parsing fast by using jsonlite's C-based parser, but I'll have to do some of the parsing manually. That shouldn't make a big speed difference for most people.
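For illustration, one way such nested columns can be flattened by hand in base R is sketched below. `flatten_df` is a hypothetical name and this is not necessarily how rorcid does it; jsonlite also ships a `flatten()` function for the same purpose. The sketch handles one level of nesting only.

```r
# Hypothetical helper: expand data.frame columns into ordinary columns
# named "<outer>.<inner>". Handles one level of nesting.
flatten_df <- function(df) {
  # data.frame() splits each data.frame argument into one column per
  # nested column and prefixes the names with the argument name.
  args <- c(as.list(df), list(check.names = FALSE, stringsAsFactors = FALSE))
  do.call(data.frame, args)
}

# Toy table with one nested column; all values are made up.
papers <- data.frame(`put-code` = c("1", "2"),
                     check.names = FALSE, stringsAsFactors = FALSE)
papers$`work-title` <- data.frame(
  title    = c("Paper A", "Paper B"),
  subtitle = c("Journal X", "Journal Y"),
  stringsAsFactors = FALSE
)

flat <- flatten_df(papers)
colnames(flat)   # "put-code" "work-title.title" "work-title.subtitle"
flat$`work-title.title`
```

After flattening, every column is a plain vector, so titles can be read with a single `$` lookup.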

sckott commented 9 years ago

I'll try to add some helper fxns

SimonGoring commented 9 years ago

No problem. If I can help I'd be happy to. I can see the usefulness, but right now it's hard to use. That's probably not your fault (as I mentioned in the other issue).

sckott commented 9 years ago

@SimonGoring can you reinstall and try again?

I tried to simplify the output a bit. e.g.,

find_simon <- orcid(query='simon goring')
id <- find_simon$data$'orcid-identifier.path'[1]
simon_record <- orcid_id(orcid = id, profile="works")
names(simon_record$`0000-0002-2700-4605`)
[1] "orcid"             "orcid-identifier"  "orcid-preferences" "orcid-history"     "type"             
[6] "group-type"        "client-type"       "works"            
head(simon_record$`0000-0002-2700-4605`$works[,c(1:3,9:12)])
  put-code       work-type visibility publication-date.year.value publication-date.month.value
1 11910891 JOURNAL_ARTICLE         NA                        2014                         <NA>
2 11910890 JOURNAL_ARTICLE         NA                        2014                         <NA>
3 11910889 JOURNAL_ARTICLE         NA                        2014                         <NA>
4 11295421 JOURNAL_ARTICLE         NA                        2013                         <NA>
5 11295423 JOURNAL_ARTICLE         NA                        2013                         <NA>
6 11295620 JOURNAL_ARTICLE         NA                        2013                         <NA>
  publication-date.day.value                    work-external-identifiers.work-external-identifier
1                       <NA> DOI, ISSN, EID, 10.1890/130017, 15409295 15409309, 2-s2.0-84894236042
2                       <NA> DOI, ISSN, EID, 10.1890/120370, 15409295 15409309, 2-s2.0-84894274301
3                       <NA> DOI, ISSN, EID, 10.1890/130001, 15409295 15409309, 2-s2.0-84894248195
4                       <NA>                         ISSN, DOI, 0022-0477, 10.1111/1365-2745.12135
5                       <NA>                         ISSN, DOI, 0022-0477, 10.1111/1365-2745.12130
6                       <NA>                          DOI, ISSN, 10.5194/cp-9-2023-2013, 1814-9332
simon_record$`0000-0002-2700-4605`$works$`work-title.title.value`
 [1] "Macrosystems ecology: Understanding ecological patterns and processes at continental scales"                                                                                                          
 [2] "Improving the culture of interdisciplinary collaboration in ecology by expanding measures of success"                                                                                                 
 [3] "Creating and maintaining high-performing collaborative research teams: The importance of diversity and interpersonal skills"                                                                          
 [4] "Pollen assemblage richness does not reflect regional plant species richness: a cautionary tale"                                                                                                       
 [5] "Linking abundances of the dung fungus Sporormiella to the density of bison: implications for assessing grazing by megaherbivores in palaeorecords"                                                    
 [6] "Holocene vegetation and climate changes in the central Mediterranean inferred from a high-resolution marine pollen record (Adriatic Sea)"                                                             
 [7] "Holocene vegetation and climate changes in central Mediterranean inferred from a high-resolution marine pollen record (Adriatic Sea)"                                                                 
 [8] "Contrasting patterns of climatic changes during the Holocene across the Italian Peninsula reconstructed from pollen data"                                                                             
 [9] "Using a Down-Scaled Bioclimate Envelope Model to Determine Long-Term Temporal Connectivity of Garry oak (Quercus garryana) Habitat in Western North America: Implications for Protected Area Planning"
[10] "Pollen-based reconstruction of Holocene vegetation and climate in southern Italy: the case of Lago Trifoglietti"                                                                                      
[11] "Paleoecological changes at Lake Cuitzeo were not consistent with an extraterrestrial impact"                                                                                                          
[12] "Deposition times in the northeastern United States during the Holocene: establishing valid priors for Bayesian age models"                                                                            
[13] "Contrasting patterns of climatic changes during the Holocene in the Central Mediterranean (Italy) reconstructed from pollen data"                                                                     
[14] "Integrating Paleoecological Databases"                                                                                                                                                                
[15] "Holocene seasonality changes in the central Mediterranean region reconstructed from the pollen sequences of Lake Accesa (Italy) and Tenaghi Philippon (Greece)"                                       
[16] "Are pollen-based climate models improved by combining surface samples from soil and lacustrine substrates?"                                                                                           
[17] "Holocene extinctions and the loss of feature diversity"                                                                                                                                               
[18] "A new methodology for reconstructing climate and vegetation from modern pollen assemblages: an example from British Columbia"                                                                         
simon_record$`0000-0002-2700-4605`$works$`work-title.subtitle.value`
 [1] NA                                                NA                                               
 [3] NA                                                "Journal of Ecology"                             
 [5] "Journal of Ecology"                              "Climate of the Past"                            
 [7] "Climate of the Past Discussions"                 "Climate of the Past"                            
 [9] "Environmental Management"                        "Climate of the Past"                            
[11] "Proceedings of the National Academy of Sciences" "Quaternary Science Reviews"                     
[13] "Climate of the Past Discussions"                 "Eos, Transactions American Geophysical Union"   
[15] "The Holocene"                                    "Review of Palaeobotany and Palynology"          
[17] NA                                                "Journal of Biogeography"

sckott commented 9 years ago

@SimonGoring let me know if you are happy with the output now, i.e., if we can close this issue or not

SimonGoring commented 9 years ago

Yep, this seems a bit more manageable. Thanks Scott.