maelle closed this issue 4 years ago
It might even make sense for that JSON to only contain journal articles or at least not e.g. newspaper articles.
Further wish, in that JSON I'd use for the website, could you exclude the citations of the gender package?
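The two wishes above amount to a filter over the citation entries. A minimal Python sketch, assuming each entry carries "name" (a string or a list of package names) and "parts" with a "type" field, as in the JSON excerpts in this thread. The function name `keep_for_website` is hypothetical and this is not necessarily how the dataset is actually built. It also assumes an entry is dropped as soon as any of its packages is excluded, and it only trusts the "type" label, so mislabelled entries would still need manual curation.

```python
from typing import Any


def keep_for_website(
    entry: dict[str, Any],
    excluded_packages: frozenset[str] = frozenset({"gender"}),
) -> bool:
    """Return True if a citation entry should go into the website JSON."""
    name = entry.get("name", [])
    # "name" may be a single package name or a list of names.
    packages = [name] if isinstance(name, str) else list(name)
    # Assumption: drop the entry if any cited package is excluded.
    if any(pkg in excluded_packages for pkg in packages):
        return False
    # Keep only entries the parser labelled as journal articles.
    return entry.get("parts", {}).get("type") == "article-journal"
```

Usage would be along the lines of `website_entries = [e for e in entries if keep_for_website(e)]`.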
most likely can do all that, will let you know
@maelle can you take a look at https://github.com/ropenscilabs/ropensci_citations/blob/master/citations_all_parts_clean.json when you get a chance, and see if that suits your needs.
Thanks. The one below looks weird
Also, could date be a single string, or could there be a field called year that's either a year or something like "in press"? See example below where there are two dates.
"parts": {
"author": [
{
"family": "Kang",
"given": "W."
},
{
"family": "Zhang",
"given": "M."
},
{
"family": "Wang",
"given": "Q."
},
{
"family": "Gu",
"given": "D."
},
{
"family": "Huang",
"given": "Z."
},
{
"family": "Wang",
"given": "H."
},
{
"family": "Jin",
"given": "X."
},
{
"others": true
}
],
"date": [
"2020",
"2020"
],
"title": "The SLC Family Are Candidate Diagnostic and Prognostic Biomarkers in Clear Cell Renal Cell Carcinoma",
"pages": "1–17",
"url": "https://doi.org/10.1155/2020/1932948",
"type": "article-journal",
"container-title": "BioMed Research International",
"doi": "10.1155/2020/1932948"
},
"url": "https://doi.org/10.1155/2020/1932948"
},
{
Actually for packages I'd like a vector, e.g. "name":["spocc", "taxize"], sorry for changing my mind.
so requests
["spocc"]
or ["spocc", "taxize"]
{
"name": "rotl",
"doi": "10.1101/2020.01.14.905901",
"citation": "Walczyńska, A., Gudowska, A., & Sobczyk, Ł. (2020). Should I shrink or should I flow? – body size adjustment to thermo-oxygenic niche. <https://doi.org/10.1101/2020.01.14.905901>",
"parts": {
"author": [
{
"family": "Walczyńska",
"given": "A."
},
{
"family": "Gudowska",
"given": "A."
},
{
"family": "Sobczyk",
"given": "Ł."
}
],
"date": "2020",
"title": "Should I shrink or should I flow? – body size adjustment to thermo-oxygenic niche",
"url": "https://doi.org/10.1101/2020.01.14.905901",
"doi": "10.1101/2020.01.14.905901"
},
"research_snippet": "body size adjustment in rotifers",
"url": "https://doi.org/10.1101/2020.01.14.905901"
}
{
"name": "plotly",
"citation": "Glanz, H., & Pileggi, S. 2018. Improving statistical communication in statistical computing courses. In M. A. Sorto, A. White, & L. Guyot (Eds.), Looking back, looking forward. Proceedings of the Tenth International Conference on Teaching Statistics (ICOTS10, July, 2018), Kyoto, Japan. Voorburg, The Netherlands: International Statistical Institute. <https://iase-web.org/icots/10/proceedings/pdfs/ICOTS10_3F1.pdf>",
"parts": {
"author": [
{
"family": "Glanz",
"given": "H."
},
{
"family": "Pileggi",
"given": "S."
}
],
"date": [
"2018",
"2018-07"
],
"title": "Improving statistical communication in statistical computing courses",
"url": "https://iase-web.org/icots/10/proceedings/pdfs/ICOTS10_3F1.pdf",
"type": "paper-conference",
"container-title": "Looking back, looking forward. Proceedings of the Tenth International Conference on Teaching Statistics (ICOTS10",
"location": "Kyoto, Japan. Voorburg, The Netherlands",
"publisher": "International Statistical Institute",
"editor": [
{
"family": "Sorto",
"given": "M.A."
},
{
"family": "White",
"given": "A."
},
{
"family": "Guyot",
"given": "L."
}
]
},
"url": "https://iase-web.org/icots/10/proceedings/pdfs/ICOTS10_3F1.pdf"
},
{
"name": "UCSCXenaTools",
"doi": "10.1155/2020/1932948",
"citation": "Kang, W., Zhang, M., Wang, Q., Gu, D., Huang, Z., Wang, H., … Jin, X. (2020). The SLC Family Are Candidate Diagnostic and Prognostic Biomarkers in Clear Cell Renal Cell Carcinoma. BioMed Research International, 2020, 1–17. <https://doi.org/10.1155/2020/1932948>",
"parts": {
"author": [
{
"family": "Kang",
"given": "W."
},
{
"family": "Zhang",
"given": "M."
},
{
"family": "Wang",
"given": "Q."
},
{
"family": "Gu",
"given": "D."
},
{
"family": "Huang",
"given": "Z."
},
{
"family": "Wang",
"given": "H."
},
{
"family": "Jin",
"given": "X."
},
{
"others": true
}
],
"container-title" - may need to curate that manually
When there's "others", I'd do et al. in italics. When there are more than 2 authors at all, it could go to et al.
Author names are shown in the random subset of cards and in the table: https://roweb3-hugo.netlify.app/citations/ Good point about et al.
@maelle made fixes:
"year" field. Is it okay if a missing date means the "year" field is missing? If the date is "in press", then the "year" field has that string.
"container-title"
See the updated citations_all_parts_clean.json file.
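For illustration, the "year" rule described here could look roughly like the Python sketch below. It assumes "date" is either a single string or a list of strings, as in the excerpts in this thread. The function name `year_from_date` is hypothetical, and this is not the code actually used to build the file.

```python
import re
from typing import Optional, Union


def year_from_date(date: Union[str, list[str], None]) -> Optional[str]:
    """Collapse a parsed "date" value into one "year" string, or None."""
    if date is None:
        return None
    values = [date] if isinstance(date, str) else list(date)
    for value in values:
        text = value.strip()
        # "in press" is passed through unchanged as the year.
        if text.lower() == "in press":
            return "in press"
        # Otherwise take a leading four-digit year, e.g. "2018-07" gives "2018".
        match = re.match(r"(\d{4})(?!\d)", text)
        if match:
            return match.group(1)
    # No usable date: the caller can leave the "year" field out.
    return None
```

With this rule, a "date" of `["2018", "2018-07"]` and a "date" of `"2018"` both end up as the same single year string.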
Thank you!!
I don't see the year field in https://ropenscilabs.github.io/ropensci_citations/citations_all_parts.json actually?
because that is wrong file, well done me...
How is data parsed btw? An internal to document :wink:
So in summary
{
"name": "rotl",
"doi": "10.1111/nph.16361",
"citation": "Godfrey, J. M., Riggio, J., Orozco, J., Guzmán‐Delgado, P., Chin, A. R. O., & Zwieniecki, M. A. (2020). Ray fractions and carbohydrate dynamics of tree species along a 2750 m elevation gradient indicate climate response, not spatial storage limitation. New Phytologist, 225(6), 2314–2330. <https://doi.org/10.1111/nph.16361>",
"parts": {
"author": [
{
"literal": "Godfrey, J. M., Riggio, J., Orozco, J., Guzmán‐Delgado, P., Chin, A. R. O., & Zwieniecki, M. A."
}
],
"date": "2020",
"title": "Ray fractions and carbohydrate dynamics of tree species along a 2750 m elevation gradient indicate climate response, not spatial storage limitation",
"volume": "225",
"pages": "2314–2330",
"url": "https://doi.org/10.1111/nph.16361",
"type": "article-journal",
"container-title": "New Phytologist",
"issue": "6",
Thanks again!
How is data parsed btw?
what do you mean?
I'll filter out those with no dates.
For authors, do you want the string just as you put above as a hash inside an array, or instead as a string to author key?
"author": "Godfrey, J. M., Riggio, J., Orozco, J., Guzmán‐Delgado, P., Chin, A. R. O., & Zwieniecki, M. A."
Okay, packages always as an array
I mean technically, what do you use to parse the citations data? :+1:
I don't understand the questions regarding authors.
Shouldn't the citation below be excluded? (it's from an online newspaper, not a scientific journal)
{
"name": "ropenaq",
"citation": "Munkhbat, Oyungerel. 2017. Putting a magnifying glass on air pollution. The UB Post. <http://theubpost.mn/2017/01/12/putting-a-magnifying-glass-on-air-pollution>",
"parts": {
"author": [
{
"family": "Munkhbat",
"given": "Oyungerel"
}
],
"date": "2017",
"title": "Putting a magnifying glass on air pollution",
"url": "http://theubpost.mn/2017/01/12/putting-a-magnifying-glass-on-air-pollution",
"type": "article-journal",
"container-title": "The UB Post"
},
"url": "http://theubpost.mn/2017/01/12/putting-a-magnifying-glass-on-air-pollution",
"year": "2017"
}
:wave: @sckott
note that the current page looks ok https://roweb3-hugo.netlify.app/citations/ minus the exceptions that further improvements of the dataset will fix. So I don't consider this a show stopper for website launch.
the authors question- do you want this
"author": [
{
"literal": "Godfrey, J. M., Riggio, J., Orozco, J., Guzmán‐Delgado, P., Chin, A. R. O., & Zwieniecki, M. A."
}
]
or this
"author": "Godfrey, J. M., Riggio, J., Orozco, J., Guzmán‐Delgado, P., Chin, A. R. O., & Zwieniecki, M. A."
that one citation is now excluded
filtered out citations where year=NA
Regarding authors, the second format seems easier to deal with. Could we add a rule that above ? authors the other authors are "et al." so that it might look good on https://roweb3-hugo.netlify.app/citations/?
@maelle okay, updated. author now a string, and using et al. for more than 2 authors, just using last names only
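A rough Python sketch of that author rule, for reference. It assumes the parsed author entries look like the ones quoted in this thread ("family"/"given" pairs, a single "literal" string, or an "others" marker). The function name `author_string` and the " & " separator for two authors are assumptions, not taken from the actual dataset code.

```python
from typing import Any


def author_string(authors: list[dict[str, Any]], max_names: int = 2) -> str:
    """Build a short author string from parsed entries, last names only."""
    # Entries like {"others": True} mark a truncated list, not a person.
    has_others = any(a.get("others") for a in authors)
    names = [a.get("family") or a.get("literal") for a in authors]
    names = [n for n in names if n]
    if not names:
        return ""
    # More than max_names authors (or a truncated list) collapses to et al.
    if has_others or len(names) > max_names:
        return f"{names[0]} et al."
    # Assumption: one or two authors are joined with " & ".
    return " & ".join(names)
```

Italicising "et al." is left to the website template, since the string itself carries no markup.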
Awesome. One last thing, "name" is still not an array in all cases? See e.g. https://github.com/ropenscilabs/ropensci_citations/blob/b974a97aacf208879ccefd78e5f0d647b8209878/citations_all_parts_clean.json#L11944
I tried to make things work with name sometimes a string, sometimes an array, but it'll be easier if it's always an array, sorry.
ok
name always an array now
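On the consuming side, a small defensive helper can still normalise "name" in case an older file is loaded. This is only a sketch in Python and the helper name `as_package_list` is hypothetical.

```python
from typing import Union


def as_package_list(name: Union[str, list[str], None]) -> list[str]:
    """Return "name" as a list, whether it was a string, a list, or missing."""
    if name is None:
        return []
    if isinstance(name, str):
        return [name]
    return list(name)
```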
Thank you!! And thanks for your patience creating this dataset and dealing with my requests!!
Thanks @sckott for https://raw.githubusercontent.com/ropenscilabs/ropensci_citations/master/citations_all_parts.json
I'd need a slightly different dataset to make things easier for the website
I'd need one entry by paper, where name is e.g. "package1, package2" when there are several packages (a string directly).
I'd need a field called url, and the dataset would only contain papers with URLs. A URL can be the DOI URL. At the moment some URLs are in notes.
Would it be possible for you to generate this dataset?
Thanks in advance! :pray:
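To make the request concrete, here is a minimal Python sketch of the reshaping described above: one entry per paper, "name" as a comma-separated string, and a "url" field that falls back to the DOI URL. It assumes entries have "name" plus optional "url" and "doi" fields, and that two entries with the same URL are the same paper. The function names are hypothetical, and URLs that currently sit in notes are not handled here.

```python
from typing import Any, Optional


def paper_url(entry: dict[str, Any]) -> Optional[str]:
    """Return the entry's URL, falling back to the DOI URL if there is a DOI."""
    if entry.get("url"):
        return entry["url"]
    doi = entry.get("doi")
    return f"https://doi.org/{doi}" if doi else None


def one_entry_per_paper(entries: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge entries that share a URL and join their package names."""
    papers: dict[str, dict[str, Any]] = {}
    packages: dict[str, list[str]] = {}
    for entry in entries:
        url = paper_url(entry)
        if url is None:
            continue  # only papers with a URL are kept
        # The first entry seen for a URL supplies the citation fields.
        papers.setdefault(url, {**entry, "url": url})
        names = packages.setdefault(url, [])
        name = entry.get("name", [])
        for pkg in ([name] if isinstance(name, str) else name):
            if pkg not in names:
                names.append(pkg)
    return [
        {**paper, "name": ", ".join(packages[url])}
        for url, paper in papers.items()
    ]
```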