openbases / openbases-python

Open Bases python helpers for https://openbases.github.io
https://openbases.github.io/openbases-python/

paper references minimum fields #16

Closed vsoch closed 6 years ago

vsoch commented 6 years ago

Hey @arfon! What would you consider the minimum required fields for references (meaning present and defined)? So far I have a type (e.g., article) and title, month, and year. But we might want some additional checks, like giving a warning if a doi is missing, or requiring a journal OR url. Let me know your thoughts! Just for perusing, here is the listing of fields for the current references I'm testing with. I'm not saying this is right or wrong, but rather a "real world" example:

['title', 'abstract', 'publisher', 'month', 'year']
['title', 'affiliation', 'abstract', 'journal', 'volume', 'number', 'pages', 'month', 'year', 'language']
['title', 'booktitle', 'abstract', 'month', 'year', 'howpublished', 'note']
['title', 'abstract', 'institution']
['title', 'abstract', 'journal', 'publisher', 'volume', 'month', 'year', 'keywords']
['title', 'abstract', 'howpublished', 'note']
['title', 'abstract', 'journal']
['title', 'abstract', 'journal', 'volume', 'number', 'pages', 'month', 'year']
['title', 'month', 'year', 'note', 'doi', 'url']
['title', 'year']
['title', 'affiliation', 'abstract', 'journal', 'volume', 'number', 'pages', 'month', 'year']
['title', 'publisher', 'series', 'year']
['title', 'journal', 'publisher', 'volume', 'number', 'month', 'year', 'address']
['title', 'abstract', 'journal', 'publisher', 'pages', 'doi', 'month', 'year', 'language']
['title', 'affiliation', 'abstract', 'journal', 'publisher', 'volume', 'pages', 'month', 'year', 'keywords', 'language', 'doi']
['title', 'affiliation', 'abstract', 'journal', 'publisher', 'volume', 'pages', 'month', 'year', 'doi', 'keywords', 'language']
['title', 'abstract', 'month', 'year', 'archivePrefix', 'primaryClass', 'doi', 'eprint']
['title', 'abstract', 'journal', 'publisher', 'volume', 'month', 'year', 'doi']
['title', 'abstract', 'month', 'year', 'archivePrefix', 'primaryClass', 'eprint']

And given my validation, the simple test passes for all of the above:

Testing bibliography entry Stodden2010-cu
  type: article
  title: The Scientific Method in Practice: Reproducibility in the Computational Sciences
Testing bibliography entry Ram2013-km
  type: article
  title: Git can facilitate greater reproducibility and increased transparency in science
Testing bibliography entry noauthor_2015-ig
  type: misc
  title: Docker-based solutions to reproducibility in science - Seven Bridges
Testing bibliography entry noauthor_undated-pi
  type: misc
  title: expfactory-docker
Testing bibliography entry Sochat2016-pu
  type: article
  title: The Experiment Factory: Standardizing Behavioral Experiments
Testing bibliography entry noauthor_undated-sn
  type: misc
  title: Science is in a reproducibility crisis: How do we resolve it?
Testing bibliography entry Baker_undated-bx
  type: article
  title: Over half of psychology studies fail reproducibility test
Testing bibliography entry Open_Science_Collaboration2015-hb
  type: article
  title: {PSYCHOLOGY}. Estimating the reproducibility of psychological science
Testing bibliography entry vanessa_sochat_2017_1059119
  type: misc
  title: {expfactory/expfactory: The Experiment Factory (v3.0) Release}
Testing bibliography entry McDonnell2012-ns
  type: misc
  title: psiTurk (Version 1.02)[Software]. New York, {NY}: New York University
Testing bibliography entry De_Leeuw2015-zw
  type: article
  title: jsPsych: a {JavaScript} library for creating behavioral experiments in a Web browser
Testing bibliography entry Smith2005-kg
  type: book
  title: Virtual Machines: Versatile Platforms for Systems and Processes
Testing bibliography entry Merkel2014-da
  type: misc
  title: Docker: Lightweight Linux Containers for Consistent Development and Deployment
Testing bibliography entry Ali2016-rh
  type: article
  title: The Case for Docker in Multicloud Enabled Bioinformatics Applications
Testing bibliography entry Moreews2015-dy
  type: article
  title: {BioShaDock}: a community driven bioinformatics shared Docker-based tools registry
Testing bibliography entry Belmann2015-eb
  type: article
  title: Bioboxes: standardised containers for interchangeable bioinformatics software
Testing bibliography entry Boettiger2014-cz
  type: article
  title: An introduction to Docker for reproducible research, with examples from the {R} environment
Testing bibliography entry Santana-Perez2015-wo
  type: article
  title: Towards Reproducibility in Scientific Workflows: An {Infrastructure-Based} Approach
Testing bibliography entry Wandell2015-yt
  type: article
  title: Data management to support reproducible research
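
The checks floated above (a small set of required fields, a warning for a missing doi, and a journal OR url rule) could be sketched roughly as follows. This is only an illustration: `check_entry`, `REQUIRED`, and `RECOMMENDED` are hypothetical names rather than part of openbases-python, the choice of which fields are hard requirements versus warnings is an assumption, and the sample entry is made up.

```python
# Hypothetical sketch; names and rules are assumptions, not openbases-python API.
REQUIRED = ("type", "title")      # assumed hard requirements
RECOMMENDED = ("year", "doi")     # assumed to trigger warnings only


def check_entry(entry):
    """Return (errors, warnings) for one reference given as a dict."""

    def present(field):
        # "Present and defined": the key exists and the value is not blank.
        value = entry.get(field)
        return value is not None and str(value).strip() != ""

    errors = [f"missing required field: {f}" for f in REQUIRED if not present(f)]
    warnings = [f"missing recommended field: {f}" for f in RECOMMENDED if not present(f)]

    # The "journal OR url" idea: at least one of the two should be there.
    if not (present("journal") or present("url")):
        warnings.append("missing both journal and url")
    return errors, warnings


# Made-up entry, for illustration only.
errors, warnings = check_entry({"type": "article", "title": "Example title", "year": "2018"})
print(errors)
print(warnings)
```

Keeping errors and warnings in separate lists would let a test fail only on the hard requirements while still reporting the softer problems.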

arfon commented 6 years ago

I don’t know to be honest. In the Whedon gem we use a BibTeX library that validates these entries. Perhaps you could do the same?

vsoch commented 6 years ago

Yeah! The bibtex library I use validates the formatting (I caught several errors in the paper that was approved for JOSS, oops!) but I'm wondering about human validation. For example, lots of papers have missing dates that render as null (or similar), and it would be good to have some simple logic like: the minimum criterion for a web page is a url, a journal article needs the name of the journal, etc. Do you have thoughts about that?
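
As a rough sketch of that kind of logic, the snippet below treats null-like values as missing and applies the two rules mentioned (a url for a web page, a journal name for an article). Everything here is an assumption for illustration: the names `is_missing`, `check_type_rules`, and `TYPE_RULES` are hypothetical, the placeholder strings are guesses at what "null (or similar)" might look like, and mapping web pages to the `misc` type is a guess as well.

```python
# Hypothetical sketch; placeholder strings and the type-to-field map are assumptions.
PLACEHOLDERS = {"", "null", "none"}

# Assumed mapping: a web page is stored as a "misc" entry.
TYPE_RULES = {
    "article": ("journal",),
    "misc": ("url",),
}


def is_missing(value):
    """True if the value is absent or only a null-like placeholder."""
    if value is None:
        return True
    # Drop surrounding whitespace and BibTeX braces before comparing.
    cleaned = str(value).strip().strip("{}").strip().lower()
    return cleaned in PLACEHOLDERS


def check_type_rules(entry):
    """Return a list of human-readable problems for one entry dict."""
    kind = str(entry.get("type", "")).lower()
    return [
        f"{kind} entry needs a {field}"
        for field in TYPE_RULES.get(kind, ())
        if is_missing(entry.get(field))
    ]


# Made-up entries, for illustration only.
print(check_type_rules({"type": "article", "title": "Example", "journal": "null"}))
print(check_type_rules({"type": "misc", "title": "Example", "url": "https://example.org"}))
```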

arfon commented 6 years ago

It’s a good idea for sure. The challenge I have is that I’m no bibtex expert :-/

More generally, any rules you might end up generating here would be good to capture in the Whedon bot too so we can do those checks on submissions that don’t use this tooling.

vsoch commented 6 years ago

okey doke, let me do some more digging then to see if there is some standard somewhere (I'm sure there is) for fields that "should be there" for different types of publications. I'll keep the issue open here for further discussion.
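
One candidate for such a standard is the classic BibTeX entry-type definitions, which list required fields per type and allow alternatives (for example, a book needs an author or an editor). The sketch below encodes a few of those types; it is not a statement of what this issue ended up using, `missing_groups` and `BIBTEX_REQUIRED` are hypothetical names, and only a subset of entry types is shown.

```python
# Hypothetical sketch: a subset of the classic BibTeX required fields.
# Each inner tuple is a group of alternatives; one field from it is enough.
BIBTEX_REQUIRED = {
    "article": [("author",), ("title",), ("journal",), ("year",)],
    "book": [("author", "editor"), ("title",), ("publisher",), ("year",)],
    "inproceedings": [("author",), ("title",), ("booktitle",), ("year",)],
    "techreport": [("author",), ("title",), ("institution",), ("year",)],
    "misc": [],  # misc has no required fields
}


def missing_groups(entry_type, fields):
    """Return the unmet requirement groups, or None for an unknown type."""
    groups = BIBTEX_REQUIRED.get(entry_type.lower())
    if groups is None:
        return None
    present = {f.lower() for f in fields}
    return [" or ".join(group) for group in groups if not present.intersection(group)]


# Made-up field list, for illustration only.
print(missing_groups("article", ["title", "author", "year"]))
```

Returning `None` for an unknown type keeps "nothing is missing" distinct from "no rule exists for this type".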

vsoch commented 6 years ago

Closing issue so @arfon is not bothered - I'll handle this.