dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org

improve ease of citation for dandisets #1555

Open bendichter opened 1 year ago

bendichter commented 1 year ago

I am writing a proposal in Google Docs and I want to cite a dandiset. I am using paperpile to manage citations, but was not able to find dandisets based on the DOI.

image

Then I went to add the citation manually. It would be easiest to copy/paste a RIS or BibTeX entry, but I do not see a button for generating one on DANDI. BibTeX is used not only by Paperpile but also by LaTeX and many citation management systems. Instead, I see a pre-formatted citation, which makes assumptions about how I want to format the citation and does not let me easily use a citation management system. In the end, I had to use Paperpile's manual entry form, which is onerous and error-prone:

image

I believe improvements to this workflow will lower the energy barrier for citing dandisets and could substantially increase the number of dandiset citations in peer-reviewed literature. This would in turn be a powerful statistic we could use to encourage good dandiset submissions.

There are several possible improvements to this workflow:

  1. Why can't Paperpile find the DOI and automatically import the citation? Is there some registry we can join to make that possible? This would have been the easiest solution for me. It would also be useful to make it easy to copy/paste the DOI. Right now I can only find it at the end of the generated citation.
  2. Create a button to generate a RIS or BibTeX for a dandiset that can be copy/pasted into citation management software. Here are the options available for FigShare:
image
  3. Automatically generate citations in different styles. This would be useful for people who do not use citation management software. FigShare allows you to format your citation in hundreds of different styles:
image image

satra commented 1 year ago

@bendichter - did you try a proper doi with paperpile? the one in your screenshot is a fake one.

bendichter commented 1 year ago

@satra sorry, what do you mean by fake?

bendichter commented 1 year ago

Oh, are "fake" DOIs given to unpublished dandisets? I think that is pretty confusing. When I see a DOI I assume it is legitimate.

bendichter commented 1 year ago

And yes, it did (kind of) work for real DOIs:

image

However, as far as I can tell, none of the metadata of the dandiset was fetched, including title, date of publication, authors, etc.

bendichter commented 5 months ago

I just wanted to make a note here that I am currently working on a grant submission that cites a bunch of dandisets, and it is quite laborious to manually populate all of this information for each one. It would help enormously if we could generate a BibTeX entry for automatic import into citation management systems.

yarikoptic commented 5 months ago

I love that idea. It should be quite "trivial" to produce/update, I think, by simply going through all dandisets and releases, querying their DOIs (unless the record is already known), and at the end outputting one gargantuan BibTeX file with all the entries.

Meanwhile here is the helper I use all the time http://git.oneukrainian.com/?p=etc/bash.git;a=blob;f=.bash/bashrc/30_aliases_sh;hb=HEAD#l596

doiref () {
  # Get the BibTeX record for a DOI, copy it to the clipboard, and print it.
  # Accepts a bare DOI or a full doi.org URL; requires curl and xclip.
  doi=$(echo "$1" | sed -e 's,https*://.*doi\.org/,,g')
  curl --silent -L -d "" --header "Accept: application/x-bibtex; charset=utf-8" "https://doi.org/$doi" \
   | sed -e 's,%2F,/,g' | xclip -i
  xclip -o
}

So I just copy/paste the DOI into the CLI as the argument to this function and get a BibTeX entry to paste.
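The batch idea mentioned above (one BibTeX file covering many DOIs, skipping records that are already known) could be sketched roughly as follows. This is only a sketch under assumptions: `fetch_bibtex` and `build_bib` are hypothetical names, the input is assumed to be a plain text file with one DOI or doi.org URL per line, and a plain GET with an `Accept` header is used for content negotiation instead of the `-d ""` form in the helper.

```shell
# Sketch: build one .bib file from a list of DOIs (one DOI or doi.org URL per line).
# fetch_bibtex and build_bib are hypothetical names, not part of any DANDI tooling.

fetch_bibtex () {
  # Ask doi.org for a BibTeX record via content negotiation; fail on HTTP errors.
  curl --silent --fail -L --header "Accept: application/x-bibtex; charset=utf-8" \
    "https://doi.org/$1"
}

build_bib () {
  # $1: file listing DOIs, $2: output .bib file (appended to, never truncated)
  list=$1; bib=$2
  touch "$bib"
  while IFS= read -r line || [ -n "$line" ]; do
    # Reduce a doi.org URL to the bare DOI.
    doi=$(printf '%s\n' "$line" | sed -e 's,^https*://.*doi\.org/,,')
    [ -n "$doi" ] || continue
    # Skip DOIs whose record is already in the file (case-insensitive match).
    if grep -qiF -e "$doi" "$bib"; then continue; fi
    if entry=$(fetch_bibtex "$doi"); then
      printf '%s\n\n' "$entry" >> "$bib"
    else
      echo "failed to fetch $doi" >&2
    fi
  done < "$list"
}
```

Assuming a file `dois.txt`, a call like `build_bib dois.txt dandisets.bib` would append only the entries that are not yet in `dandisets.bib`, so it can be re-run as new releases appear.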

bendichter commented 5 months ago

@yarikoptic interesting. Could you demonstrate how I would use this tool by applying it to a published dandiset? Let's say for example this one

bendichter commented 5 months ago

I tried to auto-fetch the citation data from the DOI using paperpile and didn't get any of the relevant metadata. It just listed that DOI as a website with no authors, publication date, etc. The title was just "DANDI Archive".

yarikoptic commented 5 months ago

I clicked on the clipboard icon (image), pasted into the terminal, and got a response in a second or so:

❯ doiref https://doi.org/10.48324/dandi.000897/0.240605.1710
@misc{https://doi.org/10.48324/dandi.000897/0.240605.1710,
  doi = {10.48324/DANDI.000897/0.240605.1710},
  url = {https://dandiarchive.org/dandiset/000897/0.240605.1710},
  author = {Neupane, Sujaya and Fiete, Ila and Jazayeri, Mehrdad},
  keywords = {entorhinal cortex, cognitive map, mental navigation,},
  title = {Neupane_Fiete_Jazayeri_Mental navigation_NHP_EntorhinalCortex},
  publisher = {DANDI Archive},
  year = {2024}
}

It was also already in my clipboard, so I could paste it into a .bib file and tune up the ID.
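For the "tune up the ID" step, a small filter along these lines might do. It is a sketch, not part of the helper above: `bib_rekey` is a hypothetical name, it assumes the entry starts on the first line of its input (as in the record shown), and the new key must not contain `|`, `&`, or backslashes.

```shell
# Sketch: replace the citation key of a BibTeX entry read from stdin.
# bib_rekey is a hypothetical helper name; the new key is passed as $1.
bib_rekey () {
  # Rewrite "@type{oldkey," on the first input line; all other lines pass through.
  sed -e "1s|^\(@[A-Za-z]*{\).*,\$|\1$1,|"
}
```

With the record still in the clipboard, something like `xclip -o | bib_rekey neupane2024` would print the entry with a shorter key (the key name here is only an illustration).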

yarikoptic commented 5 months ago

I think we might want to improve our schema -- do you think there is a place for underscores in the title? ;-)

bendichter commented 5 months ago

oh wow that's awesome. I'm going to try this right now

bendichter commented 5 months ago

> I think we might want to improve our schema -- do you think there is a place for underscores in the title? ;-)

Let's save that question for another issue

bendichter commented 5 months ago

I have an update here.

Paperpile is now able to provide some pretty good metadata given the DOI.

image

All of that was automatically pulled in from the DOI. This has improved substantially since I opened the issue.

I still like the idea of providing BibTeX on the DLP, but now that it can be easily extracted from the DOI via common citation managers as well as @yarikoptic's CLI approach, this is of much lower importance to me than it was previously.

bendichter commented 5 months ago

btw @yarikoptic your CLI function worked for me. Thanks!

yarikoptic commented 5 months ago

and damn you -- you got me to chat with ChatGPT and write a prototype. I will post it shortly, maybe with some .bib to accompany it...

waxlamp commented 3 months ago

@bendichter, what's a good action to take here? To me it seems like adding the ability to generate a BibTeX entry would be high-value, low-hanging fruit. Is that right? If we start with that, we can lay the groundwork for generating citations in other styles as well.

Any other ideas coming from this issue? My goal will be to close this in favor of a collection of smaller-scoped issues we can hit one by one.

satra commented 3 months ago

a cff-based widget to reformat the citation should be generally available across many sites. and cff can easily be generated based on dandiset metadata. i would say not just bibtex, but any citation style as the widget that ben posted above does.

in general this issue has two components:

  1. can we get citation metadata from doi. i believe that is confirmed to be a yes, since we use datacite.
  2. can we render that metadata in a way that users find useful (i.e. different citation styles). this is the actionable item from this issue for now.

there is a third element on user experience and how much information is there on the DLP, and what we should prioritize through a redesign of the UI elements. adding a chatbot/RAG could help there. but let's put that as a different update.
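To make the "cff can easily be generated based on dandiset metadata" point concrete, here is a rough sketch that prints a minimal CITATION.cff. Everything about its interface is an assumption: `make_cff` is a hypothetical helper, it takes the fields as plain arguments rather than reading the DANDI schema, and it does no YAML escaping (a title containing double quotes would need extra handling).

```shell
# Sketch: emit a minimal CITATION.cff (CFF 1.2.0) for a dataset.
# make_cff is a hypothetical helper. Arguments: title, DOI, URL, release date
# (YYYY-MM-DD), then one "Family, Given" string per author.
make_cff () {
  title=$1; doi=$2; url=$3; released=$4
  shift 4
  printf 'cff-version: 1.2.0\n'
  printf 'message: "If you use this dataset, please cite it as below."\n'
  printf 'type: dataset\n'
  printf 'title: "%s"\n' "$title"
  printf 'doi: %s\n' "$doi"
  printf 'url: "%s"\n' "$url"
  printf 'date-released: %s\n' "$released"
  printf 'authors:\n'
  for author in "$@"; do
    # Split "Family, Given" at the first ", ".
    printf '  - family-names: "%s"\n' "${author%%, *}"
    printf '    given-names: "%s"\n' "${author#*, }"
  done
}
```

The output could then be handed to any CFF-aware citation widget or converter; which one to use is left open here.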

yarikoptic commented 3 months ago

+1 for BibTeX records.

Ultimately, IMHO we should aim for something like what Zenodo offers (for example, https://zenodo.org/records/11201247):

  1. Formatted citations for cut/paste into papers directly ("professionals" should not do that and should just go through citation managers, but it is convenient for quick, one-off use).
  2. Citations in different formats (for import into citation managers).
image image

Note that the aforementioned interface supports different citation formats right away, listed under "Supported Content Types" at https://citation.crosscite.org/docs.html, so IMHO it is worth supporting at least the two most common from the start: BibTeX and RIS. Here they are for a sample dandiset:

❯ for t in application/x-bibtex application/x-research-info-systems; do curl --silent -L -d "" --header "Accept: $t; charset=utf-8" https://doi.org/10.48324/dandi.000897/0.240605.1710; done
@misc{https://doi.org/10.48324/dandi.000897/0.240605.1710,
  doi = {10.48324/DANDI.000897/0.240605.1710},
  url = {https://dandiarchive.org/dandiset/000897/0.240605.1710},
  author = {Neupane, Sujaya and Fiete, Ila and Jazayeri, Mehrdad},
  keywords = {entorhinal cortex, cognitive map, mental navigation,},
  title = {Neupane_Fiete_Jazayeri_Mental navigation_NHP_EntorhinalCortex},
  publisher = {DANDI Archive},
  year = {2024}
}
TY  - DATA
T1  - Neupane_Fiete_Jazayeri_Mental navigation_NHP_EntorhinalCortex
AU  - Neupane, Sujaya
AU  - Fiete, Ila
AU  - Jazayeri, Mehrdad
DO  - 10.48324/DANDI.000897/0.240605.1710
UR  - https://dandiarchive.org/dandiset/000897/0.240605.1710
AB  - The dataset contains electrophysiology data recorded from the entorhinal cortex of two NHPs performing a mental navigation task. The recording probes used were V-probe with 32 channels or 64 channels, manufactured by Plexon Inc. 
KW  - entorhinal cortex, cognitive map, mental navigation,
PY  - 2024
PB  - DANDI Archive
ER  -

Formatted text also looks good albeit needs cleaning:

❯ for t in text/x-bibliography; do curl --silent -L -d "" --header "Accept: $t; charset=utf-8" https://doi.org/10.48324/dandi.000897/0.240605.1710; done
Neupane, S., Fiete, I., &amp; Jazayeri, M. (2024). <i>Neupane_Fiete_Jazayeri_Mental navigation_NHP_EntorhinalCortex</i> [Data set]. DANDI Archive. https://doi.org/10.48324/DANDI.000897/0.240605.1710
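The cleaning needed for the formatted text is mostly stripping the HTML markup and decoding entities. A minimal sketch, assuming only simple tags and the handful of entities listed (`clean_citation` is a hypothetical name):

```shell
# Sketch: tidy a text/x-bibliography response for pasting as plain text.
# clean_citation is a hypothetical helper name; it reads stdin and writes stdout.
clean_citation () {
  # Drop HTML tags such as <i>...</i>, then decode a few common entities.
  # &amp; is decoded last so that "&amp;lt;" is not decoded twice.
  sed -e 's/<[^>]*>//g' \
      -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&quot;/"/g' \
      -e 's/&amp;/\&/g'
}
```

Piping the curl output above through `clean_citation` would leave the authors joined by a plain `&` and the title without the `<i>` markup; the underscores in the title would of course remain.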