IQSS / dataverse

Open source research data repository software
http://dataverse.org

Development of metadata-export feature is very Important! #6471

Open slametriyanto opened 4 years ago

slametriyanto commented 4 years ago

Research in scientometrics and bibliometrics has many aspects to measure, e.g. co-authorship patterns, keyword occurrence, social network analysis, co-citation patterns, co-word analysis, etc.

In some research papers that I have read, many researchers simply use metadata exported from the Scopus or SpringerLink databases in various formats such as RIS, CSV, BibTeX, and plain text.

It would be an achievement if the Dataverse Project developed a feature similar to the one in Scopus: the search results can be exported in a specific format, and that file can then be analyzed using VOSViewer or WEKA (this software is open source). I think no application has provided that feature so far. Zenodo, Dryad, and CKAN are data repository software similar to Dataverse.

[Screenshots: search-result, scopus-export, vosviewer-1, vosviewer-2, vosviewer-3, vosviewer-4]

jggautier commented 4 years ago

Just some thoughts and initial questions:

I've only played with VOSViewer, which your last four screenshots are showing, but I can see there are multiple formats that the tool can take, like RIS and Endnote files. Through the Dataverse UI you can export RIS and Endnote files for datasets individually, but we'd need a way to let people use the UI to download the citation files of multiple datasets. We'd need to figure out how people would decide which datasets to include -- would it be the citation files of datasets in the results of a particular search, similar to how Web of Science works, or the citation files of all datasets in a dataverse, or either?
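To make the "citation files of multiple datasets" idea concrete, here is a minimal sketch that turns a list of dataset records into one RIS file. It is not Dataverse's own RIS exporter. The record keys (`name`, `authors`, `keywords`, `published_at`, `global_id`) are assumptions modelled loosely on what a search result item might contain, and `to_ris` / `records_to_ris` are hypothetical helper names.

```python
def to_ris(record):
    """Render one dataset record (a plain dict) as a single RIS entry.

    The dict keys used here are assumptions, not a documented schema.
    """
    lines = ["TY  - DATA"]  # RIS reference type for datasets
    if record.get("name"):
        lines.append(f"TI  - {record['name']}")
    for author in record.get("authors", []):
        lines.append(f"AU  - {author}")
    for keyword in record.get("keywords", []):
        lines.append(f"KW  - {keyword}")
    published = record.get("published_at", "")
    if published:
        # Keep only the year from an ISO-style timestamp such as 2020-02-25T...
        lines.append(f"PY  - {published[:4]}")
    doi = record.get("global_id", "")
    if doi:
        # Strip a leading "doi:" scheme if present.
        lines.append(f"DO  - {doi[4:] if doi.startswith('doi:') else doi}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


def records_to_ris(records):
    """Join many RIS entries into one file body, one blank line between entries."""
    return "\n\n".join(to_ris(r) for r in records) + "\n"
```

Whether the records come from one search or from everything in a dataverse, the conversion step would be the same. Only the code that collects the records would differ.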

Is there metadata that's not common in RIS and Endnote files, or that isn't in Dataverse's RIS/Endnote files, that people would want to analyze with these tools? If so, perhaps we'd have to add metadata to the Dataverse's existing exports or provide additional export formats, like a CSV file containing the values of fields that the user chooses.
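As a rough illustration of the "CSV file containing the values of fields that the user chooses" option, the sketch below writes one row per dataset and one column per chosen field. The field names are whatever keys the records happen to carry (an assumption here), and `records_to_csv` is a hypothetical helper, not an existing Dataverse export.

```python
import csv
import io


def records_to_csv(records, fields, sep="; "):
    """Return CSV text with one row per record and one column per chosen field.

    Multi-valued fields (lists or tuples, e.g. several authors) are joined
    with `sep` so that each dataset stays on a single row.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # header row
    for record in records:
        row = []
        for field in fields:
            value = record.get(field, "")  # missing fields become empty cells
            if isinstance(value, (list, tuple)):
                value = sep.join(str(v) for v in value)
            row.append(value)
        writer.writerow(row)
    return buffer.getvalue()
```

Joining multi-valued fields into one cell is only one possible design. A tool that expects one author per row would need a different layout.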

A while back, an organization showed off a co-authorship network graph they made for their Dataverse repository (which I think pulls metadata from their repository on some schedule). I can't seem to find where it's discussed (I bet @pdurbin could =), but I'm wondering if the work they did to prepare the metadata for that graph could be built upon for this (so that anyone can visit a Dataverse repository, download the metadata of some collection of datasets in some format, and do bibliometrics research using one of these tools, like VOSViewer).

Another way to get the metadata of multiple research objects into this tool is to provide a list of DOIs; VOSViewer then uses Crossref APIs to grab the metadata for those DOIs (shown in one of the screenshots). I wonder if the tool's authors are interested in doing the same for DataCite DOIs (or handles)...
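For the DOI-list route, a small sketch of what a lookup could look like: it normalizes identifiers and builds one metadata URL per DOI. The two base URLs are the public Crossref and DataCite REST endpoints. The helper names are hypothetical, handles are not covered, and this says nothing about how VOSViewer itself implements its lookup.

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"
DATACITE_DOIS = "https://api.datacite.org/dois/"

# Common ways a DOI is written before the bare "10.xxxx/..." part.
_PREFIXES = (
    "doi:",
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(identifier):
    """Strip whitespace and a leading scheme or resolver prefix from a DOI."""
    value = identifier.strip()
    lowered = value.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            return value[len(prefix):]
    return value


def metadata_urls(identifiers, base=DATACITE_DOIS):
    """Build one REST lookup URL per DOI, keeping '/' unescaped in the path."""
    return [base + quote(normalize_doi(i), safe="/") for i in identifiers]
```

Passing `base=CROSSREF_WORKS` would target Crossref-registered DOIs instead. Fetching and parsing the responses is left out on purpose.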

Lastly, Dataverse users have asked for the ability to download the metadata of multiple datasets (without needing to know how to write scripts that use Dataverse's APIs) to do their own metadata analysis and reports, and I think this is related, although the way the metadata is formatted might be different.
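For reference, the kind of script people currently have to write might look roughly like this: a sketch that pages through the Search API (`/api/search` with `q`, `type`, `start` and `per_page`) and yields each dataset item. The response shape it relies on (`data.items` and `data.total_count`) and the `iter_datasets` helper are assumptions to check against the API guide for your Dataverse version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base_url, query, start=0, per_page=100):
    """Build a Search API URL restricted to datasets."""
    params = urlencode(
        {"q": query, "type": "dataset", "start": start, "per_page": per_page}
    )
    return f"{base_url.rstrip('/')}/api/search?{params}"


def fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_datasets(base_url, query, fetch=fetch_json, per_page=100):
    """Yield every dataset item matching `query`, one page at a time.

    `fetch` is injectable so the paging logic can be tried without a network.
    """
    start = 0
    while True:
        data = fetch(search_url(base_url, query, start, per_page))["data"]
        items = data.get("items", [])
        yield from items
        start += len(items)
        # Stop on an empty page or once every reported result has been seen.
        if not items or start >= data.get("total_count", 0):
            break
```

Something like `list(iter_datasets("https://demo.dataverse.org", "climate"))` would collect the items for one search. The server URL and query there are only examples.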

djbrooke commented 4 years ago

Hi @slametriyanto, is there a specific new feature that we should add to Dataverse to support metadata export? We already export dataset metadata and citation metadata in a few different formats.

poikilotherm commented 4 years ago

Would such an export be a use-case for poikilotherm/dvcli? Feel free to add to https://docs.google.com/document/d/1pyIC_h61SG8FoSmldZyW3WeElHPZyC-qR7_FWqqhDwI/edit?usp=sharing

pdurbin commented 4 years ago

> A while back, an organization showed off a co-authorship network graph they made for their Dataverse repository (which I think pulls metadata from their repository on some schedule).

@jggautier you might be thinking of Collaboration Dataverse. You can find screenshots at https://groups.google.com/d/msg/dataverse-community/RTFMdFsuOfY/GWWvoBOqAAAJ and a description at http://guides.dataverse.org/en/4.19/admin/reporting-tools.html

@djbrooke it might help you (and others) to see the original conversation between me and @slametriyanto which I don't think he'll object to me putting here:

[Seven screenshots of the original conversation, taken 2020-02-25 between 4:14 PM and 4:16 PM]

pdurbin commented 9 months ago

Related: