dzhw / metadatamanagement

Metadatamanagement (MDM) - Data Search for Higher Education Research and Science Studies
https://metadata.fdz.dzhw.eu
GNU Affero General Public License v3.0

As a public user I want an export function for citation hints of subDataSets #2232

Closed rbirkelbach closed 4 years ago

rbirkelbach commented 5 years ago

When clicking on the citation hints, we could provide an export function for BibTeX, EndNote XML and other reference manager formats.

rreitmann commented 5 years ago

This is related to #2173.

We suggest that we implement a CitationHintGenerator that generates the citationHint automatically from the rest of our metadata. If we had this we could export it to different formats as well. However, at the moment the algorithm for creating a citationHint automatically has not been defined.

AndyDaniel1 commented 5 years ago

Let's solve #2173 first and then come back to this issue if it has not already been solved by then.

rreitmann commented 5 years ago

@AndyDaniel1 IMHO the CitationHintGenerator would be a solution for #2173 too...

AndyDaniel1 commented 5 years ago

Ok, thanks for the feedback. Then we flip the prio of both issues

rreitmann commented 4 years ago

@rbirkelbach please provide the algorithm for computing a citation for a configured data package!

UteH commented 4 years ago

https://github.com/dzhw/FDZ_Allgemein/issues/328

rbirkelbach commented 4 years ago

@DilekIkiz proposals are: [Autorinnen der Studie] (Veröffentlichungsdatum des Datensatzes). [Titel]. Datenerhebung: [YYYY oder YYYY/YYYY]. Version: [X.Y.Z]. [Veröffentlichungsort]: [Publikationsagent]. Datenkuratierung: [Namen der Datenkuratorinnen]. doi: [Identifikator].

[Creator] ([PublicationYear]). [Title]. Data Collection [Date, dateType, Collected]: [YYYY or YYYY/YYYY]. Version: [X.Y.Z]. [geoLocationPlace]: [Publisher]. Data Curation [Contributor]: [contributorName]. doi: [Identifier].

As we want to cite the configured data package I'd propose the following:

[Autorinnen der Studie] (first release date). [Titel]. Datenerhebung: [YYYY oder YYYY-YYYY]. Version: [X.Y.Z]. Datenpaketzugangsweg: [CUF/Onsite-Suf/Remote-Suf]. [Veröffentlichungsort]: [Publikationsagent]. Datenkuratierung: [dataCurators]. doi: [Identifikator].

[Creator] ([first release date]). [Title]. Data Collection: [YYYY or YYYY-YYYY]. Version: [X.Y.Z]. Data Package Access Way: [Cuf/Onsite-Suf/Remote-Suf]. [PublicationPlace]: [Publisher]. Data Curation: [dataCurators]. doi: [Identifier].

rbirkelbach commented 4 years ago

We need a new field at the study level: dataCurators

rbirkelbach commented 4 years ago

@DilekIkiz wrote: "Since we will always be the publisher whenever we curate data, in my view "Hannover: FDZ-DZHW" is a fixed/unchangeable component of our data citation."

rbirkelbach commented 4 years ago

@DilekIkiz @AndyDaniel1 : @rreitmann told me that you now do not want the accessWays. What's the reason for changing your stance? In my opinion this is together with the version number and title the most important information.

rbirkelbach commented 4 years ago

@DilekIkiz could you please respond? I still think that we should include the accessWay, as this can have impact on the analysis and then it's good to know what exact subdataset was used.

DilekIkiz commented 4 years ago

As I recall, the primary concern is not to overload the data citation. accessWays is an element that is relevant to internal processes, but not to the data user. This information can also be found in the MDM and the data usage contract. In addition, accessWays can also change and can therefore render the citation unusable. Overall, @danbfdz and Andy have therefore jointly decided not to include this information in the data citation. But I think @AndyDaniel1 can explain it to you better.

rbirkelbach commented 4 years ago

As discussed yesterday with @AndyDaniel1 we will keep the accessWay variable, as otherwise one cannot identify the level of granularity of the data in research which uses our data. Hence, we should use the following:

[Autorinnen der Studie] (first release date). [Titel]. Datenerhebung: [YYYY oder YYYY-YYYY]. Version: [X.Y.Z]. Datenpaketzugangsweg: [CUF/Onsite-SUF/Remote-SUF]. Hannover: FDZ-DZHW. Datenkuratierung: [dataCurators]. doi: [Identifikator].

[Authors of the Study] ([first release date]). [Title]. Data Collection: [YYYY or YYYY-YYYY]. Version: [X.Y.Z]. Data Package Access Way: [Cuf/Onsite-SUF/Remote-SUF]. Hanover: FDZ-DZHW. Data Curation: [dataCurators]. doi: [Identifier].

The publishers should probably be the dataCurators.

DilekIkiz commented 4 years ago

@UteH and I reviewed some details; here are our suggestions:

[Autorinnen der Studie] ([first release date]). [Titel]. Datenerhebung: [YYYY oder YYYY/YYYY oder YYYY-YYYY]. Version: [X.Y.Z]. Datenpaketzugangsweg: [Download-CUF/On-Site-SUF/Remote-Desktop-SUF/Download-SUF]. Hannover: FDZ-DZHW. Datenkuratierung: [dataCurators]. doi: [Identifikator].

[Authors of the Study] ([first release date]). [Title]. Data Collection: [YYYY or YYYY/YYYY or YYYY-YYYY]. Version: [X.Y.Z]. Data Package Access Way: [Download-CUF/On-Site-SUF/Remote-Desktop-SUF/Download-SUF]. Hanover: FDZ-DZHW. Data Curation: [dataCurators]. doi: [Identifier].

@rbirkelbach -> Authors of the Study, Identifier and Authors of the Study should be the same for de/en, so we can use the same field name? (if so, feel free to edit this comment)

rbirkelbach commented 4 years ago

We need a way to create a non-empty array of dataCurators. As we do not have a dataPackage domain object, we have to attach it to study. We need to migrate old versions. Should we show dataCurators on the study page? (In my opinion: yes) Should we send it to da|ra?

rbirkelbach commented 4 years ago

$study.authors ($study.release.firstDate). $study.title.de/en. Datenerhebung/Data Collection: $survey.fieldPeriod. Version: $release.version. Datenpaketzugangsweg/Data Package Access Way: $dataSet.subDataSet.accessWay. Hannover/Hanover: FDZ-DZHW. Datenkuratierung/Data Curation: $study.dataCurators. doi: $doi
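The template above can be sketched as a small rendering function. This is only an illustration under assumptions: `renderCitationHint`, the flat input object and its property names (`firstReleaseYear`, `fieldPeriod`, and so on) are hypothetical and are not the MDM domain model, and the sample values are made up. The helper `withPeriod` avoids a doubled period when a segment already ends with one, for example after initials.

```javascript
// Hypothetical sketch: renders the citation template for 'de' or 'en'.
// The input shape is an assumption, not the real MDM domain objects.
function withPeriod(text) {
  // Append a period only if the segment does not already end with one.
  return /\.$/.test(text) ? text : text + '.';
}

function renderCitationHint(data, lang) {
  const labels = {
    de: { collection: 'Datenerhebung', accessWay: 'Datenpaketzugangsweg',
      place: 'Hannover', curation: 'Datenkuratierung' },
    en: { collection: 'Data Collection', accessWay: 'Data Package Access Way',
      place: 'Hanover', curation: 'Data Curation' }
  };
  const l = labels[lang];
  if (!l) {
    throw new Error('Unsupported language: ' + lang);
  }
  return [
    withPeriod(data.authors + ' (' + data.firstReleaseYear + ')'),
    withPeriod(data.title[lang]),
    withPeriod(l.collection + ': ' + data.fieldPeriod),
    withPeriod('Version: ' + data.version),
    withPeriod(l.accessWay + ': ' + data.accessWay),
    withPeriod(l.place + ': FDZ-DZHW'),
    withPeriod(l.curation + ': ' + data.dataCurators),
    'doi: ' + data.doi
  ].join(' ');
}

// Made-up sample data, for illustration only.
const sampleCitationData = {
  authors: 'Brown, E.',
  firstReleaseYear: 2020,
  title: { de: 'Beispielstudie 2020', en: 'Example Study 2020' },
  fieldPeriod: '2019-2020',
  version: '1.0.0',
  accessWay: 'Download-SUF',
  dataCurators: 'Soto, C. J.',
  doi: '10.0000/example:1.0.0'
};
console.log(renderCitationHint(sampleCitationData, 'en'));
```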

rbirkelbach commented 4 years ago

@rreitmann please check whether I set the domain objects correctly.

rbirkelbach commented 4 years ago

Single Author: Brown, E.
Two Authors: Soto, C. J., & John, O. P.
Three to Twenty Authors: Nguyen, T., Carnevale, J. J., Scholer, A. A., Miele, D. B., & Fujita, K.
More Than Twenty Authors: Pegion, K., Kirtman, B. P., Becker, E., Collins, D. C., LaJoie, E., Burgman, R., Bell, R., DelSole, R., Min, D., Zhu, Y., Li, W., Sinsky, E., Guan, H., Gottschalck, J., Metzger, E. J., Barton, N. P., Achuthavarier, D., Marshak, J., Koster, R., . . . & Kim, H.
Group Authors: Group Name
Unknown Author: When the work does not have an author, move the title of the work to the beginning of the reference and follow it with the date of publication. Only use "Anonymous" if the work is signed "Anonymous."
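The author rules listed above could be sketched roughly as follows. `joinAuthorsApa` is a hypothetical helper that expects names already formatted as "Last, F. M."; it reproduces the examples in this comment, including the ampersand after the ellipsis for more than twenty authors, and does not cover group or unknown authors.

```javascript
// Hypothetical sketch of the author-list rules from the comment above.
// Input: array of already formatted names such as 'Brown, E.'.
function joinAuthorsApa(names) {
  if (names.length === 0) {
    return '';
  }
  if (names.length === 1) {
    return names[0];
  }
  const last = names[names.length - 1];
  if (names.length <= 20) {
    // Two to twenty authors: comma separated, ', & ' before the last one.
    return names.slice(0, -1).join(', ') + ', & ' + last;
  }
  // More than twenty authors: first 19, an ellipsis, then the last author
  // (the ampersand follows the example given in this thread).
  return names.slice(0, 19).join(', ') + ', . . . & ' + last;
}

console.log(joinAuthorsApa(['Soto, C. J.', 'John, O. P.']));
// prints "Soto, C. J., & John, O. P."
```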

rreitmann commented 4 years ago

Ok, let me sum up:

rreitmann commented 4 years ago

I would prefer mapping our fields to bibJSON http://okfnlabs.org/bibjson/ and use https://citation.js.org/ to present the citation hints in different technical formats and languages. However since we have decided to generate something special which uses different fields than defined in APA we need to implement our own version of APA...

rreitmann commented 4 years ago

We can experiment with citation.js here: https://runkit.com/rreitmann/5e4fad6cadc8f20013197616

Furthermore, there is an upcoming citation standard which is being developed by some bigger players:
https://citationstyles.org/
https://github.com/citation-style-language/schema
http://docs.citationstyles.org/en/1.0.1/specification.html
https://github.com/citation-style-language/schema/blob/master/csl-data.json

rbirkelbach commented 4 years ago

After we have decided on a solution we should update or deprecate this document: \faust\Abt4\FDZ\2_Standards\Zitationsanleitung\Vorlage_Citation_Guideline.dot

rreitmann commented 4 years ago

Datacite or crosscite uses CSL as well: https://datacite.org/citation.html

A simple REST request can give us citations in all citation styles defined by CSL:

curl -LH "Accept: text/x-bibliography; style=bibtex; locale=de-DE" https://doi.org/10.21249/DZHW:gra2009:1.0.1
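The same request could be issued from JavaScript. This is a minimal sketch, assuming a runtime with a global `fetch` (for example Node 18 or later, or a browser); `buildCitationRequest` and `fetchCitation` are hypothetical names. `fetch` follows redirects by default, which corresponds to the `-L` flag of the curl call.

```javascript
// Hypothetical helper: builds URL and headers for DOI content negotiation,
// mirroring the curl call above.
function buildCitationRequest(doi, style, locale) {
  return {
    url: 'https://doi.org/' + doi,
    headers: {
      Accept: 'text/x-bibliography; style=' + style + '; locale=' + locale
    }
  };
}

// Assumes a global fetch is available; not invoked here.
async function fetchCitation(doi, style, locale) {
  const request = buildCitationRequest(doi, style, locale);
  const response = await fetch(request.url, { headers: request.headers });
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response.text();
}

// Usage (performs a network request):
// fetchCitation('10.21249/DZHW:gra2009:1.0.1', 'bibtex', 'de-DE').then(console.log);
```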

Available styles are here: https://github.com/citation-style-language/styles

However, since the metadata registered with the doi comes via da|ra and since therefore we cannot influence the CSL-JSON which is used as input for citeproc-js, we should map our metadata to CSL-JSON ourselves...
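A mapping of our metadata to CSL-JSON might look roughly like the sketch below. The target keys (`type`, `title`, `author` with `family`/`given`, `issued` with `date-parts`, `version`, `publisher`, `publisher-place`, `DOI`) come from the CSL-JSON schema linked above; the input shape and the name `toCslJson` are assumptions for illustration. dataCurators and the access way are left out, because this sketch makes no claim about which CSL variable would suit them.

```javascript
// Hypothetical sketch: maps a simplified study object to a CSL-JSON item.
// The input property names are assumptions, not the real MDM model.
function toCslJson(study) {
  const toCslName = function (person) {
    return { family: person.lastName, given: person.firstName };
  };
  return {
    id: study.doi,
    type: 'dataset',
    title: study.title,
    author: study.authors.map(toCslName),
    issued: { 'date-parts': [[study.firstReleaseYear]] },
    version: study.version,
    publisher: 'FDZ-DZHW',
    'publisher-place': 'Hannover',
    DOI: study.doi
  };
}

// Made-up sample input, for illustration only.
const sampleStudy = {
  doi: '10.0000/example:1.0.0',
  title: 'Example Study 2020',
  authors: [{ firstName: 'Erika', lastName: 'Mustermann' }],
  firstReleaseYear: 2020,
  version: '1.0.0'
};
console.log(JSON.stringify(toCslJson(sampleStudy), null, 2));
```

An item of this shape could then be handed to a CSL processor such as citeproc-js or citation.js to render the different styles.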

rreitmann commented 4 years ago

And here is the Wikipedia link to CSL: https://de.wikipedia.org/wiki/Citation_Style_Language

rreitmann commented 4 years ago

Ok, let me sum up again:

rreitmann commented 4 years ago

Quick migration for dev and local:

db.getCollection('studies').find({}).forEach(function(study) {
    study.dataCurators = study.authors;
    study.version = study.version + 1;
    study.lastModifiedDate = new Date();
    db.getCollection('studies').save(study);
})

rreitmann commented 4 years ago

Remove citationHint from all subDataSets:

db.getCollection('data_sets').find({}).forEach(function(dataSet) {
    if (dataSet.subDataSets) {
        dataSet.subDataSets.forEach(function(subDataSet) {
            delete subDataSet.citationHint;
        });
    }
    dataSet.version = dataSet.version + 1;
    dataSet.lastModifiedDate = new Date();
    db.getCollection('data_sets').save(dataSet);
})

UteH commented 4 years ago

rreitmann commented 4 years ago

Add dataCurators to all studies on prod:

/* global db, printjson */
'use strict';

var dataCuratorsMap = {
    'stu-bst02$': [{
      'firstName': 'Robert',
      'lastName': 'Birkelbach'
    }],
    'stu-cmp2014$': [{
      'firstName': 'Kim',
      'lastName': 'Sommer'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    }],
    'stu-dps2017$': [{
      'firstName': 'Isabel',
      'lastName': 'Steinhardt'
    },{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    }],
    'stu-egr2018$': [{
      'firstName': 'N.',
      'lastName': 'N.'
    }],
    'stu-est2016$': [{
      'firstName': 'Robert',
      'lastName': 'Birkelbach'
    },{
      'firstName': 'Friederike',
      'lastName': 'Schlücker'
    }],
    'stu-gra2005$': [{
      'firstName': 'Florence',
      'lastName': 'Baillet'
    },{
      'firstName': 'Andreas',
      'lastName': 'Franken'
    },{
      'firstName': 'Anne',
      'lastName': 'Weber'
    }],
    'stu-gra2009$': [{
      'firstName': 'Florence',
      'lastName': 'Baillet'
    },{
      'firstName': 'Andreas',
      'lastName': 'Franken'
    },{
      'firstName': 'Anne',
      'lastName': 'Weber'
    }],
    'stu-gsl2008$': [{
      'firstName': 'Andreas',
      'lastName': 'Daniel'
    },{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    },{
      'firstName': 'Björn',
      'lastName': 'Huß'
    },{
      'firstName': 'Percy',
      'lastName': 'Scheller'
    }],
    'stu-gsl2012$': [{
      'firstName': 'Robert',
      'lastName': 'Birkelbach'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    },{
      'firstName': 'Marten',
      'lastName': 'Wallis'
    }],
    'stu-gsl2015$': [{
      'firstName': 'Robert',
      'lastName': 'Birkelbach'
    },{
      'firstName': 'Johanna',
      'lastName': 'Niebuhr'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    },{
      'firstName': 'Marten',
      'lastName': 'Wallis'
    }],
    'stu-hth2017$': [{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    },{
      'firstName': 'Elke',
      'lastName': 'Middendorff'
    }],
    'stu-lib2016$': [{
      'firstName': 'Bernd',
      'lastName': 'Kleimann'
    },{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    },{
      'firstName': 'Malte',
      'lastName': 'Hückstädt'
    }],
    'stu-mog2020$': [{
      'firstName': 'N.',
      'lastName': 'N.'
    }],
    'stu-nac2018$': [{
      'firstName': 'Robert',
      'lastName': 'Birkelbach'
    },{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    },{
      'firstName': 'Anne',
      'lastName': 'Weber'
    }],
    'stu-phd2014$': [{
      'firstName': 'Kerstin',
      'lastName': 'Lange'
    },{
      'firstName': 'Percy',
      'lastName': 'Scheller'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    },{
      'firstName': 'Marten',
      'lastName': 'Wallis'
    }],
    'stu-rub18yo$': [{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    }],
    'stu-scs2016$': [{
      'firstName': 'Andreas',
      'lastName': 'Daniel'
    },{
      'firstName': 'Sahra-Rebecca',
      'lastName': 'Kienast'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    }],
    'stu-ssy17$': [{
      'firstName': 'Elke',
      'lastName': 'Middendorff'
    },{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    }],
    'stu-ssy18$': [{
      'firstName': 'Elke',
      'lastName': 'Middendorff'
    },{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    }],
    'stu-ssy19$': [{
      'firstName': 'Ute',
      'lastName': 'Hoffstätter'
    },{
      'firstName': 'Andreas',
      'lastName': 'Sarcletti'
    }],
    'stu-ssy20$': [{
      'firstName': 'Andreas',
      'lastName': 'Daniel'
    },{
      'firstName': 'Andreas',
      'lastName': 'Sarcletti'
    },{
      'firstName': 'Sandra',
      'lastName': 'Vietgen'
    }],
    'stu-ssy21$': [{
      'firstName': 'Florence',
      'lastName': 'Baillet'
    },{
      'firstName': 'Anne',
      'lastName': 'Weber'
    }],
    'stu-tu18yo$': [{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    }],
    'stu-tuk18yo$': [{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    }],
    'stu-uzk18yo$': [{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    }],
    'stu-win2015$': [{
      'firstName': 'Adisa',
      'lastName': 'Beširović'
    },{
      'firstName': 'Dilek',
      'lastName': 'İkiz-Akıncı'
    },{
      'firstName': 'Thorben',
      'lastName': 'Sembritzki'
    },{
      'firstName': 'Lisa',
      'lastName': 'Thiele'
    }]
  };

Object.keys(dataCuratorsMap).forEach(function(studyId) {
    db.getCollection('studies').find({masterId: studyId}).forEach(
    function(study) {
      study.dataCurators = dataCuratorsMap[studyId];
      study.version = study.version + 1;
      study.lastModifiedDate = new Date();
      db.getCollection('studies').save(study);
    });
  });

@rbirkelbach Please review the list of dataCurators for each study id. In some cases there is neither a project overview nor a method report, so I had to put "N. N."...
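To support the review, the entries that still need attention could be listed with a small check before the migration is run. This is a sketch only: `findStudiesNeedingReview` is a hypothetical helper, and the sample map below is made up. It flags study ids whose curator list is missing, empty, or contains the "N. N." placeholder.

```javascript
// Hypothetical helper: returns the study ids whose dataCurators entry is
// not a non-empty array or still contains the 'N. N.' placeholder.
function findStudiesNeedingReview(curatorsMap) {
  return Object.keys(curatorsMap).filter(function (studyId) {
    const curators = curatorsMap[studyId];
    if (!Array.isArray(curators) || curators.length === 0) {
      return true;
    }
    return curators.some(function (curator) {
      return curator.firstName === 'N.' && curator.lastName === 'N.';
    });
  });
}

// Made-up sample map, for illustration only.
const sampleCuratorsMap = {
  'stu-aaa$': [{ firstName: 'Erika', lastName: 'Mustermann' }],
  'stu-bbb$': [{ firstName: 'N.', lastName: 'N.' }],
  'stu-ccc$': []
};
console.log(findStudiesNeedingReview(sampleCuratorsMap));
```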

rbirkelbach commented 4 years ago

egr2018: We need the full names of the following people: Meng, C., Maurer, S., Mühleck, K., Unger, M., Oelker, S., van der Velden, R., Wessling, K.
phd2014: We need the full name of K. Lange.

svietgen commented 4 years ago

@rbirkelbach K. Lange -> Kerstin Lange. Regarding egr2018: those are not data curators, are they?

rbirkelbach commented 4 years ago

@svietgen thanks, I am not sure about egr. I think @McWallis, @ElkeMi and @UteH should probably know, since they are listed for it in the MDM.

@rreitmann I made the following observation:

Minks, K., Briedis, K., Grotheer, M., Isleib, S., & Netz, N. (2020). DZHW-Absolventenpanel 2005. Datenerhebung: 2006-2019. Version: 1.1.1. Datenpaketzugangsweg: Download-SUF. Hannover: FDZ-DZHW. Datenkuratierung: test, T., & test2, T. H.. doi: 10.17889/DZHW:gra2005:1.1.1

You can see that test and test2 are lowercase, whereas T. and T. H. are capitalized. The raw input data was lowercase. I personally think we should keep the user's spelling for all names, since in some languages parts of names can be lowercase. For example, in Dutch, de Leeuw is a fairly common last name, and I guess the same case could be made for first names, but I cannot come up with an example.
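The proposal to keep the user's spelling could be sketched like this. `formatPersonPreservingCase` is a hypothetical helper, not the generator actually used in the MDM: it derives initials from the first name without changing their case and leaves the last name untouched. Hyphenated first names are not treated specially in this sketch.

```javascript
// Hypothetical sketch: formats 'lastName, I. I.' and keeps the case
// exactly as the user typed it (no capitalization is applied).
function formatPersonPreservingCase(person) {
  const initials = person.firstName
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map(function (part) {
      return part.charAt(0) + '.';
    })
    .join(' ');
  return initials ? person.lastName + ', ' + initials : person.lastName;
}

console.log(formatPersonPreservingCase({ firstName: 'Anna Maria', lastName: 'de Leeuw' }));
// prints "de Leeuw, A. M."
```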

ElkeMi commented 4 years ago

I am apparently listed in the MDM for this project by mistake; I also do not know K. Lange.

UteH commented 4 years ago

Elke had entered metadata for egr2018 but is not to be listed as a data curator. You can enter Hoffstätter, U., Wallis, M. and Niebuhr, J. (or I will change it later, since it has not been released yet).

rbirkelbach commented 4 years ago

I deleted the three of you and inserted N.N. This project isn't released anyway, so it doesn't hurt at the moment.