psychoinformatics-de / studyforrest-data

DataLad superdataset of all studyforrest.org project dataset components
https://studyforrest.org

Host parallel-processing workflow results #17

Closed adswa closed 3 years ago

adswa commented 3 years ago

We have preprocessed structural studyforrest data with fmriprep using the pipeline we want to describe in an upcoming manuscript. It would be good to host this as a tutorial (gladly with annexed result data via datapub) here in psyinf.

adswa commented 3 years ago

I will create two datasets: one for the collection of processing workflow scripts, and one for the processed data. I'll link the processed data as a subdataset, and will publish it to Gin, too.

adswa commented 3 years ago

Data are here: https://github.com/psychoinformatics-de/processing-workflow-tutorial
Workflow tutorial is here: https://github.com/psychoinformatics-de/processing-workflow

adswa commented 3 years ago

data path on juseless: /data/group/psyinf/processing-workflow-tutorial

adswa commented 3 years ago

TODO:

adswa commented 3 years ago

@christian-monch can you show me how to use your helper to generate a datacite.yml?

adswa commented 3 years ago

I looked at the datacite.yml again: we can enter the DOI of a publication. Since that would be useful, let's not assign a DOI to this dataset right now, but wait until we have a DOI for the manuscript preprint.

For the record, here is how the datacite.yml looks so far:

# Metadata for DOI registration according to DataCite Metadata Schema 4.1.
# For detailed schema description see https://doi.org/10.5438/0014

## Required fields

# The main researchers involved. Include digital identifier (e.g., ORCID)
# if possible, including the prefix to indicate its type.
authors:
  -
    firstname: "Adina"
    lastname: "Wagner"
    affiliation: "Research Center Juelich"
    id: "ORCID:0000-0003-2917-3450"
  -
    firstname: "Małgorzata"
    lastname: "Wierzba"
    affiliation: "Nencki Institute of Experimental Biology"
    id: "ORCID:0000-0003-0820-2662"

# A title to describe the published resource.
title: "Processing Workflow Tutorial"

# Additional information about the resource, e.g., a brief abstract.
description: |
  A tutorial for decentralized, reproducible processing with DataLad,
  based on fMRIPrep and structural studyforrest data.

# List of keywords the resource should be associated with.
# Give as many keywords as possible, to make the resource findable.
keywords:
  - Neuroscience
  - RDM
  - HCP
  - DataLad
  - Version Control
  - Reproducibility

# License information for this resource. Please provide the license name and/or a link to the license.
# Please add also a corresponding LICENSE file to the repository.
license:
  name: "Creative Commons Attribution 4.0 International (CC BY 4.0)"
  url: "https://creativecommons.org/licenses/by/4.0/"

## Optional Fields

# Funding information for this resource.
# Separate funder name and grant number by comma.
funding:
  - "H2020-EU.3.1.5.3, 826421DFG"
  - "NSF, 1912266"
  - "NSF, 1429999"
  - "BMBF, 01GQ1905"
  - "BMBF, 01GQ1411"

# Related publications. reftype might be: IsSupplementTo, IsDescribedBy, IsReferencedBy.
# Please provide digital identifier (e.g., DOI) if possible.
# Add a prefix to the ID, separated by a colon, to indicate the source.
# Supported sources are: DOI, arXiv, PMID
# In the citation field, please provide the full reference, including title, authors, journal etc.
references:
  -
    id: "doi:10.xxx/zzzz"
    reftype: "IsSupplementTo"
    citation: "Citation1"
  -
    id: "arxiv:mmmm.nnnn"
    reftype: "IsSupplementTo"
    citation: "Citation2"
  -
    id: "pmid:nnnnnnnn"
    reftype: "IsReferencedBy"
    citation: "Citation3"

# Resource type. Default is Dataset, other possible values are Software, DataPaper, Image, Text.
resourcetype: Dataset

# Do not edit or remove the following line
templateversion: 1.2
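
The template comments above describe two string conventions: funding entries separate funder name and grant number with a comma, and reference IDs carry a source prefix (DOI, arXiv, or PMID) separated by a colon. A minimal Python sketch of how such entries could be checked before submission follows. The function names `split_funding` and `split_reference_id` are hypothetical, and whether the DOI registration service parses the fields in exactly this way is an assumption, not something stated in this thread.

```python
# Source prefixes listed in the template comment (compared case-insensitively).
SUPPORTED_SOURCES = {"doi", "arxiv", "pmid"}


def split_funding(entry):
    """Split a 'Funder, grant' string at the first comma."""
    funder, sep, grant = entry.partition(",")
    if not sep:
        raise ValueError(f"no comma in funding entry: {entry!r}")
    return funder.strip(), grant.strip()


def split_reference_id(ref_id):
    """Split a 'source:identifier' string at the first colon."""
    source, sep, identifier = ref_id.partition(":")
    if not sep or source.lower() not in SUPPORTED_SOURCES:
        raise ValueError(f"missing or unsupported source prefix: {ref_id!r}")
    return source.lower(), identifier


if __name__ == "__main__":
    # Entries copied from the datacite.yml above.
    print(split_funding("BMBF, 01GQ1905"))
    print(split_reference_id("arxiv:mmmm.nnnn"))
```

Splitting only at the first comma or colon keeps any later commas or colons inside the grant number or identifier intact. The sketch checks format only; it does not detect that the reference entries above are still template placeholders.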