IQSS / dataverse-pm

Project management issue tracker for the Dataverse Project. Note: Related links and documents may not be public.
https://dataverse.org

NIH AIM:3 YR:2 TASK:2 | 2.3.2 | Research and discovery phase for containers and research objects support #15

Closed mreekie closed 9 months ago

mreekie commented 1 year ago

From discussions with Mahmood

The workflow work for the year 1 deliverables has been released in 5.12. What is needed is to add support for additional use cases specific to the biomedical fields.

The terms for the MVP were chosen.

The container work is also open.

Part of this sequence of deliverables:
1.3.1 | 3 | Support software metadata | 5
1.3.2 | 3 | Research and discovery phase for biomedical workflows support | 5
2.3.1 | 3 | Support biomedical workflows | 5
2.3.2 | 3 | Research and discovery phase for containers and research objects support | 5
3.3.1 | 3 | Support containers and research objects | 10
4.3.1 | 3 | Apply container, RO, workflows support to a few NIH-funded projects | 10


mreekie commented 1 year ago

This issue represents a deliverable funded by the NIH. This deliverable supports the NIH Initiative to Improve Access to NIH-funded Data.

Aim 3: Support standards for sharing code, workflows, and containers

The Harvard Dataverse currently supports depositing any type of file, including code/software and documentation files that accompany data, or files within a research replication package. In this project, we plan to facilitate researchers’ efforts to share and publish their entire workflows or containers that describe the main transformations and analysis of the data, following the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. As a result, the research findings will be portable and reproducible (ideally) with a single command. Though the services will be available to any researcher, special attention will be given to NIH-funded work. The Dataverse project has already undertaken the development of CodeMeta metadata (based on the schema.org standard) within the software. The project will assess the use of CodeMeta for research software code and incorporate RO-Crate (for research objects metadata), which allows high flexibility in replication package content. Further, we will explore container metadata and the use of standardized container images for research. Containerization services, including software security scanning, exist for the Harvard Medical School (HMS) O2 high performance computing cluster, are in use by a number of laboratories, and are being developed by BioGrids, an HMS partner that specializes in creating replicable biomedical software packages and containers. As part of this project, we will explore the integration of these containerization services with the Harvard Dataverse repository to support sharing, discovery, and archival of replicable biomedical research.

1.3.1 | 3 | Support software metadata | 5
1.3.2 | 3 | Research and discovery phase for biomedical workflows support | 5
2.3.1 | 3 | Support biomedical workflows | 5
2.3.2 | 3 | Research and discovery phase for containers and research objects support | 5
3.3.1 | 3 | Support containers and research objects | 10
4.3.1 | 3 | Apply container, RO, workflows support to a few NIH-funded projects | 10

mreekie commented 1 year ago

January 2023 monthly update

(2.3.2) We have implemented an intermediate solution that provides the ability to run data analysis in an external container using Binder.

mreekie commented 1 year ago

February 2023 update

mreekie commented 1 year ago

March Update

We recently implemented a solution that provides the ability to run data analysis in an external container using Binder. We improved the code underlying Binder (repo2docker) to allow it to download tabular data from Dataverse in its original format (e.g. Stata rather than an archive-friendly tab-separated values format).
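For illustration only, here is a minimal sketch of how a client might ask Dataverse for a tabular file in its original format. It is not the repo2docker change itself. It assumes the Dataverse Data Access API endpoint `/api/access/datafile/{id}` with the `format=original` query parameter; the function name `datafile_url`, the server URL, and the file ID are placeholders chosen for this example.

```python
from urllib.parse import urlencode


def datafile_url(base_url: str, file_id: int, original: bool = True) -> str:
    """Build a Data Access API URL for a single Dataverse file.

    Assumption: for ingested tabular files, `format=original` requests the
    file as uploaded (e.g. Stata) instead of the archival tab-separated copy.
    """
    url = f"{base_url.rstrip('/')}/api/access/datafile/{file_id}"
    if original:
        url += "?" + urlencode({"format": "original"})
    return url


# Hypothetical server and file ID, used only to show the URL shape.
print(datafile_url("https://demo.dataverse.org", 42))
```

The sketch only builds the URL and makes no network request, so the actual download (and any authentication) would be handled by whatever HTTP client the caller already uses.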

pdurbin commented 1 year ago

February 2023 update

* (2.3.1) Initial meeting was held.

@mreekie any notes from the initial meeting?

cmbz commented 1 year ago

May 2023 Update: An initial meeting was held with interested stakeholders on 2023/05/05 to discuss requirements for supporting research objects beyond datasets and preliminary design approaches including an option for creating a Dataverse database entry for object types. Follow-up discussions will be planned.

cmbz commented 9 months ago

2024/01/03: Closing, work will be tracked here: https://github.com/IQSS/dataverse-pm/issues/146