To expand the MC2 data model, data type-specific metadata models should be assembled and reviewed by the community via the Request for Comments (RFC) process. One strategy for implementing this is as follows:
identify priority metadata models (by assay or data type and processing level) and the key information we want to provide for the community (common pipelines/processing, common repos, common access requirements, etc.). Any question posed about data reuse/sharing should be considered as content to ask about during the RFC.
assemble a schedule of RFC dates
hold an introductory webinar on the RFC process for the community, to discuss the guidelines, expected outcomes, rules of engagement, use case requirements, etc.
For each metadata model:
select key community members to email about the RFC
assemble version 0, based on existing models/standards (HTAN, NF, AD, CRDC, etc.)
have a sheet with the preliminary model and the RFC document available for contributors
include the storage repo (type-specific if possible)
record common processing methods/pipelines
for QC metrics, indicate the pipeline or tool used and the source doc or output
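The per-model checklist above could be tracked as a small record. The following is a minimal sketch only, not an agreed MC2 schema: the class names, field names, and the `missing_items` helper are all hypothetical, and the values in the usage example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QCMetric:
    """One QC metric and where it came from (hypothetical field names)."""
    name: str
    tool: str    # pipeline or tool used to produce the metric
    source: str  # source document or output that reports it


@dataclass
class DraftMetadataModel:
    """Version 0 of a data type-specific model, as prepared for an RFC."""
    data_type: str
    processing_level: Optional[int] = None
    based_on: List[str] = field(default_factory=list)          # existing models/standards
    storage_repo: Optional[str] = None                         # type-specific if possible
    common_pipelines: List[str] = field(default_factory=list)  # common processing methods
    qc_metrics: List[QCMetric] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Return the checklist items that are still empty."""
        missing = []
        if not self.based_on:
            missing.append("existing models/standards")
        if self.storage_repo is None:
            missing.append("storage repo")
        if not self.common_pipelines:
            missing.append("common processing methods/pipelines")
        if not self.qc_metrics:
            missing.append("QC metrics")
        return missing


# Placeholder values for illustration only.
draft = DraftMetadataModel(
    data_type="bulk RNA-seq",
    processing_level=1,
    based_on=["HTAN"],
    qc_metrics=[
        QCMetric(name="example metric", tool="example-pipeline", source="example output file")
    ],
)
print(draft.missing_items())
```

Running the helper before an RFC opens would show which parts of the version 0 draft still need input from contributors.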
Some priority data types:
biospecimen
tools
bulk RNA-seq levels 1-4
scRNA-seq levels 1-4
10x Visium (or general spatial, if we can decide on how to put that together)
imaging/microscopy (recommend stitched images and then masks and analysis, avoiding the giant raw images where possible, unless required for secondary use)