cancerDHC / ccdhmodel

CRDC-H model in LinkML, developed by the Center for Cancer Data Harmonization (CCDH)
https://cancerdhc.github.io/ccdhmodel/
BSD 3-Clause "New" or "Revised" License

Add content to 'home' page at /ccdhmodel/v0.2/home/ #28

Closed monicacecilia closed 3 years ago

monicacecilia commented 3 years ago

The page https://cancerdhc.github.io/ccdhmodel/v0.2/home/ is empty. A request has been made to populate it with introductory information about CRDC-H. I would like to add the information shown below so that it is displayed on the 'home' page.

Is the information in the file https://github.com/cancerDHC/ccdhmodel/blob/main/src/docs/home.md automatically generated? Or will manually editing it and creating a PR be sufficient? If so, will the change also persist across other versions?


The Harmonized CRDC Data Model (CRDC-H)

The goal of the Center for Cancer Data Harmonization (CCDH) is to support the harmonization of equivalent data elements in disparate models across NCI’s Cancer Research Data Commons (CRDC) Repositories (nodes) to enable cross-node querying and multi-modal analytics. Individual nodes’ data models have been developed largely independently to fit specific data types and/or use cases. The CCDH is tasked with defining a shared data model for use across the CRDC, leveraging existing standards where possible to support interoperability with external data.

The CCDH Harmonized Data Model (CRDC-H) and its terminological infrastructure are being designed to meet the needs of systems like the Cancer Data Aggregator (CDA) that support integrated search and metadata-based analyses across datasets in the CRDC. We view the CRDC-H as a continuously evolving artifact. To become and remain useful, the CRDC-H must be able to evolve and extend to meet new needs while still serving as a constant semantic anchor for existing content.

The version 1.0 release of the CRDC-H is a point in time along that evolution, covering administrative, biospecimen, and clinical data entities from multiple data commons: PDC, GDC, ICDC, and HTAN. The CRDC-H is natively expressed in the LinkML modeling language, which lets us leverage the existing LinkML tool ecosystem. That ecosystem includes tools for generating a number of useful artifacts: model documentation; representations of the model in CSV and OWL; representations used for validating data, such as JSON Schema and ShEx; and artifacts for interfacing with other technologies, such as GraphQL and JSON-LD. The CRDC-H model repository contains tools for converting the spreadsheets where CRDC-H content is developed into formal LinkML, and holds the resulting LinkML model and its downstream artifacts for public use. By locating the CRDC-H LinkML model here, we can also leverage GitHub tools such as issue tracking and pull requests to provide versioning and maintain a history of changes to the model over time.

gaurav commented 3 years ago

Thanks so much! I was going to copy and paste this text into the file, but if you want to open a PR to edit https://github.com/cancerDHC/ccdhmodel/blob/main/src/docs/home.md, that would be fantastic! That's exactly where this content needs to go, and it will persist to other versions.

If you could come up with some text for https://github.com/cancerDHC/ccdhmodel/blob/main/src/docs/credits.md, that would be great too! I imagine we'd want to cite our grants and provide a link back to our homepage. Does anything else need to go there?

monicacecilia commented 3 years ago

I'm on it! 🚀