biocompute-objects / BCO_Documentation

Repository for documentation to support the IEEE 2791-2020 standard. Please see our home page for communications/publications:
http://biocomputeobject.org/
BSD 3-Clause "New" or "Revised" License

Knowledgebase Recommendation Text #81

Closed HadleyKing closed 2 years ago

HadleyKing commented 4 years ago

Include text about how knowledgebase can use BioCompute in the documentation.

Rahi13 commented 4 years ago

Using BioCompute's pre-defined fields and standards, knowledgebases can generate a BioCompute object to document the metadata, quality-control and integration pipelines developed for different workflows. BCOs can be generated via a user-friendly instance of a BCO editor and can be maintained and shared through versioned stable IDs stored under a single domain of that knowledgebase. BCOs not only provide complete transparency to the knowledgebase's data submitters (authors, curators, other databases, etc.), collaborators and users, but also provide an efficient mechanism to reproduce the complete workflow through the information stored in different domains (such as description, execution, I/O, error, etc.) in machine- and human-readable formats.

Rahi13 commented 4 years ago

@HadleyKing Let me know if you need more text.

HadleyKing commented 4 years ago

@Rahi13 this should be a good start. I want to leave it open for now, though, so others can comment if they have ideas.

Rahi13 commented 4 years ago

@HadleyKing You can also add a link to one of the BCOs generated by GlyGen as an example: https://data.glygen.org/DSBCO_000038/v-1.4.5

kee007ney commented 2 years ago

Jonathon to write a markdown for using knowledgebase BCOs in this repo.

HadleyKing commented 2 years ago

Use the following as a guide: https://github.com/biocompute-objects/extension_domain/tree/main/dataset

kee007ney commented 2 years ago

Rahi's text is fantastic.

Adding a few minor tweaks and building on it:

Using BioCompute's pre-defined fields and standards, knowledgebases can generate a BioCompute Object (BCO) to document the metadata, quality-control, and integration pipelines developed for different workflows. BCOs can be used to document each release. Because the data in a BCO is structured, it is straightforward to identify changes between releases (including changes to the curation/data processing pipeline, attribution to curators, or datasets processed), or to revert to a previous release.
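As a rough illustration of comparing releases, the sketch below reports which top-level domains differ between two BCOs loaded as Python dictionaries. The domain names follow IEEE 2791-2020, but the helper `changed_domains`, the `example.org` IDs, and the trimmed contents of both objects are hypothetical and only meant to show the idea.

```python
_MISSING = object()  # sentinel so an absent domain never equals a present one


def changed_domains(old_bco, new_bco):
    """Return the sorted top-level keys whose values differ between two BCOs."""
    all_keys = set(old_bco) | set(new_bco)
    return sorted(
        key
        for key in all_keys
        if old_bco.get(key, _MISSING) != new_bco.get(key, _MISSING)
    )


# Hypothetical, heavily trimmed BCOs for two consecutive releases.
release_1 = {
    "object_id": "https://example.org/BCO_000001/1.0",
    "provenance_domain": {"name": "Example dataset", "version": "1.0"},
    "description_domain": {
        "pipeline_steps": [{"step_number": 1, "name": "download"}]
    },
    "io_domain": {"input_subdomain": [], "output_subdomain": []},
}
release_2 = {
    "object_id": "https://example.org/BCO_000001/1.1",
    "provenance_domain": {"name": "Example dataset", "version": "1.1"},
    "description_domain": {
        "pipeline_steps": [
            {"step_number": 1, "name": "download"},
            {"step_number": 2, "name": "quality control"},
        ]
    },
    "io_domain": {"input_subdomain": [], "output_subdomain": []},
}

print(changed_domains(release_1, release_2))
# ['description_domain', 'object_id', 'provenance_domain']
```

Here the unchanged `io_domain` is not reported, while the added pipeline step shows up as a change to `description_domain`. A real comparison would probably descend into each domain rather than stop at the top level.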

BCOs can be generated via a user-friendly instance of a BCO editor and can be maintained and shared through versioned, stable IDs stored under a single domain of that knowledgebase. BCOs not only provide complete transparency to the knowledgebase's data submitters (authors, curators, other databases, etc.), collaborators, and users, but also provide an efficient mechanism to reproduce the complete workflow through the information stored in different domains (such as description, execution, io, error, etc.) in machine- and human-readable formats.

The most common way of adapting BCOs for use in knowledgebases is by leveraging the Extension Domain. In the dataset example linked above, the Extension Domain is used for calling fields based on column headers. Note that the Extension Domain identifies its own schema, which defines the column headers and identifies them as required where appropriate. Because the JSON format of a BCO is both human- and machine-readable (and can be further adapted for any manner of display or editing through a user interface), BCOs are amenable to either manual or automatic curation processes, such as the curation process that populates those fields in that example.
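To make the role of the extension schema concrete, here is a minimal sketch of checking one Extension Domain entry against the column headers its schema marks as required. The `extension_schema` key comes from IEEE 2791-2020; everything else is an assumption for illustration, including the `dataset_extension` key, the column names, the `example.org` schema URL, and the helper `missing_required_fields`. It does not reproduce the actual dataset extension schema.

```python
def missing_required_fields(entry, schema, content_key):
    """List required column headers that are absent from one extension entry."""
    content = entry.get(content_key, {})
    return [name for name in schema.get("required", []) if name not in content]


# Hypothetical schema: column headers are properties, some marked as required.
dataset_schema = {
    "$id": "https://example.org/schemas/dataset_extension.json",
    "type": "object",
    "properties": {
        "dataset_name": {"type": "string"},
        "release_date": {"type": "string"},
        "curator": {"type": "string"},
        "notes": {"type": "string"},
    },
    "required": ["dataset_name", "release_date", "curator"],
}

# Hypothetical Extension Domain entry that points at the schema above.
extension_entry = {
    "extension_schema": "https://example.org/schemas/dataset_extension.json",
    "dataset_extension": {
        "dataset_name": "Example glycoprotein dataset",
        "curator": "A. Curator",
    },
}

missing = missing_required_fields(extension_entry, dataset_schema, "dataset_extension")
print(missing)
# ['release_date']
```

This only checks for the presence of required fields. A curation pipeline would more likely hand the entry and schema to a full JSON Schema validator so that types and nested structure are checked as well.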

jpat1546 commented 2 years ago

reviewed. approved.

jpat1546 commented 2 years ago

@HadleyKing we're set to push this.

HadleyKing commented 2 years ago

Need to add to the FAQ pages for now