Closed by saeid-p 7 years ago
We agree that the number of repositories is currently limited. While we are working to increase DataMed's content, please bear in mind that DataMed is currently a prototype and does not, in its present form, encompass all data resources.
We have also felt the need for a data type ontology, and the bioCADDIE team has initiated work in this area. We would like to clarify that the metadata model built by the bioCADDIE team is based on various use cases (provided by multiple stakeholders) and encompasses existing models in use by the community. Please see https://biocaddie.org/group/working-group/working-group-3-descriptive-metadata-datasets for a detailed description. The model is currently being refined based on feedback from the core technology development team as it is implemented for multiple repositories.
the relationship between SRA, Phenotype, and Gene Expression?
This issue has been addressed by the bioCADDIE team.
In its current version, DataMed appears to have over 20 different data sources representing 65k data sets and 10 different “data types”. This is only a small fraction of the publicly available databases reported in the annual Nucleic Acids Research database issue (see next paragraph).
For example, although the link to data types appears to divide the results into some functional categories, it is not clear that the choice of these categories is based on data definition standards or ontologies. Therefore, a standard vocabulary and ontology-driven data model that fully captures the scientific use cases should precede the development of a Data Discovery Index (DDI).
Moved to #231