NIAID-Data-Ecosystem / nde-portal

Discovery platform to find NIAID-related datasets and tools
Apache License 2.0

Metadata optimization of Species filters #198

Open EmilyHaag opened 1 year ago

EmilyHaag commented 1 year ago

One user noted that while searching for Histoplasma (a fungal lung pathogen of humans), some of the data currently parsed as Host Species should be Pathogen Species. For example, "ajellomyces capsulatus" and "histoplasma ohiense" are falling under Host Species but are names of Pathogen Species.

gtsueng commented 1 year ago

The cause of this appears to be at least partially due to the nature of the data. For repositories hosting transcriptomics/sequencing data, the data may not necessarily reflect a host-pathogen interaction. As such, the species property in those repositories appears to be set to whatever species was used for the sequencing, which in many cases is an infectious agent (or pathogenic species). In such repositories, there may simply be a single property, "species", reflecting the source of the sequence, and this property was mapped to our schema property, "species."
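
As a rough illustration only, entries that arrive under a single source "species" property could be re-sorted into host and pathogen lists. This is a minimal sketch that assumes a curated list of pathogen names is available. The names `KNOWN_PATHOGENS` and `split_species` are hypothetical and not part of the NDE codebase, and a real fix would more likely rely on a taxonomy resource than a hard-coded set.

```python
# Hypothetical curated set of names known to be infectious agents.
# Assumption for illustration; seeded with the two names from this issue.
KNOWN_PATHOGENS = {"ajellomyces capsulatus", "histoplasma ohiense"}


def split_species(names, known_pathogens=KNOWN_PATHOGENS):
    """Partition source species names into (host, pathogen) lists.

    Matching is case-insensitive and ignores surrounding whitespace;
    the original spelling of each name is preserved in the output.
    """
    host, pathogen = [], []
    for name in names:
        key = name.strip().lower()
        if key in known_pathogens:
            pathogen.append(name)
        else:
            host.append(name)
    return host, pathogen


# Example: one record whose single "species" field mixes both kinds.
host, pathogen = split_species(["Homo sapiens", "Histoplasma ohiense"])
```

Names not found in the lookup stay under host here, which mirrors the current behavior; whether unknown names should instead be flagged for review is a separate design question.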

gtsueng commented 1 year ago

This has been determined to be part of the metadata standardization/augmentation work scoped in the RFP/WBS, to be implemented later.