Open EmilyHaag opened 1 year ago
The cause of this appears to be at least partially due to the nature of the data. For repositories hosting transcriptomics/sequencing data, the data may not necessarily reflect a host-pathogen interaction. As such, the species property in those repositories appears to be set to whatever species was sequenced, which in many cases is an infectious agent (or pathogenic species). Such repositories may simply have a single "species" property to reflect the source of the sequence, and this property was mapped to our schema property, "species."
This has been determined to be part of the metadata standardization/augmentation work scoped in the RFP/WBS to be implemented later.
One user noted that while searching for Histoplasma (a fungal lung pathogen of humans), some of the data currently parsed as Host Species should be Pathogen Species. For example, "ajellomyces capsulatus" and "histoplasma ohiense" are falling under Host Species but are names of pathogen species.
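
To illustrate the kind of correction involved, here is a minimal sketch, not the planned implementation. It assumes a record carries a single source "species" list and re-routes values against a lookup of known pathogen names. The names `KNOWN_PATHOGENS`, `split_species`, `host_species`, and `pathogen_species` are hypothetical and not taken from our schema or parsers; a real fix would presumably consult a curated taxonomy rather than a hard-coded set.

```python
# Hypothetical lookup of names known to be pathogens (illustration only).
KNOWN_PATHOGENS = {"ajellomyces capsulatus", "histoplasma ohiense"}


def split_species(record):
    """Route each value of a single source "species" field to host or pathogen."""
    hosts, pathogens = [], []
    for name in record.get("species", []):
        # Compare case-insensitively, but keep the original spelling in the output.
        key = name.strip().lower()
        if key in KNOWN_PATHOGENS:
            pathogens.append(name)
        else:
            hosts.append(name)
    # Output keys are assumed names, not actual schema properties.
    return {"host_species": hosts, "pathogen_species": pathogens}


record = {"species": ["Homo sapiens", "Ajellomyces capsulatus"]}
print(split_species(record))
```

Anything not found in the lookup stays under host species, so this sketch only moves names it can positively identify as pathogens.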