NIAID-Data-Ecosystem / nde-crawlers

Harvesting infrastructure to collect and standardize dataset and computational tool metadata
Apache License 2.0

[Metadata Improvement]: Standardize Measurement Techniques #127

Open gtsueng opened 3 months ago

gtsueng commented 3 months ago

Issue Name

Standardize Measurement Techniques

Issue Description

Similar to what we've done for healthConditions, species, and infectiousAgent, we need to standardize the ingested values for measurementTechnique. This may be considerably more difficult than standardizing healthConditions, for which there was a hierarchy of ontologies to use.

The ontologies

In the case of measurementTechnique, relevant terms can be found in specific branches from eight different ontologies:

Potential approaches

Next steps

Issue Discussion

The start of this activity was discussed in multiple bi-weekly meetings in Q1 2024.

Please select the type of metadata improvement

Meta URL

No response

Related WBS task

https://github.com/NIAID-Data-Ecosystem/nde-roadmap/issues/13

For internal use only. Assignee, please select the status of this issue

Status Description

No response

Request status check list

gtsueng commented 3 months ago

@DylanWelzel it looks like CHMO merged their terms with other ontologies and deprecated the CHMO-specific identifiers associated with those terms.

From CHMO, what we're interested in are the children of http://purl.obolibrary.org/obo/OBI_0000070 (formerly CHMO_0000000 CHMO_0001133)

It's possible to parse this from the OWL file by recursively collecting the terms that are subClassOf OBI_0000070, but hopefully BioPortal will have an easier way.
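A minimal sketch of the OWL-parsing route, in case it helps. It assumes the ontology is serialized as RDF/XML and that the hierarchy is asserted through plain `rdfs:subClassOf rdf:resource="..."` links; it ignores inferred or equivalence-based relationships. The function names and the `example.org` IRIs in the sample are made up for illustration. The traversal is iterative rather than literally recursive, to avoid hitting Python's recursion limit on deep branches.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, deque

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
ROOT_TERM = "http://purl.obolibrary.org/obo/OBI_0000070"


def build_child_index(owl_xml):
    """Map each parent IRI to the set of IRIs asserted as its direct subclasses."""
    root = ET.fromstring(owl_xml)
    children = defaultdict(set)
    for cls in root.iter(f"{{{OWL}}}Class"):
        child_iri = cls.get(f"{{{RDF}}}about")
        if child_iri is None:
            continue  # anonymous class expression, not a named term
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            parent_iri = sub.get(f"{{{RDF}}}resource")
            if parent_iri is not None:  # skip restrictions nested in subClassOf
                children[parent_iri].add(child_iri)
    return children


def descendants(children, root_iri):
    """Collect every term reachable from root_iri through subClassOf links."""
    seen = set()
    queue = deque([root_iri])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    seen.discard(root_iri)  # guard against a cycle leading back to the root
    return seen


# Tiny made-up ontology fragment: A is a child of OBI_0000070, B is a child of A,
# and C is unrelated. The example.org IRIs are placeholders, not real terms.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{ROOT_TERM}"/>
  <owl:Class rdf:about="http://example.org/A">
    <rdfs:subClassOf rdf:resource="{ROOT_TERM}"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/B">
    <rdfs:subClassOf rdf:resource="http://example.org/A"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/C"/>
</rdf:RDF>"""

branch = descendants(build_child_index(SAMPLE), ROOT_TERM)
print(sorted(branch))
```

For a real ontology file, the same functions should work on the file's contents read into a string, provided the file is RDF/XML; other serializations (OBO, Turtle, OWL functional syntax) would need a different parser.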

gtsueng commented 2 months ago

@DylanWelzel There seems to be significant overlap between SNOMEDCT and NCIT. Let's ensure that we have what we need from NCIT since SNOMEDCT and NCIT have different hierarchical structures.

To do: