(1d) Rapid identification of possible harmonization candidates

graybeal commented 2 years ago

(1d) Rapid identification of highly similar data elements as possible harmonization candidates [@frostyfan109]

graybeal commented 2 years ago

This is the second stage of #4: the initial broad classifications are automatically refined to find the most likely harmonizable candidates. Some algorithms could perform both stages at once, and others could define additional interim steps.

The goal of this step is to produce a set of possible harmonization candidates that can be manually curated to find likely harmonizable data elements.

We also want the output of this process to contain enough information to evaluate (later) whether the identified candidates were suitable, and whether further refinement might be possible.
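As a rough illustration of the kind of output described above, here is a minimal sketch that assumes each data element is represented only by an ID and a free-text label. The `Candidate` record, the `find_candidates` helper, the `difflib`-based string similarity, and the 0.8 threshold are all hypothetical choices made for this example, not the method adopted by this project. The point is that each candidate carries its score, the method, and the threshold used, so its suitability can be evaluated later.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    """One possible harmonization candidate pair (hypothetical record layout)."""
    element_a: str
    element_b: str
    score: float      # similarity in [0, 1]
    method: str       # how the score was computed, kept for later evaluation
    threshold: float  # cutoff in force when this pair was selected


def normalize(label: str) -> str:
    """Lowercase, treat underscores as spaces, and collapse whitespace."""
    return " ".join(label.lower().replace("_", " ").split())


def find_candidates(elements: dict[str, str], threshold: float = 0.8) -> list[Candidate]:
    """Compare every pair of labels and keep those at or above the threshold.

    `elements` maps a data element ID to its label. Plain string similarity
    is an assumption here; a real pipeline might compare definitions,
    permissible values, or embeddings instead.
    """
    candidates = []
    for (id_a, label_a), (id_b, label_b) in combinations(sorted(elements.items()), 2):
        score = SequenceMatcher(None, normalize(label_a), normalize(label_b)).ratio()
        if score >= threshold:
            candidates.append(
                Candidate(id_a, id_b, round(score, 3), "difflib-ratio", threshold)
            )
    # Highest-scoring pairs first, so curators see the likeliest matches first.
    return sorted(candidates, key=lambda c: c.score, reverse=True)
```

For example, calling `find_candidates({"a": "Date_of_Birth", "b": "date of birth", "c": "Zip code"})` pairs `a` with `b` and leaves `c` unmatched. The all-pairs comparison is quadratic in the number of elements, which is one reason to apply it only within the broad classes produced by #4.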

graybeal commented 2 years ago

See also https://github.com/frostyfan109/cde-harmonization/issues/2 for more detailed considerations for this activity.

graybeal / radx-data-dictionary-analysis

(1d) Rapid identification of possible harmonization candidates #5