Tools to aid the harmonization of RADx data elements.
This project consists of two components: tooling for finding potentially harmonizable elements and a webapp for reviewing these findings.
The harmonization process consists of two steps: grouping and analysis.
Setting up the environment:
python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt
Generating groupings:
source_file=data/generated/merged.csv
grouping_file_path=generated/$(date +%F)-keybert-groupings.csv
python3 -m cde_harmonization categorize "$source_file" "$grouping_file_path" -v -f label -f description -c keybert
To get all the options for grouping generation, run python3 cde_harmonization/cli.py categorize -h
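The grouping command reads an existing source file and writes into a dated path under generated/. As a minimal sketch, assuming the CLI does not create the output directory itself (not confirmed here), a small pre-flight helper can check the input and create the directory first. The name prepare_output is hypothetical and not part of the project.

```shell
# Hypothetical helper, not part of cde_harmonization.
# Checks that the source file exists, then creates the output directory.
prepare_output() {
  src=$1
  out=$2
  if [ ! -f "$src" ]; then
    echo "missing source file: $src" >&2
    return 1
  fi
  # Assumption: the CLI expects the output directory to exist already.
  mkdir -p "$(dirname "$out")"
}

# Usage (commented out so the sketch has no side effects when sourced):
# prepare_output "$source_file" "$grouping_file_path" && \
#   python3 -m cde_harmonization categorize "$source_file" "$grouping_file_path" \
#     -v -f label -f description -c keybert
```

Because the helper returns a non-zero status when the source file is missing, chaining it with && keeps the categorize step from running against a bad path.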
Running analysis using the groupings:
score_threshold=0.7
analysis_file_path=generated/$(date +%F)-keybert-analysis-$score_threshold.csv
python3 -m cde_harmonization analyze "$grouping_file_path" "$analysis_file_path" -a use4 -g intersection -f label -f description -s "$score_threshold" -v
To get all the options for analysis, run python3 cde_harmonization/cli.py analyze -h
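Since the analysis step takes the score threshold as a parameter and encodes it in the output file name, one groupings file can be analyzed at several thresholds. The sketch below reuses only the flags shown above. The threshold values 0.6 and 0.8 are illustrative, and the helpers analysis_path and run are hypothetical names introduced here. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Sketch only: the thresholds in the loop are illustrative, not recommended values.
grouping_file_path=generated/$(date +%F)-keybert-groupings.csv

# Hypothetical helper: builds the analysis output path for a given threshold.
analysis_path() {
  printf 'generated/%s-keybert-analysis-%s.csv' "$(date +%F)" "$1"
}

# Hypothetical helper: prints the command unless DRY_RUN=0, in which case it runs it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

for score_threshold in 0.6 0.7 0.8; do
  run python3 -m cde_harmonization analyze "$grouping_file_path" \
    "$(analysis_path "$score_threshold")" \
    -a use4 -g intersection -f label -f description -s "$score_threshold" -v
done
```

Each run writes to its own file, so the results for different thresholds can be compared side by side in the review webapp or in a spreadsheet.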