Improve/Add automatic detection of variables and tools to build cohorts without configuration files.

legaultmarc / cohort-manager

Utility to manage and explore collections of phenotype data.

I started refactoring to use the cohort_manager.inference module for this kind of automatic detection (will be pushed soonish). The interface for the REPL could look like this:

> import csv my_file.csv delim=',' header=0
# Found 5 columns, verify the following information, then press enter:
[
    {"name": "Name", "variable_type": None},
    {"name": "Age", "variable_type": "continuous"},
    {"name": "Height", "variable_type": "continuous"},
    {"name": "Tall", "variable_type": "discrete", "parent": "Height"},
    {"name": "FavoriteWeather", "variable_type": "factor"},
]

Users could also add the other meta fields (e.g. {"icd10": ...}) before confirming. In this example, the parent relationship between "Tall" and "Height" is also inferred correctly. That will be a lot harder in practice.
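For discussion, here is a rough sketch of the kind of heuristic that could produce the type guesses above. It is not the actual cohort_manager.inference code: the function names (infer_variable_type, infer_columns) and the max_levels threshold are made up for illustration, and it does not attempt the parent inference at all.

```python
import csv
import io


def infer_variable_type(values, max_levels=5):
    """Guess a variable type from raw string values.

    Hypothetical heuristic, not the cohort_manager.inference API.
    Returns "discrete", "continuous", "factor" or None (unknown).
    """
    # Ignore missing values (empty cells).
    observed = [v.strip() for v in values if v.strip() != ""]
    if not observed:
        return None

    levels = set(observed)

    # Only 0/1 values: treat as a discrete (binary) variable.
    if levels <= {"0", "1"}:
        return "discrete"

    # Every value parses as a number: treat as continuous.
    try:
        for v in observed:
            float(v)
        return "continuous"
    except ValueError:
        pass

    # A few text levels that repeat across rows: treat as a factor.
    # max_levels is an arbitrary cutoff chosen for this sketch.
    if len(levels) <= max_levels and len(levels) < len(observed):
        return "factor"

    # Mostly unique text (e.g. names): leave the type for the user to set.
    return None


def infer_columns(text, delimiter=","):
    """Return one {"name", "variable_type"} dict per column of a CSV string."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    columns = list(zip(*reader))  # transpose rows into columns
    return [
        {"name": name, "variable_type": infer_variable_type(column)}
        for name, column in zip(header, columns)
    ]


sample = (
    "Name,Age,Height,Tall,FavoriteWeather\n"
    "Alice,34,1.62,0,sunny\n"
    "Bob,51,1.91,1,rainy\n"
    "Carol,27,1.70,0,sunny\n"
    "Dan,45,1.88,1,snowy\n"
)
inferred = infer_columns(sample)
```

The REPL would then print the inferred list and let the user correct it before anything is imported, which is why unknown columns come back as None instead of a forced guess.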

Improve/Add automatic detection of variables and tools to build cohorts without configuration files. #13