GrowthViz was developed in partnership between the Health FFRDC and CDC, with feedback from leading health researchers, to support post-processing and data visualization of growthcleanr output.
To use the latest code for this project, run GrowthViz.ipynb.
The notebook requires Python 3, Jupyter Notebook, Pandas, Matplotlib, and Seaborn. Some widgets also require the Qgrid extension to be enabled in Jupyter. The .csv files in the repository are the source data required to run the notebook. To use custom data, replace these files with your own files in the same format. For more details, see the Simple Install instructions below.
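As a quick sanity check before opening the notebook, the requirements listed above can be verified from Python. This is a minimal sketch, not part of GrowthViz itself: the import names (notebook, pandas, matplotlib, seaborn, qgrid) are assumed to match the packages named above, and the helper missing_packages is hypothetical.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the requirements listed above.
REQUIRED = ["notebook", "pandas", "matplotlib", "seaborn"]
OPTIONAL = ["qgrid"]  # only needed for some widgets


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_packages(names)
        status = "all present" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} packages: {status}")
```

This only confirms that the packages are installed. It does not check whether the Qgrid extension is enabled in Jupyter.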
The objective of this tool is to allow users to conduct post-processing and data visualization of growthcleanr output. growthcleanr is an automated method for cleaning longitudinal pediatric growth data from EHRs. It is available as open source software. GrowthViz is to be used after a data set has been run through growthcleanr.
As stated in "Automated identification of implausible values in growth data from pediatric electronic health records":
"In pediatrics, evaluation of growth is fundamental, and many pediatric research studies include some aspect of growth as an outcome or other variable. The clinical growth measurements obtained in day-to-day care are susceptible to error beyond the imprecision inherent in any anthropometric measurement. Some errors result from minor problems with measurement technique. While these errors can be important in certain analyses, they are often small and generally impossible to detect after measurements are recorded. Larger measurement technique errors can result in values that are biologically implausible and can cause problems for many analyses."
GrowthViz uses data sets that were produced by growthcleanr. The tool expects the output to be in a CSV format that is described later on in the notebook.
GrowthViz is a Jupyter Notebook. It provides an environment that includes graphical user interfaces as well as interactive software development to explore data. To achieve this, GrowthViz uses several software languages and packages.
Anaconda is an all-in-one package installer for setting up dependencies needed to run and view GrowthViz.
Search Packages text box in the top center of the screen. If it shows up with a green checkbox, proceed to Step 6.
GrowthViz-master folder you downloaded and unzipped in Step 2 (likely found in your Downloads/ folder). Click on GrowthViz.ipynb to run the Python notebook.
By default, when you reach Step 6 of the Simple Install instructions above, the notebook will use sample data loaded from the .csv files located in the GrowthViz-master project.
To ensure that all of the necessary example files are present, run the check_setup.py script.
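For readers who want to see what such a check might involve, here is a minimal sketch of a file-presence check. It is not the repository's check_setup.py, whose actual behavior is not described here. The helper find_missing is hypothetical, and the list of expected files contains only the two file names mentioned in this README.

```python
from pathlib import Path

# Only these two names appear in this README; the real script may check more.
EXPECTED_FILES = ["GrowthViz.ipynb", "sample-data-cleaned.csv"]


def find_missing(project_dir, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist under project_dir."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run from inside the project folder, this prints either the missing file names or a confirmation that everything in the list was found.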
Docker lets you download GrowthViz and its dependencies together in a single environment. To use this method, download and install Docker Desktop.
docker run -it -p 8888:8888 -v [data-path]/growthviz-data:/usr/src/app/growthviz-data mitre/growthviz
Replace [data-path] with a directory path you choose on your local computer. For instance, choosing ~/Documents means that a folder named /growthviz-data will be created in your Documents folder. To use your own data in GrowthViz, drop your CSV files into this /growthviz-data folder.
http://X.X.X.X:8888/?token=XXX...
Click on GrowthViz.ipynb. This will open a new window with the GrowthViz Jupyter Notebook.
Click the Run button to step through the various blocks (cells) of the document, or click the 'Cell' dropdown in the menu bar and select 'Run all' to run the entire notebook at once. Note that this runs with the default sample data. Step 4 explains how to use your own data.
Place [name-of-your-file.csv] into the /growthviz-data folder you created in step 1.
Replace
cleaned_obs = pd.read_csv("sample-data-cleaned.csv")
with
cleaned_obs = pd.read_csv("growthviz-data/[name-of-your-file.csv]")
/growthviz-data folder.
When you run all cells (see Step 8 above), Out[#]: boxes will appear in the notebook below the In[#]: code cells. These outputs are the results of running the code blocks on the data. The Out[#]: blocks will often be interactive charts and graphs used to explore the growthcleanr data. Descriptions of each Out[#]: block can be found in the text sections above the In[#]: blocks.
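The data swap described in the steps above (reading a file from the growthviz-data folder instead of the sample file) can also be done without editing the path by hand each time. The following is a minimal sketch under stated assumptions: resolve_data_path is a hypothetical helper, not part of GrowthViz, and it assumes the notebook is run from a directory that contains both sample-data-cleaned.csv and the growthviz-data folder.

```python
from pathlib import Path


def resolve_data_path(custom_name=None, data_dir="growthviz-data",
                      sample_file="sample-data-cleaned.csv"):
    """Return the custom CSV path if that file exists, else the sample path."""
    if custom_name:
        candidate = Path(data_dir) / custom_name
        if candidate.is_file():
            return candidate
    # No custom file requested, or it was not found: use the sample data.
    return Path(sample_file)
```

In the notebook, the loading line could then read cleaned_obs = pd.read_csv(resolve_data_path("my-file.csv")), where my-file.csv is a placeholder for your own file name. If the file is not in the folder, the sample data is loaded instead, so the remaining cells still run.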
Copyright 2020 The MITRE Corporation.
Approved for Public Release; Distribution Unlimited. Case Number 19-2008