DouglasPatton / vbflow


UI Dashboard Panels #8

Open mikecyterski opened 2 years ago

mikecyterski commented 2 years ago

Initial Visualization Design:

DATA EXPLORATION 1

Plots for All Features (pop-up window, accessed from some central location - maybe by right-clicking at the top left of the data table)

Missing data by column and row [image]

Missing data by column [image]

Missing data by row [image]

Correlation matrix across all features [image]

Dendrogram of feature clustering [image]
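As a rough illustration of the numbers behind the missing-data and correlation plots listed above, here is a minimal standard-library sketch. It assumes the data table is available as a list of dicts with `None` marking a missing cell; the column names (`temp`, `flow`, `turbidity`) and the helper names are hypothetical and not taken from vbflow.

```python
from math import sqrt


def missing_counts(rows, columns):
    """Count missing (None) cells per column and per row."""
    by_column = {c: sum(1 for r in rows if r.get(c) is None) for c in columns}
    by_row = [sum(1 for c in columns if r.get(c) is None) for r in rows]
    return by_column, by_row


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant column has no defined correlation
    return sxy / sqrt(sxx * syy)


def correlation_matrix(rows, columns):
    """Pairwise correlations between all columns, as a nested dict."""
    cols = {c: [r.get(c) for r in rows] for c in columns}
    return {a: {b: pearson(cols[a], cols[b]) for b in columns} for a in columns}


# Hypothetical example table; None marks a missing observation.
rows = [
    {"temp": 1.0, "flow": 2.0, "turbidity": None},
    {"temp": 2.0, "flow": 4.0, "turbidity": 3.0},
    {"temp": 3.0, "flow": None, "turbidity": 2.0},
    {"temp": 4.0, "flow": 8.0, "turbidity": 1.0},
]
columns = ["temp", "flow", "turbidity"]

by_column, by_row = missing_counts(rows, columns)
corr = correlation_matrix(rows, columns)
```

The per-column and per-row counts could feed the two bar-style missing-data plots, and the nested dict could feed a heatmap. Pairwise deletion of missing values is only one possible choice here.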

DATA EXPLORATION 2

Plots for Individual Features/Response (pop-up window accessed by right-click of the column header?)

Time series of the active column [image]

Density plot: the number of observations within each binned range of column values [image]

Scatterplot of the active column vs. the response (with correlation included) [image]

Box-and-whisker plot of the response values within each level of the active categorical column [image]

Box-and-whisker plot of the values within the active continuous column [image]

Pie chart of the number of observations per level of the active categorical column [image]
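The per-feature plots above mostly reduce to a few small summaries: bin counts for the density plot, a five-number summary per level for the box-and-whisker plots, and level counts for the pie chart. The sketch below uses only the standard library; the function names, the equal-width binning, and the "inclusive" quartile method are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import quantiles


def bin_counts(values, n_bins):
    """Count observations in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant columns
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # max value goes in last bin
        counts[idx] += 1
    return counts


def five_number_summary(values):
    """Min, quartiles, and max: the numbers a box-and-whisker plot draws."""
    ordered = sorted(values)
    if len(ordered) < 2:
        q1 = med = q3 = ordered[0]
    else:
        q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": med, "q3": q3, "max": ordered[-1]}


def response_by_level(levels, response):
    """Five-number summary of the response within each categorical level."""
    groups = defaultdict(list)
    for level, y in zip(levels, response):
        groups[level].append(y)
    return {level: five_number_summary(ys) for level, ys in groups.items()}


# Hypothetical active columns and response.
continuous = [1, 2, 2, 3, 9, 10]
levels = ["a", "a", "a", "a", "a", "b", "b", "b"]
response = [1, 2, 3, 4, 5, 10, 20, 30]

density = bin_counts(continuous, n_bins=3)
box_stats = response_by_level(levels, response)
pie_counts = Counter(levels)  # observations per level, for the pie chart
```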

TRAINING DASHBOARD: 1x2 (row x column) panel layout

(For the plots that follow, the user can toggle showing/hiding results for any of the estimators)

Fitted values versus actual response values across cv-folds and reps DEFAULT [image]

Fitted values and actual values versus row number (after ordering by actual value) across cv-folds and reps OPTIONAL [image]

Fitted values and actual values versus original row number (i.e., a time series plot) across cv-folds and reps OPTIONAL. Same as the previous plot, except that the x-axis is the original row number rather than the row number ordered by response value.

Model scores/metrics for each estimator as a boxplot DEFAULT [image]

Model scores/metrics for each estimator as a lineplot OPTIONAL [image]

For linear models, a table of feature coefficients and their significance OPTIONAL [image]

For machine learning models, a table/plot of feature influence values OPTIONAL [image]

For machine learning models, a partial dependence plot (PDP), showing how the response variable varies across the range of the chosen feature OPTIONAL [image]
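To make the partial dependence idea concrete, here is a minimal sketch of how a PDP curve is usually computed: force the chosen feature to each grid value for every row, predict, and average. The `toy_model` function is a hypothetical stand-in for a fitted estimator, and the feature names `x1` and `x2` are invented for the example.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction with `feature` forced to each grid value in turn."""
    curve = []
    for value in grid:
        # Copy every row, overriding only the chosen feature.
        modified = [{**row, feature: value} for row in rows]
        preds = [predict(r) for r in modified]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted estimator's predict method."""
    return 2.0 * row["x1"] + row["x2"] ** 2


rows = [
    {"x1": 0.0, "x2": 1.0},
    {"x1": 1.0, "x2": 2.0},
    {"x1": 2.0, "x2": 3.0},
]
grid = [0.0, 1.0, 2.0]

pdp_x1 = partial_dependence(toy_model, rows, "x1", grid)
```

Because the toy model is linear in `x1` with slope 2, the resulting curve rises by 2 per unit of `x1`, which is the shape the PDP panel would display for that feature.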

PREDICTION DASHBOARD: 1x2 panel layout

For the chosen model, a new prediction, its prediction interval, and the original training data results across cv-folds and reps: ACTUAL y is UNKNOWN [image]

For the chosen model, a new prediction, its prediction interval, and the original training data results across cv-folds and reps: ACTUAL y is KNOWN [image]
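The issue does not say how the prediction interval would be computed. One simple possibility, shown here purely as an assumption, is to shift the point prediction by empirical quantiles of the out-of-fold residuals (actual minus fitted) collected across cv-folds and reps. All names and numbers below are hypothetical.

```python
def empirical_quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def prediction_interval(point, residuals, coverage=0.9):
    """Interval from residual quantiles (an assumed method, not vbflow's)."""
    ordered = sorted(residuals)
    alpha = (1 - coverage) / 2
    lower = point + empirical_quantile(ordered, alpha)
    upper = point + empirical_quantile(ordered, 1 - alpha)
    return lower, upper


# Hypothetical out-of-fold residuals (actual - fitted) from training.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
new_prediction = 10.0

lower, upper = prediction_interval(new_prediction, residuals, coverage=0.5)

# When the actual y is known, the panel can also report whether it was covered.
actual_y = 10.5
covered = lower <= actual_y <= upper
```

In the "ACTUAL y is UNKNOWN" case only the point and the interval would be drawn. In the "ACTUAL y is KNOWN" case the actual value can be overlaid and checked against the interval as in the last line.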

mikecyterski commented 2 years ago

The above plots are relevant for regression problems (continuous response variable). A different set of plots (and analytical techniques) must be generated for classification problems (categorical response variable). This type of analysis would seem to be a fairly important addition to the WebVB toolbox.