At this point, there are good utilities for exploring the contents of individual variables, but there is no way to explore the set of variables as a whole. For example, it is not possible to identify all the variables with a missing rate greater than x using only the REPL. It would be nice to have a syntax for this kind of analysis. Here are some examples of the questions we might want to answer:
[ ] Which variables have more than X% missing values?
[ ] Build a contingency table of Angina vs. gender.
[ ] Identify all the discrete variables.
[ ] Identify all the continuous variables whose Q-Q plot R-squared is below 0.9.
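
To make these questions concrete, here is a minimal sketch of how they could be answered in plain Python, using only the standard library. It is an illustration, not a proposed syntax. Everything in it is an assumption: the dataset layout (a dict mapping each variable name to a list of values, with `None` for missing), the toy data, the helper names (`missing_rate`, `variables_where`, `is_discrete`, `contingency`, `qq_r_squared`), the 10% threshold standing in for X%, and the rule that a variable is discrete when none of its values is a float. The Q-Q R-squared is taken against a normal distribution, which the list above does not specify.

```python
from collections import Counter
from statistics import NormalDist, mean

# Hypothetical toy dataset: variable name -> list of values, None = missing.
data = {
    "angina": ["yes", "no", "no", "yes", None, "no"],
    "gender": ["f", "m", "m", "f", "m", "f"],
    "age": [54.0, 61.0, None, 47.0, 58.0, 66.0],
    "income": [1.0, 1.2, 1.1, 1.3, 1.2, 40.0],
}


def missing_rate(values):
    """Fraction of entries that are None."""
    return sum(v is None for v in values) / len(values)


def variables_where(data, predicate):
    """Names of the variables whose value list satisfies the predicate."""
    return [name for name, values in data.items() if predicate(values)]


def is_discrete(values):
    """Assumed rule: a variable is discrete if no observed value is a float."""
    return not any(isinstance(v, float) for v in values if v is not None)


def contingency(data, row, col):
    """Count co-occurrences of two variables, skipping rows with a missing value."""
    pairs = zip(data[row], data[col])
    return Counter((r, c) for r, c in pairs if r is not None and c is not None)


def qq_r_squared(values):
    """Squared correlation between the sorted data and normal quantiles."""
    xs = sorted(v for v in values if v is not None)
    n = len(xs)
    # Theoretical quantiles at the plotting positions (i + 0.5) / n.
    qs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, mq = mean(xs), mean(qs)
    cov = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_q = sum((q - mq) ** 2 for q in qs)
    return cov * cov / (var_x * var_q)


# Variables with more than 10% missing values.
high_missing = variables_where(data, lambda v: missing_rate(v) > 0.10)

# All discrete variables.
discrete = variables_where(data, is_discrete)

# Continuous variables whose Q-Q plot R-squared is below 0.9.
non_normal = variables_where(
    data, lambda v: not is_discrete(v) and qq_r_squared(v) < 0.9
)

# Contingency table of angina vs. gender, keyed by (angina, gender).
table = contingency(data, "angina", "gender")
```

Three of the four questions reduce to the same shape: filter the variable names by a predicate over each variable's values. That suggests a single "select variables where ..." construct could cover most of the list, with the contingency table as a separate two-variable operation.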