TF Probability Layer 1 examples?

flipir commented 6 years ago

I'm excited about using TF Probability for the research work I do in education (formally called "institutional research"), and have been developing a system using a wide and deep TF model.

What I've been hoping for is a way to use TF's statistical underpinnings to show a treatment effect, specifically between the subset of students in a particular program and the general population of students. In the past I've used statistics like Student's t-test for this.

I can't find any fully documented examples of how these Layer 1 probability features are used. It would be ideal if I could configure a studentT to run alongside existing classification estimators (DNNLinearCombinedClassifier) and read/utilize data from the classifier. Can the Layer 1 probability functions read/utilize tensors or the graph? Or is it that, while the statistical functionality is part of TF, it can be used outside of estimators but not alongside or within them?

dustinvtran commented 6 years ago

@derifatives wrote a multivariate normal and student T regression example in a public Colab (see bottom).

http://goo.gl/PHGNkQ

@derifatives: Would you be open to expanding it out as a TF example or tutorial?

derifatives commented 6 years ago

@flipir: Yes, the Layer 1 functions can certainly read/utilize tensors and the TF graph. There's an example of this embedded in the colab @dustinvtran linked to. However, I'm also happy to make a more specific example for you if you can help me better understand what you need. I'm not 100% sure what "configure a studentT to run alongside existing classification estimators" means. If you just want to fit a studentT to some data using TF, look for the example in the colab that contains fit_loc_scale_dist.

Happy to iterate on this.

flipir commented 6 years ago

@dustinvtran - Thanks for this. I appreciate your help and time.

flipir commented 6 years ago

@derifatives - I appreciate this information. To clarify on the question about "studentT to run alongside existing classification"... I mean... If I build a TF graph that runs classification training and testing, can I include a studentT against two of the features so that both the classification training and the studentT are processed in the same session?

derifatives commented 6 years ago

@flipir: I believe what you want should be both possible and easy, but I still don't 100% get it. Are you saying that you have a bunch of features being classified, and for two of the features, you want to also simultaneously fit a student-T distribution via gradient descent / maximum likelihood? Do you want to share your current training graph? (You said a TF graph that runs classification training and testing, but I'd usually expect those to be two separate graphs.)
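For readers unfamiliar with the kind of fit mentioned above, here is a minimal sketch of fitting a Student-t by gradient ascent on the log-likelihood. It is plain Python rather than TF or TFP, and it is not the colab's fit_loc_scale_dist. The function name `fit_student_t`, the fixed degrees of freedom, the learning rate, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def fit_student_t(xs, df=5.0, lr=0.1, steps=500):
    """Fit loc and scale of a Student-t (df held fixed) by maximum likelihood.

    Uses plain gradient ascent on the mean log-likelihood, with the scale
    parameterized as exp(log_scale) so it stays positive.
    """
    n = len(xs)
    # Start from the sample mean and sample standard deviation.
    loc = sum(xs) / n
    log_scale = 0.5 * math.log(sum((x - loc) ** 2 for x in xs) / n)
    for _ in range(steps):
        scale = math.exp(log_scale)
        g_loc = 0.0
        g_log_scale = 0.0
        for x in xs:
            z = (x - loc) / scale
            w = (df + 1.0) / (df + z * z)
            g_loc += w * z / scale          # d log p / d loc
            g_log_scale += w * z * z - 1.0  # d log p / d log_scale
        loc += lr * g_loc / n
        log_scale += lr * g_log_scale / n
    return loc, math.exp(log_scale)


# Synthetic data (illustrative only): 500 draws centered at 2.0.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.5) for _ in range(500)]
loc_hat, scale_hat = fit_student_t(data)
print(loc_hat, scale_hat)
```

In a TF graph the same objective would be expressed as the negative mean log-probability of the data and handed to an optimizer; the sketch above only spells out the gradients by hand.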

flipir commented 6 years ago

@derifatives: I think it's easiest if I step back and give a larger picture of what I'm after. The real-world domain I'm in is education. I'm trying to test whether I can use TF to train a model on a bunch of features against a simple graduated on time (1) / didn't graduate on time (0) label. I would use the trained system to predict a student's outcome based on that student's features at a specific point in time. I've basically got that working with TF, and am using a NN (deep) model.

The questions I'd like to answer are: 1) which feature(s) have the greatest impact on the outcome, given a set of input features for a specific student, and 2) if one feature is prominently affecting the outcome, whether I can get a student-T distribution to show the statistical significance of that feature's effect in this case. I would like to be able to use the student-T result for external reports (and write up the statistical results).

I haven't been asking the "1)" question in this post because as I understand it, one really can't get feature attribution from a NN. There are tools for getting feature attribution from a linear regression model (like SHAP and LIME) but I haven't seen good examples of those tools used with TF and I'm really interested in using the deep learning model.

So, this post really is about the "2)" question, but I didn't provide a decent breakdown of what the two compared samples are. It really wouldn't be two features that represent the two samples; it would be a split of the rows on the feature itself. If the feature is "Attended New Student Proseminar" and is set to 1 when the student attended and 0 when they didn't, I'd use all of the 0 rows as the control group and the 1 rows as the experimental group and run an unpaired-samples t-test. I'd like to be able to do that using TF as opposed to running this in R or some other statistics package.
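The split described above can be sketched as follows. This is plain Python rather than TF, and it uses Welch's variant of the unpaired t-test (no equal-variance assumption), which is an assumption of this sketch and not something specified in this thread. The row layout, the function name `welch_t_test`, and the outcome values are made up for illustration.

```python
import math
from statistics import mean, variance


def welch_t_test(control, treated):
    """Return (t statistic, degrees of freedom) for Welch's unpaired t-test."""
    n0, n1 = len(control), len(treated)
    v0, v1 = variance(control), variance(treated)  # sample variances (n - 1)
    se2 = v0 / n0 + v1 / n1                        # squared standard error
    t = (mean(treated) - mean(control)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((v0 / n0) ** 2 / (n0 - 1) + (v1 / n1) ** 2 / (n1 - 1))
    return t, df


# Hypothetical rows: (attended_proseminar, numeric_outcome).
rows = [(0, 2.0), (0, 3.0), (0, 4.0), (1, 4.0), (1, 5.0), (1, 6.0)]

# Split the rows on the binary feature: 0 rows are control, 1 rows are treated.
control = [y for flag, y in rows if flag == 0]
treated = [y for flag, y in rows if flag == 1]

t_stat, dof = welch_t_test(control, treated)
print(t_stat, dof)
```

The two-sided p-value then comes from the Student-t CDF with `dof` degrees of freedom; this sketch leaves that step to a statistics library.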

When I wrote in earlier posts in this thread that I'd like to have this (training the classifier, preparing a student-T for the several features that I find most likely to be affecting outcomes) in one session, I was wondering whether it could be cleanly done that way. If the answer is that it's better to split up these activities, that's OK. I just didn't know.

brianwa84 commented 5 years ago

It seems like the tfp.glm module could be a good resource for such interpretable linear models. IIRC there is a demo notebook under examples/jupyter_notebooks.

tensorflow / probability

TF Probability Layer 1 examples? #79