pymc-devs / pymc4

(Deprecated) Experimental PyMC interface for TensorFlow Probability. Official work on this project has been discontinued.
Apache License 2.0

Add logistic regression example #154

Closed: kyleabeauchamp closed this 5 years ago

kyleabeauchamp commented 5 years ago

I thought it might be nice to add a logistic regression example at some point. I've tried to do this (see the code below), but I'm having some issues with constant chains and the following warning:
"tensorflow/core/common_runtime/executor.cc:642] Executor failed to create kernel. Internal: No function library [[{{node MatVec/MatMul/pfor/cond}}]]"

I'm running on TFP nightly (Oct 15), TF 2.0, and Python 3.6. Does anyone see anything obviously wrong here? I'm happy to add a notebook or test case for logistic regression once we get things running.

import numpy as np
import pymc4 as pm4
import tensorflow as tf
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = data["data"].astype('float32')
# Standardize to avoid overflow issues
X -= X.mean(0)
X /= X.std(0)
y = data["target"]

n_samples, n_features = X.shape

X = tf.constant(X)

@pm4.model
def logistic_model():
    # Normal prior on the weights (note the fairly tight 0.01 scale)
    w = yield pm4.Normal("w", np.zeros(n_features, 'float32'), 0.01 * np.ones(n_features, 'float32'))

    # Linear predictor and logistic link
    z = tf.linalg.matvec(X, w)
    p = tf.math.sigmoid(z)

    # Bernoulli likelihood over the observed labels
    obs = yield pm4.Bernoulli("obs", p, observed=y)
    return obs

def test_sample():
    # 200 draws from a single chain, no burn-in, without XLA compilation
    tf_trace = pm4.inference.sampling.sample(
        logistic_model(), step_size=0.01, num_chains=1, num_samples=200, burn_in=0, xla=False
    )
    return tf_trace

tr = test_sample()
brianwa84 commented 5 years ago

At a glance, I wonder if autograph is being invoked inside vectorized_map (which pm4's sample uses). I believe vmap applies tf.function to the resulting computation, which is perhaps implicitly using autograph. Still, I'm a bit surprised to see the cond showing up. I'll take a closer look tomorrow.
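
A minimal standalone sketch of the pattern being described (this is not pm4 internals; the function and shapes here are made up purely for illustration):

import tensorflow as tf

@tf.function  # tracing here can implicitly route Python control flow through autograph
def batched_matvec(ws, x):
    # tf.vectorized_map (pfor) rewrites the per-sample computation, which is
    # where nodes like MatVec/MatMul/pfor/cond can show up in the traced graph.
    return tf.vectorized_map(lambda w: tf.linalg.matvec(x, w), ws)

x = tf.random.normal([5, 3])    # stand-in design matrix
ws = tf.random.normal([7, 3])   # a batch of weight vectors
print(batched_matvec(ws, x).shape)  # (7, 5)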

davmre commented 5 years ago

For what it's worth, this example appears to work inside of a colab: (edit: updated link) https://colab.research.google.com/drive/1W4Lh5ow9l8-9hODphDTKAg0gLNvZn7Yd

(Well, some of the results are infs, but at least it executes.)
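
A quick way to check for those infs (illustrative only; it assumes a trace object named tr and the trace layout used later in this thread):

import numpy as np

w_tr = tr[0]["logistic_model/w"].numpy()
print(np.isfinite(w_tr).all())  # False if any draw is inf or nan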

ColCarroll commented 5 years ago

Having trouble accessing @davmre's helpful link, but can confirm that it runs without error -- here's another try at a link: https://colab.research.google.com/drive/1tEyXP-MY-M3eruAQ9dMTWqihQF8jo0Xq

davmre commented 5 years ago

Sorry about the permissions! (it looks like even on the public colab servers you can't publicly share colabs from an @google.com account). I've copied the colab over to my personal account and edited the link above, so hopefully it'll work now.

kyleabeauchamp commented 5 years ago

I can confirm that the colab executes, but it shows the same behavior I saw: a totally constant trace. You can see this as follows:

# Per-feature standard deviation of the sampled weights (chain 0):
# effectively zero, i.e. the chain never moves.
w_tr = tr[0]["logistic_model/w"].numpy()
w_tr[:, 0].std(0)
array([3.5622872e-08, 5.2154064e-08, 1.0058284e-07, 1.8626451e-08,
       2.3655593e-07, 1.3038516e-08, 6.6123903e-08, 8.1956387e-08,
       3.4924597e-08, 6.3329935e-08, 8.2887709e-08, 5.2154064e-08,
       1.1641532e-08, 8.9406967e-08, 6.7055225e-08, 8.1956387e-08,
       4.0978193e-08, 1.2107193e-07, 4.4237822e-08, 6.7986548e-08,
       1.2200319e-07, 2.7008355e-08, 1.8067658e-07, 5.5879354e-09,
       1.1175871e-08, 1.5832484e-08, 7.8231096e-08, 5.3085387e-08,
       9.1269612e-08, 2.3515895e-08], dtype=float32)

kyleabeauchamp commented 5 years ago

OK, I think things are working now with an order of magnitude more samples. NUTS is a bit picky in the early tuning stages, I suppose.
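
For reference, a sketch of the kind of re-run described (the thread only says "an order of magnitude more samples", so the exact num_samples and burn_in values below are assumptions):

tf_trace = pm4.inference.sampling.sample(
    logistic_model(), step_size=0.01, num_chains=1,
    num_samples=2000,  # ~10x the original 200 draws
    burn_in=200,       # assumed: give NUTS some warm-up/tuning steps
    xla=False
)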