py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Matching Different Input Shapes #91 (label: help wanted)

Closed kivo360 closed 5 years ago

kivo360 commented 5 years ago

I'm running the following code to test different input shapes for the DeepIV estimator (DeepIVEstimator). So far I've been able to change the input shapes through the models' input dimensions, but only when my covariates, instruments, and treatments each have a single dimension. I'm certain that's not the only way, and the answer is probably simple, but I'm jumping into the deep end with this library and don't yet understand how to do it.

To clarify, the question is: how do I match different input shapes?

EconML version: 0.4
Operating system: Ubuntu 18.10

Code that works:

from econml.deepiv import DeepIVEstimator
import keras
import numpy as np
import matplotlib.pyplot as plt
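# create_callback comes from the author's own project (not shown here); judging by
# its use below, it supplies the training options for each stage.
# treatment_model and response_model are defined below, so they are not imported.
from count_service.primatives.base_models import create_callback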

treatment_model = keras.Sequential([keras.layers.Dense(128, activation='relu', input_dim=8),
                                    keras.layers.Dropout(0.17),
                                    keras.layers.Dense(64, activation='relu'),
                                    keras.layers.Dropout(0.17),
                                    keras.layers.Dense(32, activation='relu'),
                                    keras.layers.Dropout(0.17)])

response_model = keras.Sequential([keras.layers.Dense(128, activation='relu', input_dim=8),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(64, activation='relu'),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(32, activation='relu'),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(1)])

treatment_model2 = keras.Sequential([keras.layers.Dense(128, activation='relu', input_dim=2),
                                    keras.layers.Dropout(0.17),
                                    keras.layers.Dense(64, activation='relu'),
                                    keras.layers.Dropout(0.17),
                                    keras.layers.Dense(32, activation='relu'),
                                    keras.layers.Dropout(0.17)])

response_model2 = keras.Sequential([keras.layers.Dense(128, activation='relu', input_dim=2),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(64, activation='relu'),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(32, activation='relu'),
                                keras.layers.Dropout(0.17),
                                keras.layers.Dense(1)])

n = 5000

# Initialize exogenous variables; normal errors, uniformly distributed covariates and instruments
e = np.random.normal(size=(n,1))
x = np.random.uniform(low=0.0, high=10.0, size=(n,1))
z = np.random.uniform(low=0.0, high=10.0, size=(n,1))

e_single = np.random.normal(size=(n,))
x_single = np.random.uniform(low=0.0, high=10.0, size=(n,))
z_single = np.random.uniform(low=0.0, high=10.0, size=(n,))

e_large = np.random.normal(size=(n,3))
x_large = np.random.uniform(low=0.0, high=10.0, size=(n,3))
z_large = np.random.uniform(low=0.0, high=10.0, size=(n,3))

t_single = np.sqrt((x_single+2) * z_single) + e_single

y_single = t_single*t_single / 10 - x_single*t_single / 10 + e_single

# Initialize treatment variable
t = np.sqrt((x+2) * z) + e

# Outcome equation 
y = t*t / 10 - x*t / 10 + e

t_large = np.sqrt((x_large+2) * z_large) + e_large
y_large = t_large*t_large / 10 - x_large*t_large / 10 + e_large

deepIvEst = DeepIVEstimator(n_components = 10, # number of gaussians in our mixture density network
                            m = lambda z, x : treatment_model2(keras.layers.concatenate([z,x])), # treatment model
                            h = lambda t, x : response_model2(keras.layers.concatenate([t,x])),  # response model
                            n_samples = 1, # number of samples to use to estimate the response
                            use_upper_bound_loss = False, # whether to use an approximation to the true loss
                            n_gradient_samples = 1, # number of samples to use in second estimate of the response (to make loss estimate unbiased)
                            optimizer='adam', # Keras optimizer to use for training - see https://keras.io/optimizers/ 
                            first_stage_options = create_callback(), # options for training treatment model
                            second_stage_options = create_callback()) # options for training response model

deepIvEst.fit(Y=y_single,T=t_single,X=x_single,Z=z_single)
for i, x in enumerate([2, 5, 8]):
    t = np.linspace((1, 4),(1, 20),num = 1000)
    y_true = t*t / 10 - x*t/10
    fuller = np.full_like(t, x)
    y_pred = deepIvEst.predict(fuller, np.full_like(fuller, x))

Code that doesn't work:

n = 5000

# Initialize exogenous variables; normal errors, uniformly distributed covariates and instruments
e = np.random.normal(size=(n,1))
x = np.random.uniform(low=0.0, high=10.0, size=(n,1))
z = np.random.uniform(low=0.0, high=10.0, size=(n,1))

e_single = np.random.normal(size=(n,))
x_single = np.random.uniform(low=0.0, high=10.0, size=(n,))
z_single = np.random.uniform(low=0.0, high=10.0, size=(n,))

e_large = np.random.normal(size=(n,3))
x_large = np.random.uniform(low=0.0, high=10.0, size=(n,3))
z_large = np.random.uniform(low=0.0, high=10.0, size=(n,3))

t_single = np.sqrt((x_single+2) * z_single) + e_single

y_single = t_single*t_single / 10 - x_single*t_single / 10 + e_single

# Initialize treatment variable
t = np.sqrt((x+2) * z) + e

# Outcome equation 
y = t*t / 10 - x*t / 10 + e

t_large = np.sqrt((x_large+2) * z_large) + e_large
y_large = t_large*t_large / 10 - x_large*t_large / 10 + e_large

deepIvEst = DeepIVEstimator(n_components = 10, # number of gaussians in our mixture density network
                            m = lambda z, x : treatment_model2(keras.layers.concatenate([z,x])), # treatment model
                            h = lambda t, x : response_model2(keras.layers.concatenate([t,x])),  # response model
                            n_samples = 1, # number of samples to use to estimate the response
                            use_upper_bound_loss = False, # whether to use an approximation to the true loss
                            n_gradient_samples = 1, # number of samples to use in second estimate of the response (to make loss estimate unbiased)
                            optimizer='adam', # Keras optimizer to use for training - see https://keras.io/optimizers/ 
                            first_stage_options = create_callback(), # options for training treatment model
                            second_stage_options = create_callback()) # options for training response model

deepIvEst.fit(Y=y_single,T=t_single,X=x_large,Z=z_large)
for i, x in enumerate([2, 5, 8]):
    t = np.linspace((1, 4),(1, 20),num = 1000)
    y_true = t*t / 10 - x*t/10
    fuller = np.full_like(t, x)
    y_pred = deepIvEst.predict(fuller, np.full_like(fuller, x))

I get this error (rightfully so):

Traceback (most recent call last):
  File "count_service/junk/direct_sample.py", line 81, in <module>
    deepIvEst.fit(Y=y_single,T=t_single,X=x_large,Z=z_large)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/econml/deepiv.py", line 323, in fit
    treatment_network = self._m(z_in, x_in)
  File "count_service/junk/direct_sample.py", line 69, in <lambda>
    m = lambda z, x : treatment_model2(keras.layers.concatenate([z,x])), # treatment model
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/keras/engine/base_layer.py", line 451, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/keras/engine/network.py", line 570, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/keras/engine/network.py", line 727, in run_internal_graph
    layer.call(computed_tensor, **kwargs))
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/keras/layers/core.py", line 908, in call
    output = K.dot(inputs, self.kernel)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1133, in dot
    out = tf.matmul(x, y)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 2647, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5925, in mat_mul
    name=name)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2027, in __init__
    control_input_ops)
  File "/home/kevin/.local/share/virtualenvs/RayServices-dMeB-KyP/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1867, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 6 and 2 for 'sequential_5/dense_15/MatMul' (op: 'MatMul') with input shapes: [?,6], [2,128].

The key difference between the two snippets is the data passed to fit: X=x_large and Z=z_large (three columns each) instead of X=x_single and Z=z_single.

I've been reading up on input shapes, and part of what I'm reading suggests that I might be approaching this problem the wrong way. I can get inputs of equal size working, but not inputs of different sizes. How should I approach this?
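
For reference, the shape arithmetic behind the ValueError can be reproduced with NumPy alone. The sketch below is only an illustration: np.concatenate stands in for keras.layers.concatenate, and the zero arrays kernel_input_dim_2 and kernel_input_dim_6 are hypothetical stand-ins for the first Dense layer's weights. It shows that joining a 3-column Z with a 3-column X gives 6 features, which cannot be multiplied by a kernel built for input_dim=2. Sizing the first layer to the combined width is an inference from the error message, not a fix confirmed in this thread.

```python
import numpy as np

n = 5000
rng = np.random.default_rng(0)

# Same shapes as x_large and z_large in the failing snippet.
x_large = rng.uniform(0.0, 10.0, size=(n, 3))
z_large = rng.uniform(0.0, 10.0, size=(n, 3))

# Stand-in for keras.layers.concatenate([z, x]): join along the feature axis.
zx = np.concatenate([z_large, x_large], axis=1)
print(zx.shape)  # (5000, 6)

# Stand-in for the first Dense layer's kernel when input_dim=2 (assumption).
kernel_input_dim_2 = np.zeros((2, 128))
try:
    zx @ kernel_input_dim_2
    mismatch = False
except ValueError:
    # Same kind of failure as "Dimensions must be equal, but are 6 and 2".
    mismatch = True

# A kernel sized to the concatenated width multiplies without error.
kernel_input_dim_6 = np.zeros((zx.shape[1], 128))
hidden = zx @ kernel_input_dim_6
print(mismatch, hidden.shape)  # True (5000, 128)
```

This is consistent with the working snippet, where Z and X each contribute one feature and the models are built with input_dim=2.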

kivo360 commented 5 years ago

I found out through a friend that I need to keep all the vectors the same size and pad each one according to the operation it is used in.

I'd pad with 0 for additive operations and with 1 for multiplicative ones.
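
A minimal sketch of that padding idea, assuming the arrays are right-padded with constant columns. pad_columns is a hypothetical helper written for this illustration; it is not part of EconML or Keras. The sketch only checks that 0 leaves row sums unchanged and 1 leaves row products unchanged. Whether padded columns suit a particular DeepIV model is not settled in this thread.

```python
import numpy as np

def pad_columns(a, width, fill):
    """Right-pad an array with constant columns until it has `width` columns."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a.reshape(-1, 1)  # treat shape (n,) as (n, 1)
    extra = width - a.shape[1]
    if extra < 0:
        raise ValueError("array is already wider than the target width")
    # Pad only the column axis; rows are left untouched.
    return np.pad(a, ((0, 0), (0, extra)), mode="constant", constant_values=fill)

x_narrow = np.array([[2.0], [5.0], [8.0]])  # shape (3, 1)

padded_add = pad_columns(x_narrow, 3, fill=0.0)  # 0 is the identity for addition
padded_mul = pad_columns(x_narrow, 3, fill=1.0)  # 1 is the identity for multiplication

# Padding does not change the row-wise sum or product.
assert np.allclose(padded_add.sum(axis=1), x_narrow.sum(axis=1))
assert np.allclose(padded_mul.prod(axis=1), x_narrow.prod(axis=1))
print(padded_add.shape, padded_mul.shape)  # (3, 3) (3, 3)
```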

kbattocchi commented 5 years ago

@kivo360 Sorry for not responding to this more promptly, but glad you were able to resolve it.