Closed · R2Bb1T closed this 10 months ago
Thank you for using causallib and taking the time to report this problem.
I'll admit models in the contrib module come with a limited warranty, but I'll do my best to assist.
First, could you please provide a minimal code example that reproduces the problem?
Second, I'll ping @chiragnagpal (Hi Chirag! 👋 🙃 ), the paper's first author and the person who implemented the model, to see if he can find the time to take a look and see what he can make of it.
Thanks for your reply and help! Here is the example code:
```python
from causallib.contrib.hemm import HEMM
import causallib.contrib.hemm.gen_synthetic_data as gen_synthetic_data
import causallib.contrib.hemm.hemm_outcome_models as hemm_outcome_models
import numpy as np
import pandas as pd

def generate_traindata():
    d = 100
    X, T, Y, Z, mu1, mu0 = gen_synthetic_data.gen_data(n=50000, d=d)
    HTE = mu1 - mu0
    data = np.column_stack((X, Y, T, HTE, mu1, mu0, Z))
    cols = ['x' + str(i) for i in range(d)]  # one name per feature column
    # mu1/mu0 are the potential outcomes, labeled Y1/Y0 to match the stacking order
    output_train_data = pd.DataFrame(data, columns=cols + ['Y', 'T', 'HTE', 'Y1', 'Y0', 'Z'])
    return output_train_data

train_data = generate_traindata()

# D_in, K, bc, lamb, mu, std, and response are set beforehand
# (K=2, batch size 30, everything else left at its default).
hemm = HEMM(D_in=D_in, K=K, bc=bc, lamb=lamb, mu=mu, std=std, response=response,
            metric='AuROC',
            outcome_model=hemm_outcome_models.genMLPModule(D_in=D_in, H=2, out=2))
losses = hemm.fit(train_data.iloc[:, :100].values,
                  train_data['T'].values,
                  train_data['Y'].values)
```
K=2, batch_size=30, and the other parameters remain at their defaults. I suspect it may be an overfitting problem, but I still don't understand why 'std' kept decreasing and fell below zero.
Hi @ehudkr, thanks for connecting me! It's been good; I'm just trying to finish wrapping up my thesis and move on to newer things :)
@R2Bb1T I looked at the code, and it indeed seems like the model isn't constraining the std variable to be positive. One simple way around this is to add a relu activation on the std parameter.
I can try pushing that fix to the code, but it's probably faster for you to fix it on your end than to wait for it to be reflected in the next release.
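A minimal sketch of that kind of positivity constraint, assuming a PyTorch Gaussian component (this is a hypothetical `GaussianComponent` class with a made-up `raw_std` parameter name, not the actual HEMM internals): keep an unconstrained parameter and map it through softplus, a smooth alternative to the relu clamp suggested above, so the effective std can never go negative during optimization.

```python
import torch
import torch.nn as nn

class GaussianComponent(nn.Module):
    """Toy mixture component with a positivity-constrained std (sketch only)."""
    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(dim))
        # The optimizer updates this unconstrained parameter freely...
        self.raw_std = nn.Parameter(torch.zeros(dim))

    @property
    def std(self):
        # ...but softplus(x) = log(1 + exp(x)) > 0 for all x, so the
        # effective std stays strictly positive; the small epsilon
        # keeps it bounded away from zero for numerical stability.
        return nn.functional.softplus(self.raw_std) + 1e-6

    def log_prob(self, x):
        return torch.distributions.Normal(self.mu, self.std).log_prob(x).sum(-1)

comp = GaussianComponent(2)
print(comp.std)  # positive, no matter what value raw_std drifts to
```

With relu instead of softplus the std can collapse to exactly zero, which makes the Gaussian log-likelihood blow up, so a softplus-style transform is often the safer variant of the same idea.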
@chiragnagpal Thanks for your reply! The solution you mentioned worked, and the ITE results now look right. However, the predicted subgroups do not match the true subgroups very well. I set d=2 to replicate the experiment from the paper, and I found that even with all hyperparameters kept the same, repeatedly generating similar random data gives hugely different results. The visualizations vary a lot: sometimes the subgroup doesn't even form a circle, and when it does, the center and radius seem off. I also calculated the AUC between the predicted subgroup 'z' and the true 'Z', and it varies from 0.5 to 0.7. Did I do something wrong in my procedure, or is this a real problem with the model?
This time I changed d to 2, n to 1000, batch size to 10, lamb to 0.1, and the thresholds of posen and negen to 0.25.
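One thing worth ruling out before blaming the model (a hedged suggestion, with made-up toy data, not results from the actual experiment): mixture components are only identified up to a relabeling, so the "subgroup" the model discovers may be the complement of the true Z. Checking the AUC in both orientations separates genuine instability from simple label switching:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy stand-ins for the true subgroup labels Z and the model's soft
# subgroup assignments z (both hypothetical).
z_true = np.array([0, 0, 1, 1, 1, 0])
z_pred = np.array([0.2, 0.4, 0.9, 0.7, 0.6, 0.1])

# Component labels in a mixture are arbitrary, so score both
# orientations and keep the better one.
auc = max(roc_auc_score(z_true, z_pred),
          roc_auc_score(z_true, 1 - z_pred))
print(auc)  # → 1.0 for this toy data
```

If the orientation-corrected AUC still swings between 0.5 and 0.7 across runs, the variability is more likely coming from initialization or the small n=1000 sample.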
Thanks for your amazing work! I tested some data with the HEMM method, and the subgroup-prediction results are abnormal. On further investigation, I found that the parameter 'std' of the Gaussian distribution keeps decreasing and falls below zero, whereas it is supposed to converge to a positive value. What causes this phenomenon, and how can I fix it? Is it related to parameter initialization?