a note on the link function of z in HDDMregressors

Hi all. I fitted a series of HDDMRegressor models that included regressors on z. Based on my prior experience, I used the inverse logit function as the link function for z, which is also the approach recommended in the HDDM tutorial. However, I found that the inverse logit link led to several problems with our data:

- The z_Intercept values were hard to interpret, because most of them were negative.
- The chains had difficulty converging when estimating z_Intercept.
- The posterior predictive check (PPC) showed a poor fit, which could be related to the convergence problem.
- az.compare could not compute the model comparison indices (WAIC and LOO were both NaN).
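
On the first point, a negative intercept is not necessarily a sign of a broken fit. If the inverse logit link were the only transform involved, the intercept would sit on the logit scale, where a negative value simply means an implied starting point below 0.5. A minimal sketch in plain Python, with made-up intercept values rather than anything from a fitted model:

```python
import math


def inv_logit(x):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical intercepts on the logit scale (illustrative values only)
logit_intercepts = [-1.0, -0.2, 0.0, 0.4]

# Negative intercepts map to z < 0.5, zero maps to exactly 0.5
implied_z = [inv_logit(x) for x in logit_intercepts]

for x, z in zip(logit_intercepts, implied_z):
    print(f"intercept {x:+.1f} -> z = {z:.3f}")
```

This only illustrates the scale of the parameter; it does not explain the convergence or PPC problems listed above.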

I searched the HDDM Google Group (hddm-users) for solutions and found that this issue had been discussed before. Simply put, when including regressors on z, one should use a simple identity link function for z too. This works because, in recent versions of HDDM, the inverse logit is already incorporated in the prior on z, as the replies quoted here explain:

> Regarding what link function to use for z: in the past, it has been suggested to use an inverse logit but that is now already incorporated in the prior. Therefore, one should instead just use the linear link function (lambda x:x - this just means that the sampler will directly estimate z instead of a transform of it). See this thread. For v, the link function usually is identity linear also because it is not constrained.

> So, with recent versions of HDDM, one should not use an inverse logit transform on z in the link function for regression models (unless they change the prior in the guts of their version of HDDM). It is sufficient to use the constrained prior on the intercept with the regular linear link function (any extreme values of z that would go out of the [0,1] bounds on the full regression would get rejected by the sampler anyway). This also makes z regression coefficients more comparable to those for other models parameters (which are all usually linear), and easier to interpret the coefficients.
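
To make the quoted advice concrete, here is a minimal sketch of the two model descriptors, assuming the dictionary format used in the HDDM regression tutorial (`model` and `link_func` keys). The column name `condition`, the variable names, and the numeric values are placeholders of mine, and the call to hddm is left as a comment so the snippet runs without HDDM installed.

```python
import math


def inv_logit(x):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Older suggestion: inverse logit link on z
z_reg_old = {'model': 'z ~ 1 + C(condition)', 'link_func': inv_logit}

# Suggestion from the quoted replies for recent HDDM versions: identity link
z_reg_new = {'model': 'z ~ 1 + C(condition)', 'link_func': lambda x: x}

# With hddm installed, the descriptor would be passed roughly like this
# (data is a placeholder for your own DataFrame):
# m = hddm.HDDMRegressor(data, [z_reg_new], include='z')

# Identity link: a coefficient is a direct shift in z (illustrative numbers)
intercept, beta = 0.45, 0.10
identity = z_reg_new['link_func']
shift_identity = identity(intercept + beta) - identity(intercept)

# Inverse logit link: the same coefficient shifts z by an amount that
# depends on where the intercept sits on the logit scale
shift_logit_a = inv_logit(0.0 + beta) - inv_logit(0.0)
shift_logit_b = inv_logit(2.0 + beta) - inv_logit(2.0)

print(shift_identity, shift_logit_a, shift_logit_b)
```

The last part shows why coefficients are easier to read with the identity link: the shift equals the coefficient itself, whereas under the inverse logit link it changes with the intercept.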

In your dockerHDDM paper, only regression models on v were illustrated. I would suggest adding a reminder about the link function for z somewhere in the dockerHDDM code or tutorial, as it is easy to run into this problem. :)

Best, Xiaoyu

Another note: to map the transformed z_Intercept back to [0, 1], you can simply apply the inverse logit transform. First extract the trace:

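import kabuki
import numpy as np
# Combine the fitted ms3 chains into one model object, then pull one subject's trace
ms3res = kabuki.utils.concat_models(models['ms3'])
z_Intercept_subj_trans_1920 = ms3res.get_traces()['z_Intercept_subj_trans.1920']
np.mean(z_Intercept_subj_trans_1920)  # posterior mean on the transformed scale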

Then apply the inverse logit, for example with pymc:

import pymc as pm
traces_z = pm.invlogit(z_Intercept_subj_trans_1920)  # map each sample back to [0, 1]
np.mean(traces_z)
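
A small aside on the order of operations: the snippet above transforms every posterior sample and then averages. That is not the same as transforming the posterior mean, because the inverse logit is nonlinear. A plain-Python sketch with made-up samples (not real traces):

```python
import math
from statistics import mean


def inv_logit(x):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical samples on the transformed scale (illustrative only)
fake_trace = [-2.0, -1.0, 0.0, 3.0]

# Transform each sample first, then average
mean_of_transformed = mean([inv_logit(x) for x in fake_trace])

# Average first, then transform: generally a different number
transformed_mean = inv_logit(mean(fake_trace))

print(mean_of_transformed, transformed_mean)
```

Transforming each sample first gives the posterior mean of z itself, which is usually the quantity of interest.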

In my experience, applying the inverse logit once to z_trans should recover the raw z, which ranges from 0 to 1.

But I'm still not fully confident about this transformation (see the link below: John applied the inverse logit twice, and I've asked him why he did this to recover z_trans). Please correct me if I'm wrong about this :)
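
One way to look at the once-versus-twice question numerically: the inverse logit always returns a value in (0, 1), so applying it a second time can only produce values between inv_logit(0) = 0.5 and inv_logit(1), which is about 0.731. The sketch below is plain Python with made-up z values. It illustrates that mathematical property only and makes no claim about which transforms HDDM applies internally.

```python
import math


def inv_logit(x):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Logit: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


# Hypothetical raw starting points (illustrative only)
raw_z = [0.1, 0.3, 0.5, 0.8]

# If the stored trace were a single logit transform of z ...
transformed = [logit(z) for z in raw_z]

# ... one inverse logit recovers z
once = [inv_logit(t) for t in transformed]

# A second inverse logit squeezes everything into (0.5, 0.731)
twice = [inv_logit(v) for v in once]

print(once)
print(twice)
```

So if values after two inverse logits are taken as z, they can never fall below 0.5, which may be a useful sanity check against the data.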

reference: https://groups.google.com/g/hddm-users/c/IvRzauvCul0/m/Dy5qjC4vBAAJ
