jamiefogel opened this issue 1 year ago
I noticed that for iota-gamma, `mle_data['sum_logomega_ig']` has one fewer column than `mle_data['sum_count_ig']`:

```
>>> mle_data['sum_logomega_ig'].shape
torch.Size([852, 1154])
>>> mle_data['sum_count_ig'].shape
torch.Size([852, 1155])
```

This implies that the code is correct. So the question is why these objects have the same shape for iota-occ2Xmeso_recode.
I think I know what the problem is. Gamma is coded as 0 for the non-employed, but this isn't the case for other variables potentially used as markets, e.g. `occ2_recode`, `occ4_recode`, and `occ2Xmeso_recode`. Those variables are coded from 0 to N, so 0 just represents an ordinary category. However, iota and gamma are both coded from 1 to N+1, with -1 representing missing and gamma=0 corresponding to non-employment. (There is no iota=0 because non-employment doesn't make sense as a state for classifying a worker's type in our context, only for classifying the job they're currently matched with.)
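To make the two coding conventions concrete, here is a minimal sketch; the shapes are the ones printed above, and the variable names just stand in for the corresponding entries of `mle_data`:

```python
import torch

# iota-gamma case: sum_count_ig carries a leading column for gamma == 0
# (non-employment) that sum_logomega_ig lacks, so slicing that column
# off with [:, 1:] brings the two tensors into the same shape.
sum_logomega_ig = torch.zeros(852, 1154)  # no non-employment column
sum_count_ig = torch.zeros(852, 1155)     # column 0 = non-employment
assert sum_count_ig[:, 1:].shape == sum_logomega_ig.shape

# occ2Xmeso_recode case: 0 is an ordinary market, so there is no extra
# column, both tensors are already the same width, and the same slice
# throws them off by one (the error reproduced further below).
```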
I think the solution, which I'm about to implement, is to have all of the `_recode` variables range from 1 to N+1 rather than 0 to N. I'll have to think about whether this causes any problems elsewhere, but the fact that iota and gamma are already coded this way suggests it should be fine.
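A minimal sketch of that recode, assuming the `_recode` variables live in a pandas DataFrame and that -1 should keep meaning missing (the column name and values are illustrative, not from the real data):

```python
import pandas as pd

# Illustrative frame; in the real data this would be a _recode column
# produced upstream, currently coded 0..N (with -1 = missing).
df = pd.DataFrame({'occ2Xmeso_recode': [0, 3, 1, -1, 2]})

# Shift ordinary categories from 0..N to 1..N+1, leaving -1 untouched
# so it keeps its "missing" meaning, matching the iota/gamma convention.
valid = df['occ2Xmeso_recode'] >= 0
df.loc[valid, 'occ2Xmeso_recode'] += 1
```

The idea, as described above, is that category 0 is then reserved (for gamma, non-employment), so the count tensor gains the extra leading column that `[:,1:]` expects to drop.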
Got the following error in the second-to-last line of `loglike_sums()` within `torch_mle()`:

```
RuntimeError: The size of tensor a (1420) must match the size of tensor b (1421) at non-singleton dimension 1
```
https://github.com/jamiefogel/Networks/blob/be1888b125805185e5032858b8a0e7ed6c963ae9/Code/Modules/torch_mle.py#L74
The root of the issue seems to be that we have `mle_data['sum_logomega_ig']` and `mle_data['sum_count_ig'][:,1:]` when the dimensions of `mle_data['sum_logomega_ig']` and `mle_data['sum_count_ig']` are the same. I don't know what the `[:,1:]` is there for or why it doesn't fail for other wtype-jtype pairs.
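For reference, a minimal sketch of what the failing line seems to reduce to, using the widths from the error message; the actual operation at torch_mle.py#L74 may differ:

```python
import torch

# Stand-ins for the two tensors; for iota-occ2Xmeso_recode they have
# the same width (the row count here is illustrative).
sum_logomega_ig = torch.zeros(852, 1421)
sum_count_ig = torch.zeros(852, 1421)

# Slicing one of them with [:, 1:] leaves a 1420-column tensor against
# a 1421-column one, so any elementwise op between them raises:
try:
    sum_logomega_ig * sum_count_ig[:, 1:]
except RuntimeError as e:
    print(e)  # size of tensor a must match size of tensor b at dim 1
```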