jamiefogel / Networks


Problem with MLE w/ iota and occ2Xmeso_recode #18

Open jamiefogel opened 1 year ago

jamiefogel commented 1 year ago

Got the following error in the second-to-last line of loglike_sums() within torch_mle():

RuntimeError: The size of tensor a (1420) must match the size of tensor b (1421) at non-singleton dimension 1

https://github.com/jamiefogel/Networks/blob/be1888b125805185e5032858b8a0e7ed6c963ae9/Code/Modules/torch_mle.py#L74

The root of the issue seems to be that this line uses mle_data['sum_logomega_ig'] together with mle_data['sum_count_ig'][:,1:], but here mle_data['sum_logomega_ig'] and mle_data['sum_count_ig'] have the same dimensions, so dropping the first column with [:,1:] leaves the two tensors one column apart. I don't know what the [:,1:] is there for or why it doesn't fail for other wtype-jtype pairs.
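A minimal sketch of the mismatch described above, using toy tensors. The sizes are made up, and the elementwise multiplication is only an assumption about what line 74 does; any elementwise operation would fail the same way.

```python
import torch

# Toy stand-ins for the tensors in mle_data; the sizes are made up.
n_wtypes, n_markets = 3, 5

# Case A: the counts carry one extra leading column, so [:, 1:] aligns them.
sum_count_a = torch.ones(n_wtypes, n_markets + 1)
sum_logomega_a = torch.zeros(n_wtypes, n_markets)
ok = sum_logomega_a * sum_count_a[:, 1:]  # shapes (3, 5) and (3, 5)

# Case B: both tensors start with the same number of columns,
# so the slice leaves the counts one column short.
sum_count_b = torch.ones(n_wtypes, n_markets)
sum_logomega_b = torch.zeros(n_wtypes, n_markets)
try:
    sum_logomega_b * sum_count_b[:, 1:]  # shapes (3, 5) and (3, 4)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(ok.shape)
print(error_message)
```

Case B raises the same kind of RuntimeError as the one reported here, with the toy sizes in place of 1420 and 1421.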

jamiefogel commented 1 year ago

I noticed that for iota-gamma, mle_data['sum_logomega_ig'] has one fewer column than mle_data['sum_count_ig'] (shapes below), which implies that the code is correct. So the question is why these objects have the same shape for iota-occ2Xmeso_recode.

>>> mle_data['sum_logomega_ig'].shape
torch.Size([852, 1154])
>>> mle_data['sum_count_ig'].shape
torch.Size([852, 1155])

jamiefogel commented 1 year ago

I think I know what the problem is. Gamma is coded as 0 for the non-employed but this isn't the case for other variables potentially used as markets, e.g. occ2_recode, occ4_recode, and occ2Xmeso_recode. These variables are coded from 0 to N, so 0 just represents an ordinary category. However, iota and gamma are both coded from 1 to N+1 with -1 representing missing and gamma=0 corresponding to non-employment (there is no iota=0 because non-employment doesn't make sense as a state for classifying a worker's type in our context; only for classifying the job they're currently matched with).
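To illustrate why the coding convention matters, here is a small sketch in plain Python. The data and the helper counts_by_code are hypothetical, and it assumes that the [:,1:] slice is meant to drop the column for code 0. Under that assumption, dropping position 0 removes only non-employment for gamma, but removes a real market for a variable coded from 0.

```python
from collections import Counter

def counts_by_code(codes, n_codes):
    """Return a list whose position k holds how often code k appears."""
    tally = Counter(codes)
    return [tally.get(k, 0) for k in range(n_codes)]

# Hypothetical gamma values: 0 = non-employed, 1..3 = real markets.
gamma = [0, 1, 2, 2, 3, 0]
# Hypothetical 0-based recode: 0..2 are all real markets.
occ_recode = [0, 1, 1, 2, 2, 0]

gamma_counts = counts_by_code(gamma, 4)
recode_counts = counts_by_code(occ_recode, 3)

# Dropping position 0 mirrors the [:, 1:] slice.
gamma_markets = gamma_counts[1:]    # all 3 real markets survive
recode_markets = recode_counts[1:]  # market 0 is lost, only 2 remain

print(gamma_markets, recode_markets)
```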

I think the solution, which I'm about to implement, is to have all of the _recode variables range from 1 to N+1 rather than 0 to N. I'll have to think about whether or not this causes any problems elsewhere, but the fact that iota and gamma are already coded this way implies that it should be fine.
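A rough sketch of the proposed recoding, in plain Python. The function name shift_to_one_based is hypothetical, and leaving -1 untouched as a missing marker is an assumption borrowed from how iota and gamma are described; the _recode variables may not use -1 at all.

```python
def shift_to_one_based(codes, missing=-1):
    """Shift codes from 0..N to 1..N+1, leaving the missing marker as-is."""
    return [c if c == missing else c + 1 for c in codes]

# Hypothetical values of a 0-based _recode variable, with one missing entry.
occ2Xmeso_recode = [0, 3, 1, -1, 2]
shifted = shift_to_one_based(occ2Xmeso_recode)

print(shifted)
```

After the shift no observation has code 0, so position 0 is free to play the same role it does for gamma.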