Open Steven-1124 opened 10 months ago
Hello! All implementations in the repo expect the input to be a tuple of (categorical inputs, continuous inputs), i.e. two tensors of shape (batch size, number of categorical/continuous features). The argument `dim_continuous` is actually not used, so I removed it: Keras lazily initializes weights from the first batch of data in the forward pass, so there is no need to pass the input dimension to the constructor explicitly.
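For illustration, here is a minimal sketch of a single batch in that format (the feature counts and vocabulary size are assumptions for the example, not values from the repo):

```python
import numpy as np

batch_size = 32
num_categorical = 4   # assumed number of categorical features
num_continuous = 6    # assumed number of continuous features

# Integer-encoded categorical features and float continuous features
categorical_inputs = np.random.randint(0, 10, size=(batch_size, num_categorical))
continuous_inputs = np.random.randn(batch_size, num_continuous).astype("float32")

# Every model takes this tuple; weights are built lazily on the first
# forward pass, e.g. outputs = model((categorical_inputs, continuous_inputs))
inputs = (categorical_inputs, continuous_inputs)
print(inputs[0].shape, inputs[1].shape)  # (32, 4) (32, 6)
```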
I added tests you can refer to here, and here is an example of how the dataset should look to work with those implementations.
As for the input parameters, I assume you mean the hyper-parameters such as the number of hidden units, etc. I would suggest starting small and increasing the depth and hidden dimensions if the model overfits quickly. It's very data-dependent, so there is no single answer that will work for every dataset. Another approach is to find papers that use data similar to yours and try the hyper-parameters they use.
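As a concrete illustration of "starting small", something like the following could be a first configuration to grow from (all values here are assumptions, not defaults from the repo):

```python
# Hypothetical starting point; increase depth and hidden dimensions from
# here while watching validation metrics, per the advice above.
starting_config = {
    "num_experts": 4,        # number of experts in the mixture
    "num_hidden_layers": 2,  # depth of each expert
    "dim_hidden": 32,        # hidden units per layer
    "dropout": 0.1,
    "learning_rate": 1e-3,
}
```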
Cheers
Thanks for your reply! I used another version from GitHub and solved this problem. But still, thanks a lot for your detailed instructions!
I have another question. I want to complete three tasks with MMoE: two regression tasks (targets: revenue, reputation) and one classification task (target: accept, which can be 0 or 1). In my data, revenue is 0 whenever accept is 0, and revenue can be any value (negative, zero, or positive) when accept is 1.
So my loss is total_loss = BCE loss + MSE loss + MSE loss, and I apply weights to make sure the terms are on a similar scale.
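In code, the combined loss I use looks roughly like this in TensorFlow/Keras (the weights are placeholders that I tune for scale balance):

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
mse = tf.keras.losses.MeanSquaredError()

# Weights chosen so the three terms end up on a similar scale
w_accept, w_revenue, w_reputation = 1.0, 1.0, 1.0

def total_loss(y_accept, p_accept, y_revenue, p_revenue, y_reputation, p_reputation):
    return (w_accept * bce(y_accept, p_accept)
            + w_revenue * mse(y_revenue, p_revenue)
            + w_reputation * mse(y_reputation, p_reputation))
```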
My data is very skewed: for revenue, over 60% of the values are 0; for reputation, over 50%; and for accept, the data points with value 0 are double those with value 1.
The problem I encounter with the loss (on both train and test data) is: (1) without normalizing the data, the loss does not decrease; (2) with normalized data, the total loss decreases, but only because of the decrease in the classification loss.
I put the results with normalized data below:
Could you help me with this? I have been stuck for a while! Thanks so much.
Hi there!
I am trying to replicate the work in Section 3.2 of the original paper, where the synthetic data is x, y1, and y2.
The shared usage code seems to throw endless errors when I feed it the data, and I cannot fix them. I suppose it may be because the input is too simple. Would you be willing to share the code to run on the synthetic data?
An obvious problem exists here:

```python
model = MultiGateMixtureOfExperts(
    dim_input=num_features,
    num_tasks=num_tasks,
    num_emb=num_embeddings,
    dim_emb=32,
)
```
The input `dim_continuous` is missing. Also, how should I decide the values of the input parameters?
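For reference, this is roughly how I am trying to drive it with the synthetic data (the shapes, values, and stand-in targets here are just from my own attempt, not from the repo or the paper):

```python
import numpy as np

num_features = 16    # dimensionality of x in my synthetic setup
num_tasks = 2        # the two synthetic targets y1 and y2
num_embeddings = 10  # placeholder vocabulary size for categorical features

# Simple stand-ins for the synthetic data (x, y1, y2)
x = np.random.randn(1000, num_features).astype("float32")
y1 = np.sin(x.sum(axis=1))
y2 = np.cos(x.sum(axis=1))
```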
Thanks so much!