Hi @hgjlee! It's very interesting that you're working on Cox regression with LSTMs! But I'm not sure I fully understand what your objective here is.
If I understand correctly, you have a regular two-dimensional `x_train` with each row representing an individual and each column representing a covariate/variable/feature. And your LSTM then makes predictions for each individual using a latent state that "encodes" information about the previous individuals in the batch (the LSTM iterates over individuals)? If that is the case, how do you decide the ordering of the individuals (rows of `x_train`)?
Keep in mind that I might just have misunderstood what you're doing.
Thank you for your reply! That's pretty close. I'm trying to build a sequence for each individual and have the LSTM run on each sequence separately, so the hidden states are passed along within an individual's sequence rather than shared between individuals. I hope that's a clearer explanation.
Ah, I understand! In that case I agree with the approach. It makes total sense to let an LSTM iterate over the features for each individual. I'm assuming your features are some sort of time series?
However, I still can't quite wrap my head around how this actually happens (it's been a while since I last worked with RNNs). According to the PyTorch docs, the input to an LSTM should be of shape `(seq_len, batch, input_size)`, but your input is defined as `input = input.view(len(input), 1, self.embedding_dim)`. Doesn't this mean the sequence your LSTM runs on is the rows of `x_train` (which I assume represent each individual)? Or does each column of `x_train` represent a sequence of variables for an individual?

Could you give an example of `x_train` so it would be simpler to understand this?

If `x_train` is two-dimensional and you want the LSTM to run through the features of each individual, doesn't that mean your `embedding_dim` should be 1? And then your input should have the shape `(embedding_dim, len(input), 1)`?
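To make that concern concrete, here is a tiny sketch (the sizes and variable names are made up for illustration, not taken from your code) of what that reshape does to a plain two-dimensional `x_train`:

```python
import torch
import torch.nn as nn

# Illustrative only: a 2-D x_train with 5 individuals and 3 covariates each.
x_train = torch.randn(5, 3)
embedding_dim = 3

# Reshaping as in the snippet above gives (seq_len=5, batch=1, input_size=3),
# so the LSTM would step over the 5 *individuals*, not over the features of one individual.
inp = x_train.view(len(x_train), 1, embedding_dim)

lstm = nn.LSTM(input_size=embedding_dim, hidden_size=4)  # hidden_size is an arbitrary placeholder
out, _ = lstm(inp)
print(out.shape)  # torch.Size([5, 1, 4]) -> one hidden state per individual (row)
```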
Yes, you're right. I'm trying to introduce time with this approach.
And that's exactly what I'm trying to make sure of right now. So let's say that the embedding size is 3 and the sequence length is 2. I'd have a list of lists of tuples like this: `[[(1,1,1), (2,2,2)]]`. Each index would represent an individual.

In the above case, I'm thinking this instead:

`input = input.view(2, 1, 3)`

since an individual has a sequence length of 2, the batch size is 1, and the embedding size is 3.
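A quick sanity check of those shapes (the LSTM's hidden size here is just a placeholder):

```python
import torch
import torch.nn as nn

# One individual, sequence length 2, embedding size 3 (the example above).
seq = [[(1, 1, 1), (2, 2, 2)]]
inp = torch.tensor(seq, dtype=torch.float32)  # shape (1, 2, 3)
inp = inp.view(2, 1, 3)                       # (seq_len=2, batch=1, embedding_dim=3)

lstm = nn.LSTM(input_size=3, hidden_size=4)   # hidden_size=4 is arbitrary
out, (h, c) = lstm(inp)
print(out.shape)  # torch.Size([2, 1, 4]) -> one hidden state per time step of this individual
```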
To make sure I'm not misunderstanding: in `[[(1,1,1), (2,2,2)]]`, do you have 2 individuals or 1? If you have 1 individual, I agree with you.
That would be one individual with two features of embedding size 3. Great, thanks for sharing your thoughts! That was helpful.
Great! Looks like you have everything under control! Hope you'll get the opportunity to share your results with us at some point in the future!
In order to use an LSTM instead of `MLPVanilla` with the `CoxTime` and `CoxPH` models, I have the following model class. It works mechanically, but I want to make sure the implementation is theoretically correct. I'm trying to make each patient's data the input sequence for the LSTM, so that the hidden and cell states are carried across the time steps within that patient's sequence, not across the whole batch of patients treated as one sequence. Would you be able to share some insights?
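For reference, a minimal sketch of what such a net could look like for `CoxPH` (this is not the exact class from the issue; the names, sizes, and the `batch_first=True` layout are assumptions for illustration, and `CoxTime` would additionally need the time as an input). It can be passed to `CoxPH` wherever `MLPVanilla` would normally go:

```python
import torch
import torch.nn as nn

class LSTMCoxNet(nn.Module):
    """Hypothetical LSTM net for CoxPH: each patient is one sequence, and the
    hidden/cell states are only carried across the time steps of that patient,
    since PyTorch's LSTM treats batch elements independently."""

    def __init__(self, embedding_dim, hidden_dim):
        super().__init__()
        # batch_first=True -> input shape (n_patients, seq_len, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)  # one log-hazard value per patient

    def forward(self, x):
        # x: (n_patients, seq_len, embedding_dim)
        _, (h_n, _) = self.lstm(x)       # h_n: (1, n_patients, hidden_dim), final time step
        return self.fc(h_n.squeeze(0))   # (n_patients, 1)

# e.g. model = CoxPH(LSTMCoxNet(embedding_dim=3, hidden_dim=16), tt.optim.Adam)
```

Because PyTorch batches the individuals along the batch dimension, no state is shared between patients; the recurrence only runs along each patient's own sequence, which is the behaviour described above.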