philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.
MIT License

Question on processing time-series text data #29

Closed chengyineng38 closed 2 years ago

chengyineng38 commented 2 years ago

Thanks for the example implementation and making this into a library! Very helpful!

Do you have any suggestions for processing time-series text data? I haven't been able to find much help in this area, since most examples only cover numerical time series. Specifically, do you have any ideas on how to process a sequence of text documents for a patient?

For example, a patient can have multiple doctors' notes attached to their profile, each written at a different timestamp. As a result (and as expected), the timestamps vary widely from patient to patient. The order of the notes matters, and more recent notes are more important than older ones. It's also important to keep track of the patient ID. The goal is to estimate each patient's mortality risk over the next year.

philipperemy commented 2 years ago

@chengyineng38 You need to consider embeddings. That's the first thing.

The second thing is: you can always concatenate all the notes for a given patient together.

Then regress on the concatenated text to estimate the mortality risk. I guess that's the easiest thing you can do, and you don't even need a CondRNN for it: a simple Embedding layer followed by an LSTM should work well.
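To make the concatenation step concrete, here is a minimal sketch in plain Python. It is not part of cond_rnn: the record layout `(patient_id, note_date, note_text)`, the `[SEP]` separator, the whitespace tokenizer, and the helper names `concatenate_notes`, `build_vocab`, and `encode` are all assumptions made for illustration. It groups notes by patient, joins them oldest-first, and turns each patient's text into a fixed-length list of integer ids, which is the kind of input an Embedding layer followed by an LSTM would typically consume.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (patient_id, note_date, note_text).
notes = [
    ("p1", date(2020, 5, 1), "chest pain reported"),
    ("p2", date(2019, 3, 9), "routine check"),
    ("p1", date(2019, 1, 15), "mild cough"),
]

def concatenate_notes(records, sep=" [SEP] "):
    """Group notes by patient ID and join them in chronological order."""
    by_patient = defaultdict(list)
    for patient_id, note_date, text in records:
        by_patient[patient_id].append((note_date, text))
    return {
        pid: sep.join(text for _, text in sorted(items, key=lambda item: item[0]))
        for pid, items in by_patient.items()
    }

def build_vocab(texts):
    """Map each whitespace token to an integer id; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab

def encode(text, vocab, max_len):
    """Convert text to ids, keep the last max_len tokens, and left-pad with 0."""
    ids = [vocab[token] for token in text.split() if token in vocab]
    ids = ids[-max_len:]  # truncating from the left keeps the most recent notes
    return [0] * (max_len - len(ids)) + ids

docs = concatenate_notes(notes)
vocab = build_vocab(docs.values())
encoded = {pid: encode(text, vocab, max_len=8) for pid, text in docs.items()}
```

Truncating from the left is a deliberate choice here: when a patient's history is longer than `max_len`, the oldest tokens are dropped first, which matches the requirement that recent notes matter more. A real pipeline would likely use a proper tokenizer and a much larger `max_len`.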