Open somi74 opened 1 year ago
@philipperemy please answer my question.
Hey @somi74, ConditionalRecurrent takes two matrices as input:
x depends on time: (200 patients, 3000 data points of glucose, 1)
c does not depend on time: (200 patients, 40 columns)
Define your model like that.
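from cond_rnn import ConditionalRecurrent
from tensorflow.keras import Sequential
from tensorflow.keras.layers import GRU

model = Sequential(layers=[
    ConditionalRecurrent(GRU(128)),
    # [...]
])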
Use your data here.
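import numpy as np

x = np.random.uniform(size=(200, 3000, 1))  # time-dependent input (glucose)
c = np.random.uniform(size=(200, 40))       # static input (one row per patient)
# Training target: the next step. There is plenty of material online on how to build it.
y = np.random.uniform(size=(200, 3000, 1))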
Predict with your model
model.predict([x, c])
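The "predict the next step" comment above can be sketched with a small windowing helper. This is only an illustration in plain NumPy: `make_windows`, `window`, and `horizon` are hypothetical names, not part of the library, and the toy series stands in for one patient's glucose readings.

```python
import numpy as np

def make_windows(series, window=6, horizon=1):
    """Cut a 1-D series into inputs of shape (window, 1) and one target per window.

    The target is the value `horizon` steps after the last point of the window.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1  # number of complete (input, target) pairs
    if n <= 0:
        return np.empty((0, window, 1)), np.empty((0,))
    X = np.stack([series[i:i + window] for i in range(n)])[..., None]
    first_target = window + horizon - 1
    y = series[first_target:first_target + n]
    return X, y

# Toy series standing in for one patient's readings (hypothetical values).
glucose = np.arange(10.0)
X, y = make_windows(glucose, window=6, horizon=1)
print(X.shape, y.shape)
```

For a longer horizon, `horizon` would be the number of sampling steps to look ahead, which depends on how often glucose is recorded.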
Thanks @philipperemy, but I have collected all the data according to their PtID in a dictionary. Here's how it is structured:
The first array, X, depends on time (glucose) and uses a sliding window of 6 (because of the 30-minute prediction horizon). The second array, y, is the target. Lastly, there are other parameters that do not depend on time. I extracted each of these to feed them into the model. The shapes of the arrays are as follows:
X_train shape: (436165, 6, 1)
y_train shape: (436165,)
condition shape: (188, 27) (I dropped the columns that don't relate to glucose, which leaves 27 columns, and I also dropped the 12 rows with null values.)
However, when I try to use this library, I get a ValueError:
model = Sequential(layers=[
    ConditionalRecurrent(GRU(128)),
    Dense(units=1, activation='linear')
])
model.compile(optimizer='adam', loss='mae')
history = model.fit(x=[X_train, condition], y=y_train, epochs=10, batch_size=None, shuffle=True, validation_data=(X_val, y_val))
ValueError: Data cardinality is ambiguous:
  x sizes: 436165, 188
  y sizes: 436165
Make sure all arrays contain the same number of samples.
I'm unsure how to fix this error.
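The message comes down to the first dimension of each array: every input and the target are expected to have the same number of samples. A quick check before calling fit, using zero-filled stand-ins with the shapes reported above (`sample_counts` is a hypothetical helper, not a library function):

```python
import numpy as np

def sample_counts(*arrays):
    """Return the size of the first (sample) dimension of each array."""
    return [np.asarray(a).shape[0] for a in arrays]

# Zero-filled stand-ins that only reproduce the reported shapes.
X_train = np.zeros((436165, 6, 1))
condition = np.zeros((188, 27))
y_train = np.zeros((436165,))

counts = sample_counts(X_train, condition, y_train)
consistent = len(set(counts)) == 1  # True only if every array has the same sample count
print(counts, consistent)
```

Here the condition array has one row per patient while X_train and y_train have one row per window, which is the mismatch the error reports.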
size=(200, 3000, 1)
This size=(200, 3000, 1) for y means that the last linear layer has 3000 cells. Does that make the model too large?
(436165, 6,1)
Your condition shape should be (436165, 188, 27).
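Whatever the final layout, the first dimension of the condition has to match the 436165 windows. A minimal sketch, assuming a 2D condition with one static row per window, and assuming the PtID of each window was kept when X_train was built (`align_conditions`, `window_patient_ids`, and the toy IDs and values are all hypothetical):

```python
import numpy as np

def align_conditions(window_patient_ids, condition_by_patient):
    """Return one static-feature row per window, in the same order as the windows."""
    return np.stack([condition_by_patient[pid] for pid in window_patient_ids])

rng = np.random.default_rng(0)

# Toy static features: 3 patients, 27 columns each (hypothetical values).
condition_by_patient = {pid: rng.uniform(size=27) for pid in ["p1", "p2", "p3"]}

# Patient ID of every window, in the same order as the rows of X_train.
window_patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3"]
X_train = rng.uniform(size=(len(window_patient_ids), 6, 1))

condition_train = align_conditions(window_patient_ids, condition_by_patient)
print(X_train.shape, condition_train.shape)
```

With the real data, `condition_train` would then have as many rows as X_train and could be passed as x=[X_train, condition_train]. The validation arrays would presumably need the same treatment.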
I have a glucose dataset for 200 patients, plus some static data that does not depend on time. The static data come from a case form that every patient fills in: about 40 columns and 200 rows, one per patient. I also have roughly 3000 glucose readings for each patient. I want to predict glucose 30 minutes ahead. What should I do? How can I use this library for my work?