AI4HealthUOL / SSSD

Repository for the paper: 'Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models'
MIT License
270 stars, 47 forks

inference with future data? #3

Closed hawkcl closed 1 year ago

hawkcl commented 1 year ago

Dear Juan,

I really like your work. However, I think you are probably using the masked data for both training and inference.

For example, [image] the initial noise already contains the masked data. And over here, [image] in the function SSSDS4Imputer.forward, the conditional is the masked part we are trying to predict.

Please clarify my confusion. Thanks a lot!

-Lei

juanlopezcode commented 1 year ago

Dear Lei,

In the original mask, 0 represents the parts to be imputed, and 1 the parts of the signal that remain as the conditional.

In line 151, on the left-hand side, 1 - mask inverts the mask, so multiplying the noise x by the inverted mask gives a batch with 0's in the conditional part and noise in the part to be imputed.

On the right-hand side, multiplying the conditional by the original mask gives a batch with 0's in the part to be imputed and the conditional values in the conditional part.

Thus, adding the two gives a batch with the conditional information where the mask is 1 and noise in the places to be imputed.
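The mixing step described here can be sketched in a few lines. This is only a minimal illustration, not the repository's code: plain Python lists stand in for the tensors, and the names mix_noise_and_conditional, noise, cond and mask, as well as the toy values, are assumptions made for the example.

```python
def mix_noise_and_conditional(noise, cond, mask):
    """Element-wise noise * (1 - mask) + cond * mask.

    mask == 1 keeps the conditional value; mask == 0 keeps the noise.
    """
    return [n * (1 - m) + c * m for n, c, m in zip(noise, cond, mask)]


# Hypothetical toy series: the last two time steps are to be imputed.
mask = [1, 1, 1, 0, 0]
cond = [0.5, 0.7, 0.9, 1.1, 1.3]      # full signal, including the hidden part
noise = [-2.0, 3.0, -4.0, 5.0, -6.0]  # stand-in for sampled Gaussian noise

mixed = mix_noise_and_conditional(noise, cond, mask)
print(mixed)  # [0.5, 0.7, 0.9, 5.0, -6.0]
```

In this toy case the hidden values 1.1 and 1.3 never reach the result: the positions to be imputed hold only noise.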

Consequently, in line 189, the parts to be imputed are set to 0 and the conditional information stays in the batch unchanged, since it is multiplied by 1.
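A minimal sketch of this masking of the conditional, assuming it amounts to an element-wise product of the conditional with the mask. Plain Python lists stand in for the tensors, and mask_conditional, cond and mask are hypothetical names used only for illustration.

```python
def mask_conditional(cond, mask):
    """Zero the conditional wherever mask == 0; keep it where mask == 1."""
    return [c * m for c, m in zip(cond, mask)]


# Hypothetical toy series: the last two time steps are to be imputed.
mask = [1, 1, 1, 0, 0]
cond = [0.5, 0.7, 0.9, 1.1, 1.3]

masked_cond = mask_conditional(cond, mask)
print(masked_cond)  # [0.5, 0.7, 0.9, 0.0, 0.0]
```

Under this assumption, the values at the positions to be imputed (1.1 and 1.3 here) are replaced by zeros before the conditional is used, so they cannot inform the prediction.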

I hope this clarifies your question. Regards

hawkcl commented 1 year ago

Dear Juan,

Thanks for the reply. It is a very nice and clear explanation. I was confused about the mask; somehow I remembered seeing a mask with 0's followed by 1's for the bm mode. After double-checking, it is all good now :).

One more question about line 151 above. What do you think about replacing cond with a transformed_X like the one used in training?

Have a nice weekend! Lei