Missing data - Githubissues

ucl-pond / pySuStaIn

Subtype and Stage Inference (SuStaIn) algorithm with an example using simulated data.

MIT License


Missing data #1

Closed (noxtoby closed this 4 years ago)

noxtoby commented 5 years ago

I get divide-by-zero errors relating to the model likelihood, which I tracked back to missing data causing problems with max(), min(), etc. I couldn't fix it with numpy.nanmax(), so we probably need to devise a robust method for handling missing data.
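For readers following along, here is a minimal sketch of why a NaN-aware reduction such as numpy.nanmax() is not enough on its own. It is not pySuStaIn code: `gaussian_pdf` and the toy values are assumptions made for illustration. The point is that a NaN in the data also propagates through every elementwise likelihood term, so the combined likelihood becomes NaN unless missing entries are masked explicitly.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    # Hypothetical helper: plain Gaussian density, not taken from pySuStaIn.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# One subject, three biomarkers, the second one missing.
z = np.array([0.3, np.nan, 1.8])

# Reductions: np.max propagates the NaN, np.nanmax skips it.
plain_max = np.max(z)     # nan
safe_max = np.nanmax(z)   # 1.8

# Elementwise likelihoods still contain a NaN, so the product is NaN too.
per_marker = gaussian_pdf(z)
total_naive = np.prod(per_marker)

# Masking the missing entry keeps the product finite.
observed = ~np.isnan(z)
total_observed = np.prod(per_marker[observed])
```

Simply dropping the missing term is only one option, and whether it biases the fitted model is exactly the open question in this thread.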

ayoung11 commented 5 years ago

SuStaIn doesn't handle missing data at the moment. I have an idea about one way to go about it, but it needs testing first.

noxtoby commented 5 years ago

Yeah, I had a think about how to do it without biasing the model, but it's not as straightforward as it is for the EBM.

armaneshaghi commented 4 years ago

I'm closing this, as we have ongoing work on missing-variable analysis in SuStaIn at the POND group, to be introduced later.

d-morrison commented 1 year ago

Any updates on handling missing data?

noxtoby commented 1 year ago

@d-morrison, see ZScoreSustainMissingData if you are using the z-score model. If you are using mixture SuStaIn, you have to handle missing data yourself when calculating the event likelihoods that go into pySuStaIn.
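As a rough sketch of what "handling missing data yourself" could look like for the mixture model: one common convention, assumed here rather than confirmed anywhere in this thread, is to give a missing measurement the same likelihood under both the "event has occurred" and "event has not occurred" distributions, so that it does not favour either. The function names, the Gaussian component parameters, and the fill value below are all hypothetical; only NumPy calls are used.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Hypothetical helper: Gaussian density evaluated elementwise.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def event_likelihoods(data, mu_normal, sd_normal, mu_abnormal, sd_abnormal, fill=1.0):
    """Return (L_yes, L_no), each of shape (n_subjects, n_biomarkers).

    Assumption: missing entries (NaN) get the same constant `fill` in both
    arrays, so they contribute an identical factor whether or not the event
    has occurred.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    L_no = gaussian_pdf(data, mu_normal, sd_normal)       # event not yet occurred
    L_yes = gaussian_pdf(data, mu_abnormal, sd_abnormal)  # event has occurred
    L_no[missing] = fill
    L_yes[missing] = fill
    return L_yes, L_no

# Toy data: 3 subjects x 2 biomarkers, with two missing values.
data = np.array([[0.1, np.nan],
                 [2.9, 3.2],
                 [np.nan, 0.2]])

L_yes, L_no = event_likelihoods(
    data,
    mu_normal=np.array([0.0, 0.0]), sd_normal=np.array([1.0, 1.0]),
    mu_abnormal=np.array([3.0, 3.0]), sd_abnormal=np.array([1.0, 1.0]),
)
```

The resulting arrays contain no NaNs and could then be passed on as the event likelihoods. Whether a constant fill is appropriate for a given dataset is a modelling decision that should be checked, for example by comparing results against a complete-case analysis.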