PennLINC / ModelArray

ModelArray: an R package for statistical analysis of fixel-wise data and beyond
https://pennlinc.github.io/ModelArray
BSD 3-Clause "New" or "Revised" License

how to deal with missing data #74

Closed pinghongyeh closed 1 year ago

pinghongyeh commented 1 year ago

Hi,

My data has missing values in covariates. Can ModelArray.gam deal with missing data?

Thanks. Ping

zhao-cy commented 1 year ago

Hi Ping,

Thank you for your interest in ModelArray! Yes, ModelArray.gam() can handle missing values (NA) in covariates. By default, ModelArray.gam() handles NA the same way mgcv::gam() does, i.e., according to the default of its na.action option; please refer to the documentation of mgcv::gam() for details. Feel free to give it a try, and please let me know if you have further issues or questions.

Thanks, Chenying
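
For readers following along: the mgcv::gam() documentation states that its na.action default comes from R's global `na.action` option, which is `na.omit` in a factory-fresh session. The base-R sketch below shows one way to check that option and what `na.omit` does to a data frame. The covariate table and its column names are made up for illustration.

```r
# Which NA handler is active in this session? Factory-fresh R reports "na.omit".
print(getOption("na.action"))

# Hypothetical covariate table with two incomplete rows (illustration only).
covars <- data.frame(
  age = c(10.2, NA, 14.8, 17.1),
  sex = factor(c("F", "M", NA, "M"))
)

# na.omit() drops every row that has an NA in any column.
kept <- na.omit(covars)
print(nrow(kept))

# Indices of the rows that were dropped are recorded as an attribute.
dropped_rows <- as.integer(attr(kept, "na.action"))
print(dropped_rows)
```

If the option has been changed in your session (for example to `na.fail`), model fitting functions that follow it may behave differently, so it is worth checking before a long run.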

zhao-cy commented 1 year ago

Apologies, I forgot to mention that, per issue #42, please avoid requesting changed.rsq (i.e., via changed.rsq.term.index) for terms that have missing values. If you do need it, consider excluding the participants with NA before calling ModelArray.gam(). Let me know if you have questions.

Thanks, Chenying
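
One possible way to exclude participants with NA before fitting is to filter the phenotypes data frame on the covariates used in the model. This is a minimal sketch rather than the maintainers' recommended procedure: the table, the column names (`subject_id`, `age`, `motion`), and the `model_terms` vector are all hypothetical.

```r
# Hypothetical phenotypes table; all names and values are illustrative only.
phenotypes <- data.frame(
  subject_id = c("sub-01", "sub-02", "sub-03", "sub-04", "sub-05"),
  age        = c(9.5, 12.1, NA, 15.3, 18.0),
  motion     = c(0.12, NA, 0.08, 0.10, 0.15),
  stringsAsFactors = FALSE
)

# Covariates that appear in the model formula (assumed for this example).
model_terms <- c("age", "motion")

# TRUE for participants with no NA in any of the model covariates.
keep <- complete.cases(phenotypes[, model_terms, drop = FALSE])

# Keep only the complete participants; all columns are preserved.
phenotypes_complete <- phenotypes[keep, , drop = FALSE]

message(sum(!keep), " participant(s) dropped because of missing covariates")
```

The filtered data frame (`phenotypes_complete` here) would then be the one supplied to ModelArray.gam() when changed.rsq.term.index is requested.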

pinghongyeh commented 1 year ago

Hi Chenying,

Thanks for the reply. I have set up ModelArray.gam without changed.rsq.term.index, and it has been running since yesterday. I assume ModelArray.gam skips participants whose data have missing values, correct? Best, Ping


zhao-cy commented 1 year ago

Hi Ping,

Since ModelArray.gam() relies on the default behavior of mgcv::gam() to handle missing values, I referred to the latter's documentation. I think what you said is correct: it will "use only the ‘complete cases’" if there are missing values in the covariates of the GAM. You could also try mgcv::gam() on some toy data with missing values if you would like to confirm.

Below are the links from the latest version of mgcv (1.8-41) that I referred to. Please check which version of mgcv you are using and refer to the documentation for that version.

Best, Chenying
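
The toy-data check suggested above could look roughly like the sketch below. It calls mgcv::gam() directly, not ModelArray.gam(), and assumes the session's `na.action` option is still the factory default `na.omit`. The simulated variables (`age`, `sex`, `y`) are invented for the example.

```r
library(mgcv)

set.seed(1)
n <- 200

# Simulated toy data (illustration only): a smooth age effect plus noise.
toy <- data.frame(
  age = runif(n, 8, 22),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
toy$y <- sin(toy$age / 3) + rnorm(n, sd = 0.3)

# Introduce missing values in one covariate for the first 10 participants.
toy$age[1:10] <- NA

# Fit with the default NA handling (no na.action argument given).
fit <- gam(y ~ s(age) + sex, data = toy, method = "REML")

# Compare the rows gam() actually used against the complete cases.
n_complete <- sum(complete.cases(toy[, c("y", "age", "sex")]))
n_used     <- nrow(fit$model)  # model frame after NA handling

cat("complete cases:", n_complete, "\n")
cat("rows used by gam():", n_used, "\n")
```

If the two counts agree, the fit used only the participants with no missing values in the model variables, which is the behavior described in the mgcv documentation.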

zhao-cy commented 1 year ago

Hi Ping,

I'm closing this issue here. If you have additional questions regarding the same topic, please reopen it and ask here; if there are new questions regarding other stuff, please create a new issue.

Thanks, Chenying