mixOmicsTeam / mixOmics

Development repository for the Bioconductor package 'mixOmics'
http://mixomics.org/

Fixed ambiguous error in tune.spca() #166

Closed Max-Bladen closed 2 years ago

Max-Bladen commented 2 years ago

SOURCE: https://github.com/mixOmicsTeam/mixOmics/issues/161

PROBLEM TO BE RESOLVED: The user has a data frame (72 x 48) with some NAs. Calling tune.pca() on it works fine, but the call below fails:

grid.keepX <- c(seq(5, 30, 5))
tune.spca.result <- tune.spca(X, ncomp = 3, folds = 4, test.keepX = grid.keepX, nrepeat = 10)

The following error is raised: Error: Unexpected error while trying to choose the optimum number of components. Please check the inputs and if problem persists submit an issue ...

It worked when nrepeat was 2 or less, but broke for any larger value.

SOLUTION: Adjusted lines 78-93 to remove any rows of X.train, X.test and t.comp.pred that contain an NA. This is done within the repeat_cv_j function so that no data is lost at the global level. It allows the cross product to be calculated without inducing any NAs.
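To illustrate the idea, here is a minimal base R sketch of dropping NA-containing rows before taking a cross product. It is not the code from this PR: drop_na_rows is a hypothetical helper, the matrices are made-up toy data standing in for one CV fold, and keeping the matrices row-aligned by dropping the same rows from each is an assumption about how the fix behaves.

```r
# Hypothetical helper (not mixOmics code): drop every row that has an NA in
# ANY of the supplied matrices, so the matrices stay row-aligned.
drop_na_rows <- function(...) {
  mats <- list(...)
  # TRUE only for rows that are complete in every matrix
  keep <- Reduce(`&`, lapply(mats, function(m) stats::complete.cases(m)))
  lapply(mats, function(m) m[keep, , drop = FALSE])
}

# Toy stand-ins for one cross-validation fold (assumed data, 4 rows each)
X.test <- matrix(c(1, 2, NA, 4,
                   5, 6, 7, 8), ncol = 2)
t.comp.pred <- matrix(c(0.5, 1.0, 1.5, NA), ncol = 1)

# A single NA in either input propagates into the cross product
anyNA(crossprod(X.test, t.comp.pred))  # TRUE

# Removing the affected rows locally leaves the original objects untouched
cleaned <- drop_na_rows(X.test = X.test, t.comp.pred = t.comp.pred)
cp <- crossprod(cleaned$X.test, cleaned$t.comp.pred)
anyNA(cp)  # FALSE
```

Because the filtering happens on local copies, X.test and t.comp.pred outside the helper keep all their rows, which mirrors the point above about not losing data at the global level.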

@aljabadi

aljabadi commented 2 years ago

Thanks @Max-Bladen, do you have any data that reproduces this behaviour, either from the user or any toy data? We will need to add a unit test that breaks before this fix but works fine after it. It would be great if you could add such a unit test so I can observe the behaviour in it.

Max-Bladen commented 2 years ago

Seeing as this PR was made on my fork before I had write access to the upstream repo, I will close this and reopen one via the official repo (so that checks can be run on it). Within the new PR, I'll include reprex results depicting the bug. Apologies for the inconsistency, I'm still getting used to things.