mixOmicsTeam / mixOmics

Development repository for the Bioconductor package 'mixOmics'
http://mixomics.org/

Fix for Issue #268 #269

Closed Max-Bladen closed 1 year ago

Max-Bladen commented 1 year ago

Adjusted the lines relating to nzr in the Check.entry.wrapper.mint.block() function. Previously, if the nzr$Position object had non-zero length, features were removed only if it wasn't a DA framework AND the block being processed was the Y dataframe, i.e. only for the Y dataframe of block.(s)pls. This meant the filtering wasn't applied to X blocks in block.(s)plsda contexts. The only case where nzv (near-zero variance) filtering should NOT be applied is the Y dataframe in DA frameworks.

This was changed to first check whether the block has any nzr features; if not, the block is skipped. If it's a DA framework AND the block is the Y dataframe, the block is also skipped. Otherwise, the filtering is applied.
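The decision order described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the helper name filter_nzv_blocks and its arguments (blocks, nzv, DA, indY) are hypothetical, and the nzv objects are built by hand rather than computed.

```r
# Hypothetical helper, for illustration only (not the mixOmics source).
# blocks: named list of matrices; nzv: list of lists with a $Position vector;
# DA: TRUE for a discriminant-analysis framework; indY: index of the Y block.
filter_nzv_blocks <- function(blocks, nzv, DA, indY) {
  for (q in seq_along(blocks)) {
    pos <- nzv[[q]]$Position
    # 1. No flagged features: skip this block.
    if (length(pos) == 0) next
    # 2. DA framework AND this is the Y block: skip this block.
    if (DA && q == indY) next
    # 3. Otherwise drop the flagged columns.
    blocks[[q]] <- blocks[[q]][, -pos, drop = FALSE]
  }
  blocks
}

# Toy inputs with hand-made flags (assumed values, not real data).
X1 <- matrix(1:12, nrow = 3, dimnames = list(NULL, paste0("g", 1:4)))
Y  <- matrix(1:6,  nrow = 3, dimnames = list(NULL, c("a", "b")))
blocks <- list(X1 = X1, Y = Y)
nzv <- list(X1 = list(Position = 2L), Y = list(Position = 1L))

out_da  <- filter_nzv_blocks(blocks, nzv, DA = TRUE,  indY = 2)  # Y untouched
out_reg <- filter_nzv_blocks(blocks, nzv, DA = FALSE, indY = 2)  # Y filtered too
```

In both calls the X block loses its flagged column; only the DA call leaves Y as it was.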

This introduced a downstream issue in predict() when called via auroc(). nzr filtering is applied to newdata, which by default is equal to object$X. If nzr is non-null for a block, the filtering is applied to that block of newdata. This could result in the filtering being applied twice with unadjusted indices, meaning high-variance features may be removed accidentally.
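A toy example of why applying index-based filtering twice is harmful. The matrix and the flagged positions are made up for illustration; the point is only that positions computed on the original columns no longer refer to the same features once the block has been filtered.

```r
# Five features; suppose positions 1 and 2 were flagged on the ORIGINAL block
# (hypothetical values).
X <- matrix(0, nrow = 2, ncol = 5,
            dimnames = list(NULL, c("f1", "f2", "f3", "f4", "f5")))
pos <- c(1L, 2L)

# First pass removes the intended features f1 and f2.
once <- X[, -pos, drop = FALSE]

# Second pass reuses the same, unadjusted indices on the already filtered
# block, so positions 1 and 2 now point at f3 and f4.
twice <- once[, -pos, drop = FALSE]

colnames(once)   # "f3" "f4" "f5"
colnames(twice)  # "f5"
```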

Hence, a check was implemented in predict(). The feature names of a block are checked against the feature names in the nzr object. If the nzr features are not found in the block, the filtering is NOT applied.
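One way such a name-based guard could look is sketched below. The function apply_nzv_once is hypothetical, and the sketch assumes the flagged feature names are available as names(nzv$Position); the actual check in predict() may be written differently.

```r
# Hypothetical guard, for illustration only.
# Assumes the flagged feature names are stored as names(nzv$Position).
apply_nzv_once <- function(block, nzv) {
  flagged <- names(nzv$Position)
  # If none of the flagged features are present, the block has already been
  # filtered (or was never affected), so it is returned unchanged.
  if (length(flagged) == 0 || !any(flagged %in% colnames(block))) {
    return(block)
  }
  # Remove by name rather than by position, so stale indices cannot hit
  # the wrong columns.
  block[, !colnames(block) %in% flagged, drop = FALSE]
}

X <- matrix(0, nrow = 2, ncol = 5,
            dimnames = list(NULL, c("f1", "f2", "f3", "f4", "f5")))
nzv <- list(Position = c(f1 = 1L, f2 = 2L))  # made-up flags

once  <- apply_nzv_once(X, nzv)     # f1 and f2 removed
twice <- apply_nzv_once(once, nzv)  # no further change
```

With the guard in place, a second call is a no-op instead of removing f3 and f4.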

An additional check was added at the end of Check.entry.wrapper.mint.block() for safety. It ensures there are no zero-variance features remaining; if any are found, the function is stopped.
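A final safety check of this kind might be sketched as below. The helper stop_if_zero_variance and its error message are hypothetical, and the sketch checks every block it is given; which blocks the real function inspects is not stated here.

```r
# Hypothetical final check, for illustration only.
stop_if_zero_variance <- function(blocks) {
  for (q in seq_along(blocks)) {
    # Per-feature variance, ignoring missing values.
    v <- apply(blocks[[q]], 2, var, na.rm = TRUE)
    bad <- which(!is.na(v) & v == 0)
    if (length(bad) > 0) {
      label <- if (is.null(names(blocks))) q else names(blocks)[q]
      stop(sprintf("Zero-variance features remain in block '%s': %s",
                   label, paste(names(bad), collapse = ", ")))
    }
  }
  invisible(TRUE)
}

ok  <- matrix(c(1, 2, 3, 4, 6, 5), nrow = 3,
              dimnames = list(NULL, c("a", "b")))
bad <- cbind(ok, c = c(7, 7, 7))  # column "c" is constant

res_ok  <- stop_if_zero_variance(list(X = ok))
res_bad <- tryCatch(stop_if_zero_variance(list(X = bad)),
                    error = function(e) conditionMessage(e))
```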