UCLouvain-CBIO / scp

Single cell proteomics data processing
https://uclouvain-cbio.github.io/scp/index.html

Improve `computeMedianCV` #7

Closed cvanderaa closed 3 years ago

cvanderaa commented 4 years ago

The implementation of computeMedianCV is messy:

cvanderaa commented 4 years ago

FYI: commit 713d93a9c662af89ee0c16efcb9f0caa077f5221 removes the SCoPE2 specific steps. This merely affects the replication results.

lgatto commented 4 years ago

I'm wondering whether it would be worth considering a backward-compatible mode if the difference is significant.

cvanderaa commented 4 years ago

Sorry, I realized I wrongly used the word merely. I meant that there is almost no change, so in my opinion the current implementation is generic and reproduces the SCoPE2 results accurately enough. However, the implementation still looks convoluted to me. Simplifying the function might lead to significant changes that would indeed require backward compatibility, although if the only purpose is to reproduce the SCoPE2 results, I think it might not be worth the effort. We could include the current implementation in the vignette as a standalone function.

cvanderaa commented 4 years ago

I am thinking of a simpler implementation, like the one in `MSnbase:::featureCV`, but we will have to figure out whether it is suitable for single-cell data. Do you have an opinion about it?

Here are the current implementations of:

cvanderaa commented 3 years ago

Note to self:

cvanderaa commented 3 years ago

Commit https://github.com/UCLouvain-CBIO/scp/commit/bc3b1071c95996da512329d1feb4d2aaaa482a88 refactors the function `computeMedianCV`. I renamed it to `computeMedianCV_SCoPE2`. I consider this function deprecated, but I am keeping it in the package for the moment for backward compatibility with the SCoPE2 analysis replication.

I created a new function, `medianCVperCell`, that should do more or less the same as `computeMedianCV_SCoPE2` but in a more standardized way, using matrix operations instead of tidyverse functions. Some additional changes to the function:

This new function is based on an internal function, `featureCV`, that is inspired by the MSnbase implementation. The function takes a SingleCellExperiment object and a grouping variable, optionally performs a normalization procedure, computes the CVs and returns them as a matrix. The function is not exported, but exporting it could be useful, for example to also filter features based on their CV.
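
To make the idea concrete, here is a rough, language-agnostic sketch of the computation in plain Python. It is not the package code. The names `feature_cv`, `median_cv_per_cell` and `min_obs` are hypothetical, the optional normalization step is left out, and the choice to return no CV when fewer than `min_obs` values are observed is an assumption made for illustration.

```python
from statistics import mean, median, stdev


def feature_cv(x, groups, min_obs=2):
    """Return {group: [CV per cell]} for a feature-by-cell matrix.

    x       -- list of rows (features); each row holds one value per cell,
               with None marking a missing value
    groups  -- one group label per row (e.g. the protein a peptide maps to)
    min_obs -- hypothetical threshold: minimum number of observed values
               needed to report a CV (at least 2 are always required)
    """
    n_cells = len(x[0])
    cvs = {}
    for group in dict.fromkeys(groups):  # unique labels, first-seen order
        rows = [row for row, g in zip(x, groups) if g == group]
        out = []
        for j in range(n_cells):
            vals = [row[j] for row in rows if row[j] is not None]
            # Too few observations or a zero mean: no CV can be reported.
            if len(vals) < max(min_obs, 2) or mean(vals) == 0:
                out.append(None)
            else:
                out.append(stdev(vals) / mean(vals))
        cvs[group] = out
    return cvs


def median_cv_per_cell(x, groups, min_obs=2):
    """Median of the per-group CVs in each cell (None if no CV is available)."""
    cvs = feature_cv(x, groups, min_obs)
    result = []
    for j in range(len(x[0])):
        col = [cv[j] for cv in cvs.values() if cv[j] is not None]
        result.append(median(col) if col else None)
    return result


# Toy data: five peptides (rows) measured in two cells (columns).
peptides = [
    [10.0, 20.0],  # protein P1
    [12.0, None],  # protein P1, missing in the second cell
    [14.0, 20.0],  # protein P1
    [5.0, 8.0],    # protein P2
    [5.0, 12.0],   # protein P2
]
proteins = ["P1", "P1", "P1", "P2", "P2"]
print(median_cv_per_cell(peptides, proteins))
```

Keeping the per-group CV matrix separate from the per-cell median mirrors the split described above, and it is what would make the CVs reusable for feature filtering.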