alanocallaghan / scater

Clone of the Bioconductor repository for the scater package.
https://bioconductor.org/packages/devel/bioc/html/scater.html
94 stars 38 forks source link

MAGIC adaptive kernel #21

Closed davismcc closed 6 years ago

davismcc commented 6 years ago

From @dvdijk on April 18, 2017 23:52

I saw that there is a MAGIC implementation in Scater, however it doesn't use the adaptive kernel. The adaptive kernel is important for MAGIC to work well. I haven't coded in R for a long time but I can advise on how to correctly implement MAGIC.

Copied from original issue: davismcc/archive-scater#106

davismcc commented 6 years ago

Hi David

thanks for getting in touch! After reading the paper I was keen to try it out, so coded up that simple implementation. As you rightly point out, it's missing the adaptive kernel.

This R implementation is not "release ready", so I have not exported the MAGIC functions in the latest devel versions of the pkg (i.e. not visible to the user), which will be released in the next Bioconductor release. (I'd also been meaning to get in touch with you, but things got away from me.)

It would be great to have your advice on correctly implementing MAGIC. I think it would be excellent to have an R version available to people, and should enable more people to use your method.

Best Davis

davismcc commented 6 years ago

From @dvdijk on April 19, 2017 18:2

Hi Davis,

Here's a Matlab implementation of MAGIC that doesn't require an external diffusion map package:

%% params k = 30; ka = 10; npca = 20; t = 6;

%% MAGIC [U,~,~] = randPCA(M', npca); % fast random svd (http://rbigdata.github.io/documentation/pbdML/rpca.html) %[U,~,~] = svds(M', npca); M_pc = M U; % PCA project D = squareform(pdist(M_pc)); % compute distances knnD = sort(D); % get kNN th = knnD(k+1,:); % find knn cutoff ind_zero = D > repmat(th',1,size(D,2)); % kNN cutoff sigma = knnD(ka+1,:); % locally adaptive sigma D = bsxfun(@rdivide,D,sigma); % adapt distances W = exp(-D.^2); % gaussian kernel W = W + W'; % symmetrize W(ind_zero) = 0; % kNN cutoff L = bsxfun(@rdivide, W, sum(W,2)); % Markov normalization L_t = L^t; % diffuse M_new = L_t M; % impute

%% rescale stuff here (you already have that)

Above code should be fairly easily ported to R. As you can see I'm using a fast random PCA implementation (really a lot faster and very accurate). I found an R implementation, see link in comments. The input data M has cells on the rows and genes on the columns. Let me know if I can further help.

David

davismcc commented 6 years ago

From @rspreafico on May 4, 2017 23:52

Pretty cool, it'd be great to have the R equivalent of this implementation available in scater.

davismcc commented 6 years ago

From @dvdijk on May 5, 2017 4:8

When I have time I'll try to implement this in R. I'll post it here when I do.

dpcook commented 6 years ago

Hey Davis--the MAGIC crew just finished up an R implementation over on their github page. May be easier to throw it in scater now if there's still interest.