rpolicastro closed this issue 6 months ago
Dear Bob (@rpolicastro),
Thank you very much for your suggestion, we'll certainly look carefully at it and will try to incorporate it into the code base. I'll keep you posted here.
Dear Bob (@rpolicastro),
As the previous automated messages suggest, your suggestion to improve the performance of the function `.fastRndWalk()` has been implemented in the latest release of GSVA (1.52.x), which came out on May 1st, 2024. The implementation uses R code only, which in our benchmarks brought an improvement in performance comparable to that of an Rcpp or C counterpart, i.e., running one order of magnitude faster and consuming one order of magnitude less memory. Thanks again for bringing up this performance bottleneck, which has now been greatly reduced.
Fantastic, I'm glad it worked out! Taking a peek at the code changes, the R fix was rather simple and elegant. I'm impressed that's all it ended up taking.
Cheers, Bob
Hello,
For ssGSEA on scRNA-seq data it appears the code is running the `.fastRndWalk` function `n_cells * gene_sets` number of times. I was curious whether moving this function to C++ could speed up this operation (and potentially make it more memory efficient), so I roughly reimplemented the function using Rcpp with a few minor changes.
Hijacking your vignette code to make some example data.
Preparing the data to run the old and new functions.
The R implementation of `.fastRndWalk`.
Here's the Rcpp implementation of `fasterRndWalk`.
Benchmarking the two implementations.
The C++ implementation is almost 4 times faster and uses about 100 times less memory.
The results are slightly different.
My C++ is rusty (because of Rust) and I know very little C, so I imagine someone else could improve this further or reimplement it in C and avoid any more dependencies. I'm not too proud to admit that I needed ChatGPT to debug a line of code for me here.
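For anyone picking this up, here is a minimal dependency-free sketch in plain C++ (standard library only, no Rcpp) of the kind of kernel this could become. It is not GSVA's `.fastRndWalk` and not the Rcpp `fasterRndWalk`. It assumes the walk statistic is the sum over all ranks of the difference between a weighted in-set step CDF and an unweighted out-of-set step CDF. The names `walk_stat_naive` and `walk_stat_closed` are hypothetical. The second function relies on the identity that the sum of a cumulative sum equals each term times the number of positions at or after it, so it allocates no length-n vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helpers for illustration; not part of GSVA or Rcpp.
// positions: distinct 1-based ranks of the gene-set members (any order)
// weights:   positive weight of each member, parallel to positions
// n:         total number of ranked genes (must exceed positions.size())

// Naive version: builds two length-n step functions and sums their difference.
double walk_stat_naive(const std::vector<std::size_t>& positions,
                       const std::vector<double>& weights, std::size_t n) {
    assert(positions.size() == weights.size() && positions.size() < n);
    std::vector<double> in(n, 0.0), out(n, 1.0);
    for (std::size_t m = 0; m < positions.size(); ++m) {
        assert(positions[m] >= 1 && positions[m] <= n);
        in[positions[m] - 1] = weights[m];
        out[positions[m] - 1] = 0.0;
    }
    std::partial_sum(in.begin(), in.end(), in.begin());    // running weight total
    std::partial_sum(out.begin(), out.end(), out.begin()); // running non-member count
    assert(in[n - 1] > 0.0);
    double stat = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        stat += in[i] / in[n - 1] - out[i] / out[n - 1];
    return stat;
}

// Closed form: same statistic with O(k) work and no length-n allocations.
double walk_stat_closed(const std::vector<std::size_t>& positions,
                        const std::vector<double>& weights, std::size_t n) {
    const std::size_t k = positions.size();
    assert(k == weights.size() && k < n);
    double weighted = 0.0, total = 0.0, tail = 0.0;
    for (std::size_t m = 0; m < k; ++m) {
        assert(positions[m] >= 1 && positions[m] <= n);
        // Number of ranks at or after this member's position.
        const double at_or_after = static_cast<double>(n - positions[m] + 1);
        weighted += weights[m] * at_or_after;
        total += weights[m];
        tail += at_or_after;
    }
    assert(total > 0.0);
    const double nd = static_cast<double>(n);
    const double in_sum = weighted / total;
    const double out_sum = (nd * (nd + 1.0) / 2.0 - tail) / static_cast<double>(n - k);
    return in_sum - out_sum;
}
```

With positions {2, 4}, weights {3, 1} and n = 5, both functions return 0.5 up to rounding. Whether this matches the package's statistic exactly would need checking against the R code.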
Some relevant versions.
Cheers, Bob