JEFworks / MUDAN

Multi-sample Unified Discriminant ANalysis
http://jef.works/MUDAN/
GNU General Public License v3.0

Sparse rdata #3

slowkow closed 6 years ago

slowkow commented 6 years ago

Save PBMC data matrices as dgCMatrix instead of matrix.

This reduces the data size in memory, so machines with less memory can still run the code without swapping to disk.

In my tests, this also reduces the time spent running devtools::check() by a small amount (about 15 seconds).

name             class      size in memory  size on disk
pbmcA            matrix             324 MB        2.4 MB
pbmcA            dgCMatrix           27 MB        2.3 MB
pbmcB            matrix             954 MB        5.8 MB
pbmcB            dgCMatrix           65 MB        5.4 MB
pbmcC            matrix            1230 MB        7.4 MB
pbmcC            dgCMatrix           84 MB        7.0 MB
referenceCounts  matrix             152 MB        2.3 MB
referenceCounts  dgCMatrix           31 MB        2.4 MB

pbmcA

> load("data/pbmcA.rda")
> class(pbmcA)
[1] "matrix"
> pryr::object_size(pbmcA)
324 MB
> pbmcA <- Matrix::Matrix(pbmcA)
> pryr::object_size(pbmcA)
27.4 MB

pbmcB

> load("data/pbmcB.rda")
> class(pbmcB)
[1] "matrix"
> pryr::object_size(pbmcB)
954 MB
> pbmcB <- Matrix::Matrix(pbmcB)
> pryr::object_size(pbmcB)
65.3 MB

pbmcC

> load("data/pbmcC.rda")
> class(pbmcC)
[1] "matrix"
> pryr::object_size(pbmcC)
1.23 GB
> pbmcC <- Matrix::Matrix(pbmcC)
> pryr::object_size(pbmcC)
83.9 MB

referenceCounts

> load("data/referenceCounts.rda")
> class(referenceCounts)
[1] "matrix"
> pryr::object_size(referenceCounts)
152 MB
> referenceCounts <- Matrix::Matrix(referenceCounts)
> pryr::object_size(referenceCounts)
30.6 MB
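
The transcripts above show only the conversion step, not how the converted objects are written back to disk. A minimal sketch of the full round trip follows, using a small synthetic matrix rather than the real PBMC data. The object names, the temporary file path, and the use of `object.size()` in place of `pryr::object_size()` are assumptions made for illustration, not the exact commands used for this pull request.

```r
library(Matrix)

# Hypothetical stand-in for a count matrix: mostly zeros, as in single-cell data
set.seed(1)
dense <- matrix(0, nrow = 1000, ncol = 200)
nonzero <- sample(length(dense), size = 5000)
dense[nonzero] <- rpois(5000, lambda = 3) + 1

# Force a sparse representation; for a numeric matrix this yields a dgCMatrix
sparse <- Matrix(dense, sparse = TRUE)
class(sparse)

# Compare in-memory footprints (base R alternative to pryr::object_size)
print(object.size(dense), units = "MB")
print(object.size(sparse), units = "MB")

# Round-trip through an .rda file in a temporary location (hypothetical path)
path <- file.path(tempdir(), "sparse_example.rda")
save(sparse, file = path)
rm(sparse)
load(path)

# The stored values are unchanged by the conversion
stopifnot(all(as.matrix(sparse) == dense))
```

Code that consumes the saved objects needs the Matrix package available, since base R functions that expect a plain matrix may require an explicit `as.matrix()` call on a dgCMatrix.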

codecov-io commented 6 years ago

Codecov Report

Merging #3 into master will not change coverage. The diff coverage is n/a.


@@         Coverage Diff          @@
##           master    #3   +/-   ##
====================================
  Coverage       0%    0%           
====================================
  Files           2     2           
  Lines         645   646    +1     
====================================
- Misses        645   646    +1

Impacted Files  Coverage Δ
R/mudan.R       0% <0%> (ø)
