Hi Hannah and the CICERO team,
Thanks for this great tool! We're using it to dig into our scATAC-seq data from mouse VISp.
This pull request updates make_cicero_cds() in a way that I think will speed up this step of the process, mostly by keeping things as numeric/matrix as much as possible to reduce type switching.
Below is timing for the original version (make_cicero_cds()) compared to this version (renamed faster_cicero_cds()) on my data, using chr18: about 16.2 min versus 2.8 min, roughly a 5.8x speedup.
Also below are the results from the runCicero tests, though I don't know the cds system super well, so I'm not 100% sure this doesn't break anything downstream. If there's something that this breaks, please let me know, and I'll modify and recommit to this branch.
Cheers,
-Lucas Graybuck
Simple benchmark
> # make cicero cds
> set.seed(2018)
> start_time <- Sys.time()
> cicero_cds <- make_cicero_cds(input_cds,
+ reduced_coordinates = tsne_coords,
+ k = 20)
Overlap QC metrics:
Cells per bin: 20
Maximum shared cells bin-bin: 17
Mean shared cells bin-bin: 0.181565206801541
Median shared cells bin-bin: 0
Removing 306 outliers
> end_time <- Sys.time()
> difftime(end_time, start_time)
Time difference of 16.1512 mins
>
> source("faster_cicero_cds.R")
> set.seed(2018)
> start_time <- Sys.time()
> cicero_cds <- faster_cicero_cds(input_cds,
+ reduced_coordinates = tsne_coords,
+ k = 20)
Overlap QC metrics:
Cells per bin: 20
Maximum shared cells bin-bin: 17
Mean shared cells bin-bin: 0.181565206801541
Median shared cells bin-bin: 0
Removing 306 outliers
> end_time <- Sys.time()
> difftime(end_time, start_time)
Time difference of 2.781479 mins
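To illustrate the general idea of staying in numeric/matrix form, here is a minimal toy sketch. It is not the code changed in this pull request: the data, the sizes, and the helper names agg_df() and agg_mat() are all hypothetical, made up for illustration. It aggregates a toy peaks-by-cells matrix into bins of k cells in two ways, once by round-tripping each bin through a data.frame and once with a single matrix product, and checks that both give the same result.

```r
# Illustration only: toy data and hypothetical helpers, not the PR's code.
set.seed(2018)
n_peaks <- 2000  # hypothetical number of peaks
n_cells <- 200   # hypothetical number of cells
n_bins  <- 50    # hypothetical number of bins
k       <- 20    # cells per bin, matching k in the benchmark above

# Toy binary accessibility matrix (peaks x cells)
counts <- matrix(rbinom(n_peaks * n_cells, 1, 0.1), nrow = n_peaks)

# Toy membership matrix (k x n_bins): column j holds the cell indices in bin j
members <- replicate(n_bins, sample(n_cells, k))

# Approach 1: loop over bins, converting each slice to a data.frame first
agg_df <- function(counts, members) {
  out <- lapply(seq_len(ncol(members)), function(j) {
    df <- as.data.frame(counts[, members[, j], drop = FALSE])
    rowSums(df)
  })
  do.call(cbind, out)
}

# Approach 2: stay numeric; build a cells x bins 0/1 mask and multiply once
agg_mat <- function(counts, members) {
  mask <- matrix(0, nrow = ncol(counts), ncol = ncol(members))
  idx <- cbind(as.vector(members),
               rep(seq_len(ncol(members)), each = nrow(members)))
  mask[idx] <- 1
  counts %*% mask
}

# Compare timings and confirm the two approaches agree
print(system.time(res_df  <- agg_df(counts, members)))
print(system.time(res_mat <- agg_mat(counts, members)))
stopifnot(isTRUE(all.equal(unname(res_df), unname(res_mat))))
```

Timings will vary by machine and data size, so none are quoted here; the point is only that both approaches return the same peaks-by-bins matrix while the second avoids repeated type conversion.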
Testing
> devtools::test(filter = "runCicero")
...Package loading messages omitted...
Testing cicero
√ | OK F W S | Context
/ | 28 | runCicero[1] "Successful cicero models: 283"
[1] "Other models: "
Zero or one element in range
30
[1] "Models with errors: 0"
\ | 46 | runCicero[1] "Coaccessibility cutoff used: 0.25"
√ | 78 | runCicero [92.7 s]
== Results =====================================================================
Duration: 92.8 s
OK: 78
Failed: 0
Warnings: 0
Skipped: 0