mvfki closed this issue 4 years ago
Thanks for the suggestion. I will work on incorporating it into the next release.
Hi @mvfki, this was implemented a while ago. Below is the updated behaviour of the feature you requested: you can now supply a batch_name argument instead of being forced to have a column called "batch".
If you are happy with it, please close the issue.
library(scMerge)
library(SingleCellExperiment)
#> Loading required package: SummarizedExperiment
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>
#> clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#> clusterExport, clusterMap, parApply, parCapply, parLapply,
#> parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#> dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#> grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#> order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#> rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#> union, unique, unsplit, which, which.max, which.min
#> Loading required package: S4Vectors
#> Warning: package 'S4Vectors' was built under R version 3.6.3
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:base':
#>
#> expand.grid
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Warning: package 'GenomeInfoDb' was built under R version 3.6.3
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: DelayedArray
#> Warning: package 'DelayedArray' was built under R version 3.6.3
#> Loading required package: matrixStats
#>
#> Attaching package: 'matrixStats'
#> The following objects are masked from 'package:Biobase':
#>
#> anyMissing, rowMedians
#> Loading required package: BiocParallel
#>
#> Attaching package: 'DelayedArray'
#> The following objects are masked from 'package:matrixStats':
#>
#> colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
#> The following objects are masked from 'package:base':
#>
#> aperm, apply, rowsum
## Loading example data
data('example_sce', package = 'scMerge')
## Previously computed stably expressed genes
data('segList_ensemblGeneID', package = 'scMerge')
## Running the example data with minimal inputs, storing the batch labels in a column named "data_name"
example_sce$data_name = example_sce$batch
colData(example_sce) = colData(example_sce)[,"data_name", drop = FALSE]
colData(example_sce)
#> DataFrame with 200 rows and 1 column
#>                         data_name
#>                          <factor>
#> ola_mES_a2i_2_48.counts    batch2
#> ola_mES_2i_2_75.counts     batch2
#> ola_mES_lif_2_68.counts    batch2
#> ola_mES_a2i_2_42.counts    batch2
#> ola_mES_2i_2_66.counts     batch2
#> ...                           ...
#> ola_mES_2i_3_17.counts     batch3
#> ola_mES_lif_3_27.counts    batch3
#> ola_mES_2i_3_21.counts     batch3
#> ola_mES_a2i_3_49.counts    batch3
#> ola_mES_2i_3_54.counts     batch3
sce_mESC <- scMerge(sce_combine = example_sce,
                    ctl = segList_ensemblGeneID$mouse$mouse_scSEG,
                    kmeansK = c(3, 3),
                    assay_name = 'scMerge',
                    batch_name = "data_name")
#> Dimension of the replicates mapping matrix:
#> [1] 200 3
#> Step 2: Performing RUV normalisation. This will take minutes to hours.
#> scMerge complete!
Created on 2020-05-30 by the reprex package (v0.3.0)
Cool. It looks great now. Thanks for working on it!
Hi,
This is definitely not a bug, just a small suggestion for a possible improvement.
In the function scMerge::scMerge(), at line 149, the code requires that a column called exactly "batch" be present. I can easily rename the column I want to use to "batch" before passing the object to your function. But in practice, if I write a wrapper function around your method, I don't think that is really okay, because the user would receive an object with unexpected modifications. And if I insist on doing so, I'd have to back up either the original SCE object or the original column name. Why not use double-bracket column selection with an extra argument that lets people specify the annotation name?
Sorry for bothering.
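To illustrate the suggestion, here is a minimal sketch of double-bracket column selection driven by an argument. It is not scMerge's actual implementation: the helper get_batch() is hypothetical, and a plain data.frame stands in for colData(sce) so the example runs without Bioconductor.

```r
# Hypothetical helper (not part of scMerge): fetch the batch annotation
# by a user-supplied column name instead of a hard-coded "batch".
get_batch <- function(col_data, batch_name = "batch") {
  if (!batch_name %in% colnames(col_data)) {
    stop("Column '", batch_name, "' not found in the cell annotation.")
  }
  # Double-bracket selection returns the column itself (here, a factor)
  col_data[[batch_name]]
}

# Toy annotation table standing in for colData(sce); values are made up
cell_info <- data.frame(
  data_name = factor(c("batch2", "batch2", "batch3")),
  row.names = c("cell1", "cell2", "cell3")
)

# The caller's object is left untouched; only the lookup name changes
batches <- get_batch(cell_info, batch_name = "data_name")
```

With this pattern a wrapper can pass the user's column name straight through, so no renaming or backup of the original object is needed.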