Open Biomiha opened 1 year ago
You are very right. I thought about it many times. We can add it to the tidyomics challenges. Are you aware of the page?
I was not before but I am now :) Thanks!
Great, it is here https://github.com/orgs/tidyomics/projects/1/views/1
Let's see if someone wants to commit to that. If you like the challenge and want to be part of our almost-ready-to-submit paper on tidyomics, feel free to propose yourself!
Sure, I'd be happy to contribute.
Amazing. I think the two key aspects are the header of the tibble representation and join_features. Maybe some care should be given to how the front end would look when there are assays with repeated names.
As a user, what do you feel is missing? What tidy operations would you like to do with the alternative Experiment?
I think for starters there needs to be at least a mention in the print method that the altExp is not empty. As it currently stands in tidySCE, you have no way of knowing and have to go specifically looking for it.
In terms of operations it depends on what technology was used for the altExp. If this was an antibody-tag experiment (as is most often the case) as far as I am concerned once the counts have been denoised and normalised the same operations can be used as for the main exp slot. If scaling were not an issue they could arguably even be used as additional features in the standard Experiment slot. Most people use them to plot (UMAP, ridgeplots, etc...) or to subcluster and refine existing clusters.
I can't say I know enough about scATAC-seq and other platforms to judge but they seem pretty different so having a separate slot is advantageous. The main benefit of the sce object class in that regard is that you can subset individual or groups of cells and keep the underlying structures intact.
OK, let's start by adding altexp:assay_name to the header. What if there are multiple altExps? We could do altexp[[1]]:assay_name. Do altExps usually have names?
Yes, the altExp slot will have a name if it is populated, and the structure within the altExp slot is the same as a normal SCE object (which in fairness is a bit confusing at times).
Using the example from the OSCA book: (http://bioconductor.org/books/3.14/OSCA.advanced/integrating-with-protein-abundance.html), this is what the standard output looks like:
library(DropletTestFiles)
path <- getTestFile("tenx-3.0.0-pbmc_10k_protein_v3/1.0.0/filtered.tar.gz")
dir <- tempfile()
untar(path, exdir=dir)
# Loading it in as a SingleCellExperiment object.
library(DropletUtils)
sce <- read10xCounts(file.path(dir, "filtered_feature_bc_matrix"))
sce
>> class: SingleCellExperiment
>> dim: 33538 7865
>> metadata(1): Samples
>> assays(1): counts
>> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
>> rowData names(3): ID Symbol Type
>> colnames: NULL
>> colData names(2): Sample Barcode
>> reducedDimNames(0):
>> mainExpName: Gene Expression
>> altExpNames(1): Antibody Capture
altExp(sce)
>> class: SingleCellExperiment
>> dim: 17 7865
>> metadata(1): Samples
>> assays(1): counts
>> rownames(17): CD3 CD4 ... IgG1 IgG2b
>> rowData names(3): ID Symbol Type
>> colnames: NULL
>> colData names(0):
>> reducedDimNames(0):
>> mainExpName: NULL
>> altExpNames(0):
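To make the "altexp:assay_name" header idea concrete, here is a minimal sketch of how such labels could be collected from a named altExp. The toy matrices and the helper `alt_assay_labels()` are hypothetical and only for illustration; this is not how tidySingleCellExperiment builds its header.

```r
library(SingleCellExperiment)

# Toy data standing in for the 10x example above (values are made up).
set.seed(1)
rna <- matrix(rpois(20, 5), nrow = 4, dimnames = list(paste0("gene", 1:4), NULL))
adt <- matrix(rpois(10, 50), nrow = 2, dimnames = list(c("CD3", "CD4"), NULL))

sce <- SingleCellExperiment(assays = list(counts = rna))
altExp(sce, "Antibody Capture") <- SingleCellExperiment(assays = list(counts = adt))

# Hypothetical helper: one "altExpName:assayName" label per assay
# found in each alternative experiment.
alt_assay_labels <- function(x) {
  unlist(lapply(altExpNames(x), function(nm) {
    paste(nm, assayNames(altExp(x, nm)), sep = ":")
  }))
}

alt_assay_labels(sce)
```

Using the altExp name rather than a positional index such as altexp[[1]] keeps the label readable when several alternative experiments are present.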
Cool. Of course, the philosophy of our interface is to be modular rather than recursive (such as an SCE inside an SCE).
Great, so if the header and join_features are the only additions, I think it is pretty straightforward. When we choose the assay to join the feature from, we look in both the regular and the alternative experiments.
After this, maybe we can think about multiple PCA, UMAP, etc.
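As a rough illustration of looking in both the regular and the alternative experiments, the sketch below reports which experiment holds each requested feature. The helper `find_features()`, the label "main" and the toy data are assumptions made for this example; the real join_features may resolve features quite differently.

```r
library(SingleCellExperiment)

# Toy object with one alternative experiment (values are made up).
set.seed(1)
rna <- matrix(rpois(20, 5), nrow = 4, dimnames = list(paste0("gene", 1:4), NULL))
adt <- matrix(rpois(10, 50), nrow = 2, dimnames = list(c("CD3", "CD4"), NULL))
sce <- SingleCellExperiment(assays = list(counts = rna))
altExp(sce, "Antibody Capture") <- SingleCellExperiment(assays = list(counts = adt))

# Hypothetical helper: search the main experiment first, then every altExp,
# and record where each requested feature was found.
find_features <- function(x, features) {
  experiments <- list(main = x)
  for (nm in altExpNames(x)) experiments[[nm]] <- altExp(x, nm)
  hits <- lapply(names(experiments), function(nm) {
    found <- intersect(features, rownames(experiments[[nm]]))
    data.frame(feature = found,
               experiment = rep(nm, length(found)),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

res <- find_features(sce, c("gene2", "CD4"))
res
```

A feature name that occurs in more than one experiment would appear once per experiment here, which is one place where the front end needs a decision.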
@Biomiha please add your authorship details here https://docs.google.com/spreadsheets/d/19XqhN3xAMekCJ-esAolzoWT6fttruSEermjIsrOFcoo/edit?usp=sharing
I suppose I should actually contribute first, no :)?
Yes! As soon as you manage to open a PR, feel free to add yourself.
Hi Stefano,
Quick question, if I may: I am new to pillar and have found the print method and the utilities files in the repo, but can't seem to find where setup lives to change tbl_sum. Am I just being thick?
Thanks, Miha
Did you try to look for "setup" in all .R files?
I have to say that I am not an expert in pillar either, and I mostly reverse-engineered it. pillar has improved recently, so we can use a lot of low-level functions directly.
Were you able to orient yourself in the print method, where I modify the header of the "tibble"?
I've looked a bit yes but again I am new to modifying printing methods for tibbles so could very well be looking in the wrong place.
I have been able to find the tbl_format_header.tidySingleCellExperiment and the print.SingleCellExperiment functions. I am able to tweak them, but it seems I can only change the values and not the names, e.g. I can change the number of rows that are printed for the Features specification but not the word Features. I've tried changing it to Creatures but no joy :).
As far as I can tell from the very nice and detailed description on the pillar website (https://pillar.r-lib.org/articles/printing.html), I would need to change tbl_sum, which lives in tbl_format_setup.
I'll do some more digging when I get a bit of time.
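For reference, a minimal generic pillar sketch of how header labels can be set: tbl_sum() returns a named character vector whose names become the labels and whose values become the text after them. The class `demo_tbl` and the function `tbl_sum_demo()` are hypothetical, and this is not the tidySingleCellExperiment code, only an illustration of the mechanism.

```r
library(tibble)

# Hypothetical tibble subclass used only for this illustration.
x <- new_tibble(list(a = 1:3), nrow = 3L, class = "demo_tbl")

# Names of the returned vector are the header labels; values are the text.
tbl_sum_demo <- function(x, ...) {
  c("A demo tibble" = pillar::dim_desc(x), "Creatures" = "3")
}

# Register the method explicitly so dispatch works outside a package.
registerS3method("tbl_sum", "demo_tbl", tbl_sum_demo,
                 envir = asNamespace("pillar"))

s <- pillar::tbl_sum(x)
s
print(x)
```

Inside a package, the method would normally be registered through the NAMESPACE file rather than with registerS3method().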
The fact that I could add "feature" means that you can change that :)
Yes, that was the one I was looking at. I think I have been able to figure it out. Should submit a PR in the next couple of days.
Hi all,
I am really liking your package; it has made many operations infinitely easier. We tend to prefer tidySingleCellExperiment to Seurat. However, the one thing we have noticed is that there is no functionality to access the altExp slot, where our CITE-seq data are stored. In the standard SingleCellExperiment print method the altExpNames are listed at the bottom, whereas this is completely hidden in the tibble abstraction. Could this possibly be added? Many thanks in advance.