Raselel closed this issue 1 year ago
Hey @Raselel, thanks for reporting that.
chr1:9808-10702?
Thanks for a quick reply. chr1:9808-10702 has total_counts 180.0.
Thanks! Seaborn 0.11.1 seems to work for the PBMC10k and a few other datasets.
I also tested the versions mentioned above in a clean environment (and pysam=0.18.0), and the following still seems to work:
import muon as mu
from muon import atac as ac
atac = mu.read("brain3k_processed.h5mu/atac")
ac.pl.fragment_histogram(atac, region='chr1:9808-10702')
Also, clarifying a few things about the code above might help to understand the issue better.
Is atac above actually a MuData or an AnnData object? While the processing makes sense for an AnnData object, the reading function probably spits out a MuData?
mu.read(path + "output/file_postrnatutorial.h5mu")
And then, if it's an AnnData with e.g. peak counts and chr1:9808-10702 is in .var, there are also probably no var_names in it starting with MT-?
atac.var['mt'] = atac.var_names.str.startswith('MT-')
I think you are right about the object: atac.var does not have genes in var_names.
This is the structure of the object, if it helps.
Should I read the file differently?
Thanks!
Hey @Raselel, to get back to this, I wasn't able to reproduce it on several scATAC-seq / multiome datasets. Was this issue resolved in any way? If not, could we figure out a way to share a portion of this dataset with us so that we can debug it?
I had this same bug when I had an ATAC anndata object where I removed the "-1" from the end of the barcodes within adata.obs.index (i.e. barcode name AAACAGCCAACTAACT-1 changed to AAACAGCCAACTAACT). Adding back in the "-1" to the names in adata.obs.index fixed the problem.
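For anyone hitting the same thing, here is a minimal sketch of putting the suffix back. It uses a plain Python list as a stand-in for adata.obs.index, the helper name restore_suffix is made up for illustration, and the second barcode is invented toy data. With a real AnnData object you would presumably assign the result back to adata.obs_names.

```python
def restore_suffix(barcodes, suffix="-1"):
    """Append the suffix to barcodes that lack it; leave the others untouched."""
    return [bc if bc.endswith(suffix) else bc + suffix for bc in barcodes]


# Stand-in for list(adata.obs.index) after the "-1" was (partly) stripped.
# The second barcode is made-up toy data that still carries its suffix.
stripped = ["AAACAGCCAACTAACT", "AAACAGCCAAGGAATC-1"]
restored = restore_suffix(stripped)
print(restored)
```

The check with endswith keeps the helper safe to run twice, so barcodes that still carry the suffix do not end up as "...-1-1".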
Thanks, @connersk, we'll try to reproduce that and fix it!
Indeed, the reason seems to have been that the cell identifiers were renamed. The fragments file contains cell identifiers as well, and together with the ones in the in-memory object, those are used to link fragment -> cell information.
The solution for this is already implemented, however: functions such as ac.pl.fragment_histogram and ac.tl.nucleosome_signal have the barcodes parameter. Generally, it makes sense to keep the original barcode in the metadata, and with barcodes="orig_barcode" it can be used for fragments-related functionality.
This will be available for all the functions operating on fragments in the next versions.
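To check whether renamed identifiers are the cause in a given dataset, one can compare the barcodes in the fragments file with the ones in the object. Below is a rough sketch, assuming the usual 10x layout of the fragments file (tab-separated, barcode in the fourth column, optional header lines starting with #). The function name and the toy data are made up for illustration, and this is not meant to mirror what muon does internally.

```python
import io


def fragment_barcodes(handle):
    """Collect the set of barcodes found in the 4th column of a fragments file."""
    barcodes = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        barcodes.add(fields[3])
    return barcodes


# Toy in-memory fragments data. For a real (gzipped) file, something like
# gzip.open(path, "rt") could be passed instead, with the path taken from
# atac.uns['files'].
toy = io.StringIO(
    "# id=toy\n"
    "chr1\t9808\t10002\tAAACAGCCAACTAACT-1\t2\n"
    "chr1\t10050\t10702\tAAACAGCCAACTAACT-1\t1\n"
)
in_file = fragment_barcodes(toy)

# Stand-in for list(atac.obs_names) after the "-1" suffix was stripped
obs_names = ["AAACAGCCAACTAACT"]
shared = in_file & set(obs_names)
print(f"{len(shared)} of {len(obs_names)} barcodes found in the fragments file")
```

If the overlap is zero or very small, the identifiers were most likely renamed, and either restoring them or passing the column with the original barcodes via the barcodes parameter should help.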
Hello, thanks for muon!
I was trying to reproduce the steps described in the ATAC tutorial (https://muon-tutorials.readthedocs.io/en/latest/single-cell-rna-atac/pbmc10k/2-Chromatin-Accessibility-Processing.html) on my independent dataset, but I get an error from the ac.pl.fragment_histogram function.
The code:
The error is:
The correct fragment file is stored in atac.uns['files'] and I've installed the relevant dependencies with pip install 'muon[atac]'. Any idea what the issue might be?
System
Thanks! Rasa