Open micahtratt opened 3 months ago
Hi there! Thanks for your interest in using our package. A few things I noticed that I'm wondering if you could look into prior to re-testing:
assay(ATAC.se)[1:5,1:5]
assay(ATAC.se,'counts')[1:5,1:5]
granges(ATAC.se)
You want to make sure these are raw peak x cell counts, and nothing else. Can you show me the original call to the main function?
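One quick way to sanity-check that a matrix holds raw counts is to confirm every stored value is a non-negative whole number. The sketch below uses only the Matrix package; `looks_like_raw_counts` is a hypothetical helper (not part of FigR), and the toy matrix stands in for `assay(ATAC.se)`.

```r
library(Matrix)

# Hypothetical helper (not part of FigR): TRUE when every value is a finite,
# non-negative whole number, which is what raw peak x cell counts look like.
looks_like_raw_counts <- function(m) {
  # For a dgCMatrix, the @x slot holds only the non-zero entries, so this is cheap.
  vals <- if (inherits(m, "dgCMatrix")) m@x else as.vector(as.matrix(m))
  all(is.finite(vals)) && all(vals >= 0) && all(vals == round(vals))
}

# Toy 3 peaks x 4 cells sparse matrix standing in for assay(ATAC.se)
raw <- sparseMatrix(i = c(1, 2, 3, 3), j = c(1, 2, 3, 4),
                    x = c(2, 1, 5, 3), dims = c(3, 4))

# Shifting the values produces negatives and fractions, so these are not counts
shifted <- as.matrix(raw) - 0.5

looks_like_raw_counts(raw)      # raw counts pass the check
looks_like_raw_counts(shifted)  # shifted values fail it
```

On real data the same helper could be called on `assay(ATAC.se)` directly; a failing check would suggest the assay was normalized or scaled somewhere upstream.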
Your RNA matrix has negative values, which suggests it has been scaled somehow. We expect normalized expression levels as input, not scaled values (this won't really affect the correlation computation, but I want to make sure nothing strange is happening under the hood because of it).
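To illustrate why scaling should not change the correlation itself, here is a small base R sketch on simulated data (the vectors `expr` and `peak` are made up for illustration, not taken from this dataset). It assumes the scaling is a per-gene z-score across the same cells used for the correlation, which is a positive affine transformation and so leaves both Pearson and Spearman correlations unchanged.

```r
set.seed(1)

# Toy data: one gene's normalized expression and one peak's counts in 50 cells
expr <- rexp(50, rate = 0.5)          # non-negative, like normalized expression
peak <- rpois(50, lambda = 2 + expr)  # counts loosely tied to expression

# Z-scoring (center, then divide by the SD) introduces negative values
expr_z <- as.numeric(scale(expr))
any(expr < 0)    # the normalized values are all non-negative
any(expr_z < 0)  # the z-scored values include negatives

# The correlation with the peak is the same before and after z-scoring
all.equal(cor(expr, peak), cor(expr_z, peak))
all.equal(cor(expr, peak, method = "spearman"),
          cor(expr_z, peak, method = "spearman"))
```

So negative values are mainly a hint about how the matrix was prepared, rather than something that would change the gene-peak correlations on its own.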
Which genome build are you specifying here?
This is a strange error given the data size; it shouldn't really have to do with memory (thank you for testing appropriately via downsampling etc. I'd also say it's worth testing on a single core, just in case something strange is happening with the parallelization that might be harder to debug here). FigR does not save any abnormally large output at this step either.
Let's see if we can debug this based on the above output?
Hi,
Thanks for the reply @vkartha! I adjusted the RNA and ATAC to be raw counts (I believe they were scaled by z-score previously). I am still getting the same memory/core-related error and have tried running with a single core to rule out parallelization.
Any information or suggestions you have is appreciated!
`cisCorr <- FigR::runGenePeakcorr(ATAC.se = ATAC.se,
RNAmat = RNAmat,
genome = "hg38",
nCores = nCores,
p.cut = NULL,
n_bg = 1)`
`> assay(ATAC.se)[1:5,1:5]
5 x 5 sparse Matrix of class "dgCMatrix"
AAACCAGGTGCGCAAGACAGACCT-1 AAACCGGTCCAGAACGACAGACCT-1 AAACGTTCATTTCTTCACAGACCT-1
chr1-180734-181683 . . .
chr1-183935-184770 . . .
chr1-191052-191934 . . .
chr1-629412-630393 . . .
chr1-631285-632180 . . .
AAAGATGCAATGGGAGACAGACCT-1 AAAGCATGTAACCGCCACAGACCT-1
chr1-180734-181683 . .
chr1-183935-184770 . .
chr1-191052-191934 . .
chr1-629412-630393 . .
chr1-631285-632180 . .`
`> assay(ATAC.se,'counts')[1:5,1:5]
AAACCAGGTGCGCAAGACAGACCT-1 AAACCGGTCCAGAACGACAGACCT-1 AAACGTTCATTTCTTCACAGACCT-1
chr1-180734-181683 0 0 0
chr1-183935-184770 0 0 0
chr1-191052-191934 0 0 0
chr1-629412-630393 0 0 0
chr1-631285-632180 0 0 0
AAAGATGCAATGGGAGACAGACCT-1 AAAGCATGTAACCGCCACAGACCT-1
chr1-180734-181683 0 0
chr1-183935-184770 0 0
chr1-191052-191934 0 0
chr1-629412-630393 0 0
chr1-631285-632180 0 0`
`> granges(ATAC.se)
GRanges object with 93434 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
chr1-180734-181683 chr1 180734-181683 *
chr1-183935-184770 chr1 183935-184770 *
chr1-191052-191934 chr1 191052-191934 *
chr1-629412-630393 chr1 629412-630393 *
chr1-631285-632180 chr1 631285-632180 *
... ... ... ...
chrY-56868535-56869375 chrY 56868535-56869375 *
chrY-56869529-56870353 chrY 56869529-56870353 *
chrY-56870707-56871623 chrY 56870707-56871623 *
chrY-56873514-56874309 chrY 56873514-56874309 *
chrY-56879634-56880385 chrY 56879634-56880385 *
-------
seqinfo: 24 sequences from an unspecified genome; no seqlengths`
Hello, I am getting the following error when running the function.
Whether I run 250 or 3 background peaks, I get this error. I have tried running this in different environments with different numbers of features (filtering the ATAC peaks from 93,000 to 10,000 and RNA genes from 16,000 to 10,000) for the 1,800 cells. I don't believe the output should be large enough to cause this error. I have approximately 200 GB of memory and 19 cores available.
Is this an error that others are running into? Or is the output file truly larger than 200 GB?!
Thanks!
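For a rough sense of scale, the arithmetic below estimates how much memory the inputs described above would take if held as dense double-precision matrices. This is only a back-of-envelope sketch: `dense_gb` is a hypothetical helper, the gene count is the approximate figure quoted in the post, and the numbers say nothing about what `runGenePeakcorr` allocates internally.

```r
# Size of a dense double-precision matrix in GB (10^9 bytes), 8 bytes per value
dense_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1e9
}

n_cells <- 1800
atac_gb <- dense_gb(93434, n_cells)  # all peaks reported by granges(ATAC.se)
rna_gb  <- dense_gb(16000, n_cells)  # approximate gene count from the post

# Print both estimates, rounded to two decimals
round(c(ATAC = atac_gb, RNA = rna_gb), 2)
```

Both figures come out at well under 2 GB even when fully densified, which is consistent with the expectation that data of this size should not by itself exhaust roughly 200 GB of memory.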