Open rcorces opened 2 years ago
@wuv21 - I converted this to an Issue post.
I think the most important question is why do mean1 and mean2 get set to NA in the first place? Any chance you can parse that out? I'd rather try to figure out how to prevent the NAs from getting generated than work around their existence.
Thanks @rcorces for moving it into issues and for your reply. I dug deeper and found that this particular motif had no matches in my dataset. I did the following:
# load in the matches rds from getPeakAnnotation()
tmp <- readRDS(getPeakAnnotation(proj, "Motif")$Matches)
# the line below returns FALSE; I tried columns corresponding to other motifs and saw TRUE values there
any(tmp@assays@data@listData[["matches"]][, 156])
# double checked to make sure that there were no positions either in motif 156
# other motifs had elements in the GRanges objects
tmp <- readRDS(getPeakAnnotation(proj, "Motif")$Positions)
tmp[[156]]
GRanges object with 0 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
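For anyone who wants to check every motif at once rather than one column at a time, here is a minimal sketch. It uses a small made-up logical matrix (peaks by motifs) as a stand-in for the real "matches" assay, so the peak names, motif names, and values are all hypothetical. On a real project the same `colSums()` idea should apply to the matrix pulled out above, but I have not verified that here.

```r
# Toy stand-in for the motif match matrix (rows = peaks, columns = motifs).
# All names and values here are hypothetical, for illustration only.
matches <- matrix(
  c(TRUE,  FALSE, FALSE,
    FALSE, FALSE, TRUE,
    TRUE,  FALSE, TRUE),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("peak", 1:3), c("motifA", "motifB", "motifC"))
)

# Count how many peaks each motif matches (TRUE counts as 1).
n_matches <- colSums(matches)

# Motifs with zero matching peaks are the ones expected to produce NA rows downstream.
empty_motifs <- names(n_matches)[n_matches == 0]
print(empty_motifs)
```

Any motif reported here would be a candidate for the same NA behavior described in this issue.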
As such, .getPartialMatrix() returns a matrix that has NA values associated with that particular motif when it is called in .testMarkersSC().
I think this might just be a rare case of this motif not being visible in this current project/biological context (possibly due to the lower cell counts for this current analysis). This error doesn't occur for me with other projects but I do notice that this motif has the smallest number of peaks associated with it compared to all the other motifs.
I have temporarily patched this on a new branch (dev_idxFilter) via https://github.com/GreenleafLab/ArchR/commit/f091e426bbe6f78b149ded4f9b49e86ad4f5640f but I think the better fix is to remove these offending rows from the motif matrix entirely. Will try to address this more thoroughly soon.
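To illustrate what "removing the offending rows" could look like, here is a rough sketch in base R. This is not the code from the linked commit and not ArchR's internal implementation. It is a hypothetical example on a toy motif-by-cell matrix in which one motif's row is entirely NA, and the matrix contents and names are made up.

```r
# Toy motif-by-cell matrix; "motifB" has no matches, so its row is all NA.
# Values and names are hypothetical.
mat <- matrix(
  c(0.5, -1.2,  0.3,
    NA,   NA,   NA,
    1.1,  0.0, -0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("motifA", "motifB", "motifC"), paste0("cell", 1:3))
)

# Keep only rows that contain at least one non-NA value.
keep <- rowSums(!is.na(mat)) > 0

# drop = FALSE preserves the matrix shape even if a single row remains.
mat_filtered <- mat[keep, , drop = FALSE]
print(rownames(mat_filtered))
```

Filtering on "all NA" rather than "any NA" is a deliberate choice in this sketch: it only drops motifs with no usable values and leaves partially observed rows alone.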
Discussed in https://github.com/GreenleafLab/ArchR/discussions/1319