GreenleafLab / ArchR

ArchR : Analysis of Regulatory Chromatin in R (www.ArchRProject.com)
MIT License

How to extract motif regions in whole genome or accessible regions in each cell/cluster? #334

Closed seishin0125 closed 4 years ago

seishin0125 commented 4 years ago

I performed the motif enrichment analysis and the motif deviation analysis following the code in Chapters 12 and 13. The motifs whose accessibility was enriched in each cluster were determined successfully. However, I couldn't extract the "accessible" motif regions in each cell/cluster from the ArchR project files.

This information would be very helpful for inferring the downstream targets of the motifs, so could you explain how to extract the "accessible" motif regions?

※ I suspected that this information might be contained in the rds files (Motif-In-Peaks-Summary.rds, Motif-Matches-In-Peaks.rds, Motif-Positions-In-Peaks.rds) created in the Saved_ArchR_project/Annotations directory. However, after loading these rds files and inspecting them with str(), I could not find any lists or matrices that correspond to this information.

jgranja24 commented 4 years ago

You can identify the differential peaks per cluster and then subset those peaks from the motif matches. Try getMatches() to get a peak x motif occurrence matrix, and then subset it either with GRanges manipulations or by pasting the ranges into strings, i.e. paste0(seqnames(gr),"_",start(gr),"_",end(gr)), and doing a string intersection.
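The string-intersection route described above can be sketched in base R. This is only an illustration on made-up data, not code from the ArchR documentation: the data frames `peaks` and `cluster_peaks`, the matrix `motif_matches`, the helper `range_key()`, and the motif names are all hypothetical stand-ins. In a real project the matrix and its peak coordinates would presumably be taken from the object returned by getMatches(), and the cluster peaks from your marker peak results.

```r
# Toy stand-in for the peak set behind a peak x motif matches matrix.
# (Assumption: in a real project this would come from getMatches().)
peaks <- data.frame(
  seqnames = c("chr1", "chr1", "chr2", "chr3"),
  start    = c(100L, 500L, 200L, 900L),
  end      = c(600L, 1000L, 700L, 1400L),
  stringsAsFactors = FALSE
)

# Toy peak x motif occurrence matrix (TRUE = motif found in that peak).
# The motif names are hypothetical.
motif_matches <- matrix(
  c(TRUE,  FALSE,
    FALSE, TRUE,
    TRUE,  TRUE,
    FALSE, FALSE),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("MOTIF_A", "MOTIF_B"))
)

# Toy differential (marker) peaks for one cluster; chr5 is not in the peak set.
cluster_peaks <- data.frame(
  seqnames = c("chr1", "chr2", "chr5"),
  start    = c(100L, 200L, 10L),
  end      = c(600L, 700L, 510L),
  stringsAsFactors = FALSE
)

# Build "chr_start_end" keys; integer coordinates avoid scientific notation.
range_key <- function(df) paste0(df$seqnames, "_", df$start, "_", df$end)

# Label the matrix rows with their peak keys, then intersect the strings.
rownames(motif_matches) <- range_key(peaks)
shared <- intersect(range_key(cluster_peaks), rownames(motif_matches))

# Peak x motif matrix restricted to this cluster's peaks.
cluster_matches <- motif_matches[shared, , drop = FALSE]

# For each motif, list the cluster peaks that contain it.
motif_regions <- sapply(
  colnames(cluster_matches),
  function(m) rownames(cluster_matches)[cluster_matches[, m]],
  simplify = FALSE
)
print(motif_regions)
```

String keys only find peaks whose coordinates are identical in both sets. If the two peak sets could differ in their boundaries, an overlap-based GRanges approach would be the safer choice.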

jgranja24 commented 4 years ago

I am closing this issue, feel free to re-open if you need further assistance.

NinaJiangL commented 3 years ago

Hi, I have the same issue. Could you explain a bit more clearly? Once I have the enriched motifs from the marker peaks, I can plot the heatmap and the ggplot showing the most enriched motifs across samples and clusters, but I am not sure how to pull out the list of these specific motifs with their corresponding sequences (logos) and locations. (I want to get motif prediction results like those from Homer2.) Also, I am confused about the motif labels, e.g. CEBPA_185: what does the 185 stand for?

[Image: Screen Shot 2021-02-25 at 10 22 56 PM]

HaixJiang commented 4 months ago

@seishin0125 Hi, I want to screen the motif enrichment of peaks in a certain genomic region. How can I carry out the downstream peakAnnoEnrichment after I subset markersPeaks? Or should I use markersPeaks directly for peakAnnoEnrichment before subsetting?