quadbio / Pando

Multiome GRN inference.
https://quadbio.github.io/Pando/
MIT License

regions parameter in initiate_grn() #62

Open sylestiel opened 2 months ago

sylestiel commented 2 months ago

Hi @joschif,

The vignette says that we can constrain the set of candidate regions by providing a GenomicRanges object in the regions argument. Does the Pando package come with phastConsElements20Mammals.UCSC.mm10 or SCREEN.ccRE.UCSC.mm10 for mouse datasets, or do they need to be downloaded separately? What is the best format to download them in? BED files?

Is there an added advantage to constraining the regions when running initiate_grn()?

Thanks!

joschif commented 2 months ago

Hi @sylestiel, we only curated regions for human, so the mouse equivalents are not included. If you download them as BED files, it is relatively easy to convert them to GRanges, but any format that can be converted to GRanges is fine.
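A minimal sketch of the BED-to-GRanges conversion mentioned above, assuming a plain three-column BED file. The toy file contents and the variable names (`bed_path`, `regions`, `seurat_object`) are made up for illustration, and the commented-out `initiate_grn()` call assumes a Seurat object with RNA and peak assays already exists.

```r
library(GenomicRanges)

# Write a tiny three-column BED file so the example is self-contained.
# In practice bed_path would point at the downloaded mouse file.
bed_path <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t200", "chr1\t500\t650", "chr2\t0\t50"), bed_path)

# BED columns: chromosome, 0-based start, end (exclusive).
bed <- read.table(bed_path, sep = "\t", header = FALSE,
                  col.names = c("chrom", "start", "end"),
                  stringsAsFactors = FALSE)

# starts.in.df.are.0based = TRUE shifts the BED starts by one,
# because GRanges uses 1-based, closed intervals.
regions <- makeGRangesFromDataFrame(bed,
                                    seqnames.field = "chrom",
                                    start.field = "start",
                                    end.field = "end",
                                    starts.in.df.are.0based = TRUE)

# Assumed usage, not run here:
# grn <- initiate_grn(seurat_object, regions = regions)
```

Files with extra columns (name, score, strand) can be read the same way by extending `col.names`, or imported with a dedicated reader such as `rtracklayer::import()`.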

As for why to constrain the regions, we found that ATAC-seq data typically has a ton of spurious peaks that likely are not active regulatory regions. Constraining these regions to a reasonable number helps to fit models faster and more robustly. This can be done using sets of candidate regions from prior knowledge, but also through detection rates and / or correlation to gene expression.
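As a rough illustration of the detection-rate option, the sketch below keeps only peaks that are detected in a minimum fraction of cells. The toy count matrix, the peak names, and the 0.5 cutoff are all assumptions chosen for the example, not values recommended by Pando.

```r
library(Matrix)

# Toy peak-by-cell count matrix (4 peaks, 5 cells); values are made up.
counts <- Matrix(c(
  1, 0, 2, 1, 0,
  0, 0, 0, 1, 0,
  3, 1, 1, 0, 2,
  0, 0, 0, 0, 0
), nrow = 4, byrow = TRUE, sparse = TRUE,
dimnames = list(
  c("chr1-101-200", "chr1-501-650", "chr2-1-50", "chr2-900-950"),
  paste0("cell", 1:5)
))

# Fraction of cells in which each peak has at least one count.
detection_rate <- rowMeans(counts > 0)

# Assumed cutoff: keep peaks detected in at least half of the cells.
min_rate <- 0.5
kept_peaks <- rownames(counts)[detection_rate >= min_rate]
```

The surviving peak names would still need to be converted to GRanges before they could be passed as region constraints.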

Hope this helps!

sylestiel commented 2 months ago

Hi @joschif, can you provide how-to guidelines on formatting the SCREEN.ccRE.UCSC.mm10 bed file for Pando? And likewise for mm10.60way.phastCons.bw?

PauBadiaM commented 2 months ago

+1 on how to process the evoconv regions for mouse. It would be very helpful to have them available inside Pando like the human ones, @joschif.

joschif commented 2 months ago

Hi all, you can download the phastCons elements for mouse here: https://genome.ucsc.edu/cgi-bin/hgTables ("35 Vert El."). If you read the downloaded bed file into R and convert it to GRanges, Pando should accept it as region constraints (provided the chromosomes are annotated consistently).
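One way to check the chromosome-annotation caveat is to compare sequence names between the conserved elements and the ATAC peaks before passing anything to Pando. This is a sketch with hypothetical toy ranges: `conserved` stands in for the downloaded elements and `peaks` for the peak ranges of the dataset.

```r
library(GenomicRanges)
library(GenomeInfoDb)

# Hypothetical conserved elements with Ensembl-style names ("1", "2").
conserved <- GRanges(c("1", "2", "2"),
                     IRanges(start = c(101, 1, 900), end = c(200, 50, 950)))

# Hypothetical ATAC peaks with UCSC-style names ("chr1", "chr2").
peaks <- GRanges(c("chr1", "chr2"),
                 IRanges(start = c(150, 2000), end = c(400, 2500)))

# With mismatched naming styles the two objects share no chromosomes,
# so any overlap between them would come back empty.
n_shared_before <- length(intersect(seqlevels(conserved), seqlevels(peaks)))

# Rename the conserved elements to the UCSC style used by the peaks.
seqlevelsStyle(conserved) <- "UCSC"
n_shared_after <- length(intersect(seqlevels(conserved), seqlevels(peaks)))

# Conserved elements that overlap at least one peak.
n_hits <- length(suppressWarnings(subsetByOverlaps(conserved, peaks)))
```

If the number of shared chromosome names is zero, the region constraints cannot match any peak, so this check is worth running before model fitting.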