nsheff / LOLA

Locus Overlap Analysis: Enrichment of Genomic Ranges
http://code.databio.org/LOLA
70 stars 19 forks source link

question: how to test two region sets for differential enrichment with buildRestrictedUniverse? #39

Open AdrijaK opened 3 years ago

AdrijaK commented 3 years ago

Hi there, Thanks for the useful package, it makes enrichment calculations a breeze.

I am currently trying to find whether two region sets are differentially enriched in certain features. Could you upload an example on how to compare enrichment for testing two region sets for differential enrichment against a reference database?

From the publication:

Furthermore, the buildRestrictedUniverse() function automatically builds a universe based on query sets and can be used to test two region sets for differential enrichment against a reference database.

The only example I find in documentation is from ?redefineUserSets:

dbPath = system.file("extdata", "hg19", package="LOLA")
regionDB = loadRegionDB(dbLocation=dbPath)
data("sample_universe", package="LOLA")
data("sample_input", package="LOLA")

getRegionSet(regionDB, collections="ucsc_example", filenames="vistaEnhancers.bed")
getRegionSet(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
getRegionFile(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")

res = runLOLA(userSets, userUniverse, regionDB, cores=1)
locResult = res[2,]
extractEnrichmentOverlaps(locResult, userSets, regionDB)
writeCombinedEnrichment(locResult, "temp_outfolder")

userSetsRedefined = redefineUserSets(userSets, userUniverse)
resRedefined = runLOLA(userSetsRedefined, userUniverse, regionDB, cores=1)

g = plotTopLOLAEnrichments(resRedefined)

However, in this example, despite userSets containing setA and setB, I cant't find any statistics showing whether two sets are differentially enriched in certain features. What is suggested way to approach this? Thanks!

nsheff commented 3 years ago

Hi, thanks for the interest.

Try this:

https://github.com/nsheff/LOLA/blob/8262ff58d54ec4e335f71f6d4ce1d3a796a5bfce/R/LOLA.R#L163-L178

then just proceed as normal with that restricted universe as the universe you provide to LOLA.

AdrijaK commented 3 years ago

Thanks,

Is this the correct way to use it?

# get the example data:
dbPath = system.file("extdata", "hg19", package="LOLA")
regionDB = loadRegionDB(dbLocation=dbPath)
data("sample_universe", package="LOLA")
data("sample_input", package="LOLA")
getRegionSet(regionDB, collections="ucsc_example", filenames="vistaEnhancers.bed")
getRegionSet(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
getRegionFile(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")

# convert userSets to GRangesList
userSets = as(userSets, "GRangesList")

# redefine universe for enabling differential enrichment calculations between two test sets stored in userSets
universeRedefined = buildRestrictedUniverse(userSets)

# redefine userSets for a more appropriate statistical enrichment comparison
userSetsRedefined = redefineUserSets(userSets, universeRedefined )

# calculate enrichments with LOLA
resRedefined = runLOLA(userSetsRedefined, universeRedefined, regionDB, cores=1)

Calculating unit set overlaps... Calculating universe set overlaps... Calculating Fisher scores...

resRedefined:

|userSet | dbSet|collection   |   pValueLog| oddsRatio| support| rnkPV| rnkOR| rnkSup| maxRnk| meanRnk|    b|    c|    d|description                      |cellType |tissue |antibody |treatment |dataSource |filename                    |    qValue|  size|
|:-------|-----:|:------------|-----------:|---------:|-------:|-----:|-----:|------:|------:|-------:|----:|----:|----:|:--------------------------------|:--------|:------|:--------|:---------|:----------|:---------------------------|---------:|-----:|
|setA    |     2|ucsc_example | 100.1529890| 3.2929780|    1062|     1|     3|      2|      3|    2.00|  579| 2510| 4507|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |laminB1Lads.bed             | 0.0000000|  1302|
|setB    |     1|ucsc_example |  64.4426074| 6.5896984|    5766|     1|     1|      1|      1|    1.00| 2516|   97|  279|CpG islands from UCSC annotation |NA       |NA     |NA       |NA        |NA         |cpgIslandExt.bed            | 0.0000000| 28691|
|setA    |     4|ucsc_example |  19.0468632| 3.7602194|     151|     2|     1|      3|      3|    2.00|   59| 3421| 5027|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |vistaEnhancers.bed          | 0.0000000|  1339|
|setA    |     5|ucsc_example |  19.0468632| 3.7602194|     151|     2|     1|      3|      3|    2.00|   59| 3421| 5027|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |vistaEnhancers_colNames.bed | 0.0000000|  1340|
|setA    |     3|ucsc_example |   1.1644597| 2.8515119|       8|     4|     4|      5|      5|    4.33|    4| 3564| 5082|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |numtSAssembled.bed          | 0.1369526|    78|
|setB    |     3|ucsc_example |   0.0012998| 0.2378840|       4|     2|     5|      5|      5|    4.00|    8| 5859| 2787|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |numtSAssembled.bed          | 1.0000000|    78|
|setB    |     2|ucsc_example |   0.0000000| 0.4720576|     880|     3|     2|      2|      3|    2.33|  761| 4983| 2034|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |laminB1Lads.bed             | 1.0000000|  1302|
|setB    |     4|ucsc_example |   0.0000000| 0.2836386|      80|     3|     3|      3|      3|    3.00|  130| 5783| 2665|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |vistaEnhancers.bed          | 1.0000000|  1339|
|setB    |     5|ucsc_example |   0.0000000| 0.2836386|      80|     3|     3|      3|      3|    3.00|  130| 5783| 2665|ucsc_example                     |NA       |NA     |NA       |NA        |NA         |vistaEnhancers_colNames.bed | 1.0000000|  1340|
|setA    |     1|ucsc_example |   0.0000000| 0.1769031|    3276|     5|     5|      1|      5|    3.67| 5006|  296|   80|CpG islands from UCSC annotation |NA       |NA     |NA       |NA        |NA         |cpgIslandExt.bed            | 1.0000000| 28691|