resultRegions() and toBED() function

fulaibaowang commented 6 months ago

Hi,

I am running the vignette, and I would really appreciate it if you could explain a bit more about the output.

1. extractRegions: As written there, the extractRegions function combines overlapping significant windows. But in the result you still see overlapping regions, for example in the vignette:

##  4 chr1           28648620   28648730 +                      5           110
##  5 chr1           28648620   28648733 +                      5           113

So the real number of significant binding regions should be less than the total number of rows in the resultRegions table (218)?

2. toBED: If I do:

resultRegions <- extractRegions(windowRes  = resultWindows,
                            padjCol    = "p_adj_IHW",
                            padjThresh = 0.01, 
                            log2FoldChangeThresh = 0.5) %>% as_tibble

and

toBED(windowRes  = resultWindows,
      regionRes  = resultRegions,
      fileName   = "enrichedWindowsRegions.bed",
      padjCol    = "p_adj_IHW",
      padjThresh = 0.01,
      log2FoldChangeThresh = 0.5)

the output file "enrichedWindowsRegions.bed" has many more rows than the resultRegions table. Why?

Thank you!

sudeepsahadevan commented 6 months ago

Hi

1. extractRegions will only merge coordinates from the same gene. If a particular region has multiple gene annotations, the region will also occur multiple times.

2. toBED will include both the enriched regions and the windows corresponding to each region. If you open the resulting BED file, you can see that there are regions (with the tag @region in the name) and windows (without the tag). Hope this helps!
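Both points can be checked with a few lines of base R. This is only a minimal sketch on toy data: the column names (chromosome, region_begin, region_end, strand, gene_id) and the BED name strings are assumptions made for illustration, so adjust them to whatever your own resultRegions table and BED file actually contain.

```r
# Toy stand-in for the extractRegions() output.
# All column names here are assumptions, not confirmed DEWSeq names.
regions <- data.frame(
  chromosome   = c("chr1", "chr1", "chr1"),
  region_begin = c(28648620, 28648620, 28650000),
  region_end   = c(28648730, 28648730, 28650100),
  strand       = c("+", "+", "-"),
  gene_id      = c("geneA", "geneB", "geneC"),  # same interval, two genes
  stringsAsFactors = FALSE
)

# Point 1: count each genomic interval once, ignoring the gene annotation.
coordCols      <- c("chromosome", "region_begin", "region_end", "strand")
uniqueRegions  <- unique(regions[, coordCols])
nUniqueRegions <- nrow(uniqueRegions)

# Point 2: split BED entries into regions and windows using the @region tag.
# These name strings are hypothetical; only the "@region" tag comes from the thread.
bedNames    <- c("geneA@region1", "geneA_window1", "geneA_window2", "geneB@region1")
isRegion    <- grepl("@region", bedNames, fixed = TRUE)
nBedRegions <- sum(isRegion)
nBedWindows <- sum(!isRegion)
```

Note that unique() only collapses rows with identical coordinates. Intervals that overlap but differ in start or end (like rows 4 and 5 in the vignette output) would stay separate; merging those would need an interval operation such as GenomicRanges::reduce().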

fulaibaowang commented 6 months ago

super helpful! thanks!

fulaibaowang commented 6 months ago

I want to ask another question here :)

So the family-wise corrected windows are corrected for multiple testing with Benjamini-Hochberg in resultsDEWSeq.

And in the vignette you show afterwards that the IHW package can be used again to correct for multiple hypothesis testing.

Can you say a bit more about these two multiple testing corrections and the difference between them? Is applying both a bit too stringent? I am running some of my own data and got very few significant hits.

Thank you!

sudeepsahadevan commented 6 months ago

Hi, sorry if that was not clear: it is either BH correction or correction using IHW, but not both. IHW is a data-driven alternative to Benjamini-Hochberg correction: https://bioconductor.org/packages/release/bioc/html/IHW.html
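As a rough illustration of "one or the other", the sketch below adjusts the same toy p-values once with BH (base R p.adjust) and once with IHW. The data frame and its columns (pvalue, baseMean) are made up for the example and are not DEWSeq output; the IHW part runs only if the IHW package is installed.

```r
set.seed(1)
n <- 20000

# Toy data: uniform null p-values plus 1000 small "signal" p-values.
# 'baseMean' is a hypothetical covariate used only for this illustration.
toy <- data.frame(
  baseMean = rexp(n, rate = 1 / 50),
  pvalue   = runif(n)
)
toy$pvalue[seq_len(1000)] <- rbeta(1000, shape1 = 0.5, shape2 = 50)

# Option A: Benjamini-Hochberg on the raw p-values.
toy$padj_BH <- p.adjust(toy$pvalue, method = "BH")

# Option B: IHW on the SAME raw p-values, using the covariate to weight hypotheses.
# Do not feed BH-adjusted values into IHW; pick one of the two corrections.
if (requireNamespace("IHW", quietly = TRUE)) {
  ihwRes       <- IHW::ihw(pvalue ~ baseMean, data = toy, alpha = 0.05)
  toy$padj_IHW <- IHW::adj_pvalues(ihwRes)
}

# Number of hits at a 5% threshold under BH.
nSigBH <- sum(toy$padj_BH < 0.05)
```

Either adjusted column can then serve as the padjCol for downstream filtering; the point is that both start from the raw p-values rather than being stacked on top of each other.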

fulaibaowang commented 6 months ago

I got it now, thank you!

EMBL-Hentze-group / DEWSeq
