Handle `reference` with length >1

bschilder commented 2 years ago

I see that some functions use `[[1]]`, which was the source of the error I posted earlier (#59).

[Image: Screenshot 2022-05-18 at 12 45 59 (1)]

I added a check for this in `overlap_stat_plot`, but there are still other functions where this may come up.

[Image: Screenshot 2022-05-18 at 12 48 08]

Can you either:

1. Run this check at the beginning of `EpiCompare`, letting users know that only 1 reference can be used for all (or only certain?) steps.
2. Modify the functions that can currently use only 1 reference at a time so that they handle multiple references at once, or split their output into tabs: one tab per reference.
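As a rough sketch of the first option, a length check could run once at the top of `EpiCompare`. The helper name `check_reference_length` and its `action` argument are hypothetical and not part of the package. The sketch assumes `reference` is a plain (ideally named) list of peak sets, and leaves anything else untouched.

```r
# Hypothetical helper (not part of EpiCompare): validate `reference` up front
# so downstream functions do not have to fall back on `[[1]]` silently.
check_reference_length <- function(reference, action = c("warn", "stop")) {
  action <- match.arg(action)
  # Only plain lists are checked here; NULL, a single object, or a
  # length-1 list is returned unchanged.
  if (!is.list(reference) || length(reference) <= 1) {
    return(reference)
  }
  msg <- paste0(
    "`reference` has length ", length(reference),
    ", but only 1 reference can be used at a time."
  )
  if (action == "stop") {
    stop(msg, call. = FALSE)
  }
  warning(msg, " Using only the first element.", call. = FALSE)
  # Single-bracket subsetting keeps the list structure and the element name.
  reference[1]
}
```

With `action = "warn"` the caller keeps going with the first reference only; with `action = "stop"` the run fails early, before any plots are generated.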

bschilder commented 2 years ago

A quick (but inefficient) fix would be to run a loop within the `EpiCompare::EpiCompare` function.

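data("hg38_blacklist", package = "EpiCompare") # example blacklist
# `peakfiles`, `picardfiles`, and `ref_list` (a named list of references) are assumed to exist already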
out_list <- lapply(names(ref_list), function(nm){
  message("\n","======>> ",nm," <<======")
  save_dir <- here::here("reports",nm)
  dir.create(save_dir, showWarnings = FALSE, recursive = TRUE)
  #### vvv Replace this bit with the actual internal code within `EpiCompare::EpiCompare` vvv
  EpiCompare::EpiCompare(peakfiles = peakfiles, 
                         picard_files = picardfiles,
                         blacklist = hg38_blacklist,
                         genome_build = list(peakfiles="hg38",
                                             reference="hg38",
                                             blacklist="hg38"),
                         genome_build_output = "hg38",
                         reference = ref_list[nm],
                         upset_plot = TRUE,
                         stat_plot = TRUE,
                         chromHMM_plot = TRUE,
                         chromHMM_annotation = "K562",
                         chipseeker_plot = TRUE,
                         enrichment_plot = TRUE,
                         tss_plot = TRUE,
                         save_output = TRUE, 
                         output_dir = save_dir)
})

Note that I'm deliberately not parallelizing at this level. Some of the internal steps are already parallelized, and parallelizing at multiple levels at once (without very careful and more complex coding) can cause errors: the same cores end up being asked to do multiple things at once, which can crash the run.
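To illustrate the point about nesting, here is a minimal sketch using only base R's `parallel` package. `inner_step` and `run_all` are hypothetical stand-ins, not EpiCompare functions. The outer loop over references stays sequential, and the whole worker budget goes to the one level that is already parallelized.

```r
library(parallel)

# Hypothetical stand-in for an internal step that is already parallelized.
inner_step <- function(x, workers = 1) {
  # mclapply forks on Unix-alikes; on Windows mc.cores must be 1.
  if (.Platform$OS.type == "windows") workers <- 1
  unlist(mclapply(x, function(i) i^2, mc.cores = workers))
}

# Outer loop: plain lapply, so only one level spawns workers at a time.
run_all <- function(inputs, workers = 1) {
  lapply(inputs, inner_step, workers = workers)
}

res <- run_all(list(ref1 = 1:4, ref2 = 5:8), workers = 2)
```

If the outer loop also used `workers` processes, up to `workers * workers` processes could be running at once, which is the oversubscription described above.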

serachoi1230 commented 2 years ago

I just implemented the option where EpiCompare outputs a separate report for each reference, and documented this in the manual. It's not the most efficient way of doing it, so maybe we should consider changing it at some point.

bschilder commented 2 years ago

Awesome, thanks!

neurogenomics / EpiCompare

Handle `reference` with length >1 #71