FrietzeLabUVM / ssvQC

R package for QC of enrichment based NGS assays. ChIP-seq, cut&run, ATAC-seq, etc.

Error in names(cols) <- group_names #5

Open martinezvbs opened 6 months ago

martinezvbs commented 6 months ago

Hi,

I am new to the tool and was following the description on the main page, using:

path <- "/ATAC-PBAF/AK62/Data/QC"
peaks_files <- dir(path, pattern = "\\.broadPeak", full.names = TRUE)
bw_files <- dir(path, pattern = "\\.bw$", full.names = TRUE)

Then when I try to run

# Create ssvQC object
options(mc.cores = 4)
sqc = ssvQC(peaks_files, bw_files)
sqc = ssvQC.runAll(sqc)

I got the following message

Error in names(cols) <- group_names : 
  'names' attribute [15] must be the same length as the vector [8]

More info about the input files

head(broadPeak)
    3067110 3067542 AK62_Ar_1_REP1.mLb.clN_peak_1   63  .   4.78408 8.36646 6.38689
1   3183757 3184089 AK62_Ar_1_REP1.mLb.clN_peak_2   35  .   3.67095 5.42779 3.56175

> length(peaks_files)
[1] 15

> length(bw_files)
[1] 15

Any advice?

Thanks!

jrboyd commented 5 months ago

Hi, I think I was able to reproduce this error. It seems to come from having too few colors configured for the number of items.

I just pushed a fix. Reinstall from GitHub and try again, please. You should see ssvQC version 1.0.22.
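The length mismatch described above can be reproduced in base R without ssvQC. The following is a minimal sketch, not the package's actual internals: `palette8` and `group_names` are hypothetical stand-ins for an 8-color default palette and 15 items, and recycling with `rep_len()` is only one plausible way to avoid the error.

```r
# Hypothetical stand-ins: an 8-color palette and 15 item names.
palette8 <- c("black", "red", "green3", "blue",
              "cyan", "magenta", "yellow", "gray")
group_names <- paste0("sample_", 1:15)

# Assigning 15 names to an 8-element vector fails; capture the message.
cols <- palette8
err <- tryCatch({
  names(cols) <- group_names
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Recycling the palette to the required length avoids the mismatch,
# at the cost of reusing colors once the palette runs out.
cols <- rep_len(palette8, length(group_names))
names(cols) <- group_names
```

With recycling, the ninth item gets the same color as the first, which is why explicit colors are preferable when there are more items than palette entries.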

With the code you have, it will reuse the same 8 colors. You'd have more control if you made config objects before creating the ssvQC object. QcConfigFeatures.files and QcConfigSignal.files are the relevant functions.

From the help (?QcConfigSignal.files):

object2 = QcConfigSignal.files(bam_files,
  sample_names = c("MCF10A_CTCF", "MCF10AT1_CTCF", "MCF10CA1a_CTCF"),
  group_names = c("10A", "AT1", "CA1"),
  group_colors = c("firebrick", "slategray2", "forestgreen")
)
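
For a larger set of files, the name and color vectors can be generated rather than typed out. This is a rough sketch under assumptions: the file paths are hypothetical placeholders, and the commented-out call only mirrors the argument names from the help example above, so check ?QcConfigSignal.files for the exact expectations (for instance, whether group_names should be one per file or one per group).

```r
# Hypothetical bigWig paths standing in for the 15 real files.
bw_files <- file.path("/path/to/QC", sprintf("sample_%02d.bw", 1:15))

# Sample names derived from the file names, with the extension stripped.
sample_names <- sub("\\.bw$", "", basename(bw_files))

# One color per file from a qualitative palette; hcl.colors() is part of
# grDevices (R >= 3.6), so no extra packages are needed.
group_colors <- grDevices::hcl.colors(length(bw_files), palette = "Dark 3")

# Assumed usage, left commented out because it needs ssvQC and real files:
# qcs <- QcConfigSignal.files(bw_files,
#                             sample_names = sample_names,
#                             group_names = sample_names,
#                             group_colors = group_colors)
```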

martinezvbs commented 5 months ago

Hi,

Thanks for the quick response, it is working now (:

One more question: I was running the following code to access the plots:

sqc$plots$signal
sqc$plots$features$venn
sqc$plots$correlation$signal_profile_ggplot_heatmap
  1. The first set of plots contains the heatmaps (bigWig signal), "All signals at 1 to 15", but I am still not sure what "All signal at" means or how it groups them together (from 1 to 6 groups).
  2. The second set also goes from "All signal at 1 to 15", but in this case using lines (bigWig signal).
  3. Finally, I see the Venn diagrams, but I would like to know how to compare between the broadPeak files. So far I just have all the peaks per sample (each in its own circle), but I would like to compare them.

Again, thanks for the time!

jrboyd commented 5 months ago

I'm kind of guessing what you're looking at here based on the info you've provided. My guess is that ssvQC is looking at each peak set individually, in the context of all the signal files.

If what you wanted is a comparison between all your peak sets, then give the following code a try:

qcf = QcConfigFeatures.files(peaks_files, run_separately = FALSE)
sqc = ssvQC(qcf, bw_files)

By going through a config object we have much more control.

Also, since you have so many peak sets, a Venn diagram is unlikely to be helpful, and ssvQC can't create a Venn diagram for that many anyway.
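
To illustrate why a Venn diagram does not scale, here is a small base R sketch. It is a toy example, not what ssvQC does internally: the `membership` matrix and its peak and set names are made up, and it only shows the kind of per-pattern tally that a binary heatmap or UpSet plot summarizes.

```r
# A Venn diagram needs one region per non-empty combination of sets.
n_sets <- 15
n_regions <- 2^n_sets - 1
print(n_regions)

# Toy binary membership matrix: rows are merged peaks, columns are peak sets,
# and 1 means the peak is present in that set.
membership <- rbind(
  peak_a = c(set1 = 1, set2 = 1, set3 = 0),
  peak_b = c(set1 = 0, set2 = 1, set3 = 1),
  peak_c = c(set1 = 1, set2 = 1, set3 = 1),
  peak_d = c(set1 = 1, set2 = 1, set3 = 0)
)

# Tally peaks by membership pattern; only observed patterns appear,
# which keeps the summary readable even with many sets.
pattern <- apply(membership, 1, paste, collapse = "")
pattern_counts <- table(pattern)
print(pattern_counts)
```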

I'd recommend:

sqc$plots$features$binary_heatmap$All_features
sqc$plots$features$UpSet$All_features

martinezvbs commented 5 months ago

Hi,

Yes, the recommendation above worked.

Thank you so much!