deWitLab / peakC

Peak calling for 4C data
12 stars 7 forks source link

Incorrect number of dimensions? Struggling to run single or combined analysis #4

Closed LaurenceDyer closed 3 years ago

LaurenceDyer commented 4 years ago

Hi there,

Your package seems like it's likely to be very useful for me, but I'm running into a bit of an issue getting some of the basic steps done, sorry.

I have a simple .wig file, both with header and without, and I'm just trying to load it in and run a basic analysis to get an idea of what to expect.

Unfortunately, no matter how I read the data in, I'm always encountering the following error, with either single or combined analysis methods:

peakc <- readqWig(file = "fto.wig", vp.pos = xlim[1], window = 1e6)
peakc_mat <- readMatrix("fto2.wig", vp.pos = xlim[1], window = 1e6)
single.analysis(peakc_mat$data, vp.pos = xlim[1])

Error in data[1, 1] : subscript out of bounds

single.analysis(peakc$data, vp.pos = xlim[1])

Error in data[1, 1] : subscript out of bounds

Any help is greatly appreciated, thanks very much!

deWitLab commented 4 years ago

Hi,

Thanks for using peakC. Can you paste the first 10 lines of your wig files?

Best, Elzo

LaurenceDyer commented 4 years ago

Wow, thanks for the quick response, Elzo, that's really helpful.

93691000 164.6 93692000 120.8 93693000 36.2 93694000 285.2 93695000 388.2 93696000 646.6 93697000 697.2 93698000 748.8 93699000 806.4 93700000 976.8 93701000 731.6

It's binned by 1kb, which is quite unfortunate, but the best that I have (At least until I contact the original authors of the dataset). Is that perhaps what's causing the issue?

deWitLab commented 4 years ago

Is this the first 10 lines or did you leave out the header?

LaurenceDyer commented 4 years ago

It doesn't work with nor without the header, unfortunately. Headers I've tried:

fixedStep chrom=chr8 start= 93691000 step=1000

fixedStep chrom=8 start= 93691000 step=1000

variableStep chrom=chr8

deWitLab commented 4 years ago

Note that you need two header lines:

track type=wiggle_0 name="chr11_32224333_blinds_FALSE_alpha-globin_forward_win_1" description="alpha-globin_forward" visibility=full autoScale=on alwaysZero=on windowingFunction=mean maxHeightPixels=80:80:20 color=0,0,255 priority=10 variableStep chrom=chr8

See also the wiggle specification: https://genome.ucsc.edu/goldenPath/help/wiggle.html and the example data.

LaurenceDyer commented 4 years ago

Hm, I still see the same error using a UCSC track header and wiggle file header simultaneously:

track type=wiggle_0 name="fto" description="fto_4C" visibility=full autoScale=on alwaysZero=on windowingFunction=mean maxHeightPixels=80:80:20 color=0,0,255 priority=10
variableStep chrom=chr8 
93691000    164.6
93692000    120.8
93693000    36.2
93694000    285.2
93695000    388.2
93696000    646.6
93697000    697.2
93698000    748.8
93699000    806.4
93700000    976.8

After input:

> peakc <- readqWig(file = "fto2.wig", vp.pos = xlim[1], window = 1e6)
> single.analysis(peakc$data, vp.pos = xlim[1])
Error in data[1, 1] : subscript out of bounds

I get the same error if I just copy and paste the example data from the UCSC at your link. Is there something else I'm not understanding about the file structure here?

deWitLab commented 4 years ago

Can you run traceback() so that I know where the error is?

deWitLab commented 4 years ago

and str(peakc)?

LaurenceDyer commented 4 years ago

For traceback:

> library(peakC)
> xlim <- c(93690000, 95000000)
> peakc <- readqWig(file = "fto2.wig", vp.pos = xlim[1], window = 1e6)
> single.analysis(peakc$data, vp.pos = xlim[1])
Error in data[1, 1] : subscript out of bounds
> traceback()
3: get.background(data[data[, 1] < vp.pos[1], c(1, 2)], vp.pos[1])
2: get.single.background(data = data, num.exp = 1, vp.pos = vp.pos)
1: single.analysis(peakc$data, vp.pos = xlim[1])

For str(peakc):

> str(peakc)
List of 2
 $ data   : num [1:837, 1:2] 9.4e+07 9.4e+07 9.4e+07 9.4e+07 9.4e+07 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:2] "frag_pos" "frag_score"
 $ quality:List of 4
  ..$ percentage.capture.100kb: num NaN
  ..$ percentage.capture.1Mb  : num 98.9
  ..$ percentage.capture.cis  : num 98.1
  ..$ total.read.cis          : num 692727
deWitLab commented 4 years ago

Your viewpoint is outside your 4C range. Try it with 94345000 as a viewpoint.

LaurenceDyer commented 4 years ago

Yep, that did it - Thanks so much!

Sorry for not understanding it quicker. What exactly is vp.pos supposed to represent? I had assumed it was the start of the location I was planning to visualise. Should it be the TSS of the gene of interest?

deWitLab commented 4 years ago

The viewpoint is the fragment that has the primers (in your previous example you set it to xlim[2].

Run this:

peakc <- readqWig(file = "fto2.wig", vp.pos = 94345000, window = 1e6) single.analysis(peakc$data, vp.pos =94345000 )

LaurenceDyer commented 4 years ago

Thanks again, it's running fine now, sorry to take up even more of your time, but... I assume if the output "res" has no p.value slot then nothing was found to be significant?

deWitLab commented 4 years ago

The analysis with a single analysis cannot calculate a p-value. This is described in our publication.