sartorlab / methylSig

R package for DNA methylation analysis
Problem with the tile_by_windows() function #48

Closed marin-e closed 4 years ago

marin-e commented 4 years ago

Hi,

I am using methylSig v1.0.0 on RRBS data, and I have a problem with the tile_by_windows() function. I ran tile_by_windows() on this bsseq object:

bs
An object of type 'BSseq' with
  7194960 methylation loci
  9 samples
has not been smoothed
All assays are in-memory

Then I applied:

windowed_bs <- tile_by_windows(bs = bs, win_size = 25)

And I obtained the following bsseq object:

windowed_bs
An object of type 'BSseq' with
  123890520 methylation loci
  9 samples
has not been smoothed
All assays are in-memory

I don't understand why I have more methylation loci after tiling than before.

Do you have an explanation?

Thank you

rcavalcante commented 4 years ago

Hi,

Please see https://github.com/sartorlab/methylSig/issues/47#issuecomment-660079842. Summarizing the salient point:

I usually do the following order of calls:

  1. bsseq::read.bismark()
  2. tile_by_windows()
  3. filter_loci_by_coverage()
  4. filter_loci_by_group_coverage()
  5. diff_methylsig()
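The order of calls above can be sketched roughly as follows. This is a minimal sketch, not a tested pipeline: it assumes methylSig v1.0.0 or later is installed, that `bs` was already created with bsseq::read.bismark() (step 1), and that the sample annotation has a grouping column. The wrapper name `run_tiled_pipeline`, the coverage thresholds, and the group labels in the usage note are hypothetical and need to be adapted to the data.

```r
# Hypothetical wrapper around steps 2-5; step 1 (bsseq::read.bismark())
# is assumed to have produced `bs` already.
run_tiled_pipeline <- function(bs, group_column, min_samples_per_group,
                               comparison_groups, win_size = 25) {
    # 2. Tile into fixed-width windows. Windows with no data are kept,
    #    which is why the number of "loci" can increase at this step.
    windowed_bs <- methylSig::tile_by_windows(bs = bs, win_size = win_size)

    # 3. Mask per-sample counts outside the coverage range.
    #    The thresholds here are illustrative assumptions, not recommendations.
    windowed_bs <- methylSig::filter_loci_by_coverage(
        bs = windowed_bs, min_count = 5, max_count = 500)

    # 4. Keep only windows covered in enough samples per group.
    #    This is the step that drops the empty windows.
    windowed_bs <- methylSig::filter_loci_by_group_coverage(
        bs = windowed_bs,
        group_column = group_column,
        min_samples_per_group = min_samples_per_group)

    # 5. Test for differential methylation on the remaining windows.
    methylSig::diff_methylsig(
        bs = windowed_bs,
        group_column = group_column,
        comparison_groups = comparison_groups,
        disp_groups = c('case' = TRUE, 'control' = TRUE),
        local_window_size = 0,
        t_approx = TRUE,
        n_cores = 1)
}
```

A call might look like `run_tiled_pipeline(bs, 'Type', c('treated' = 3, 'untreated' = 3), c('case' = 'treated', 'control' = 'untreated'))`, where the column name `Type` and the labels `treated` and `untreated` are placeholders for whatever is in the sample annotation of `bs`.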

All those extra loci are regions with no signal in them. The tiling function is a little "dumb" in the sense that it doesn't filter out zero-coverage regions for you, so you should run filter_loci_by_group_coverage() after tiling.
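A quick back-of-the-envelope check in base R, using only the numbers reported above, is consistent with this. The interpretation is an assumption: the thread does not state the organism, and the exact extent that gets tiled depends on the sequence lengths attached to the object.

```r
# Numbers taken from the two BSseq summaries above.
n_loci    <- 7194960    # loci before tiling
n_windows <- 123890520  # "loci" (windows) after tiling
win_size  <- 25         # window width in bp

# Total sequence spanned by the windows, assuming non-overlapping tiles.
total_bp <- n_windows * win_size
format(total_bp, big.mark = ",")

# Each original locus falls in exactly one window, so at most n_loci
# windows can contain data; the rest must be empty.
min_empty_windows <- n_windows - n_loci
format(min_empty_windows, big.mark = ",")
```

The windows span about 3.1 billion bp, which is the scale of a whole mammalian genome rather than of the sites RRBS actually covers, and at least 116,695,560 of them cannot contain any of the original loci.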

Raymond

marin-e commented 4 years ago

Hi,

Thank you for the clarification (sorry, I didn't realize that the topic was the same as #47).

Marine

rcavalcante commented 4 years ago

No problem, I'll go ahead and close the issue. Have a nice day.