a-slide / pycoMeth

DNA methylation analysis downstream to Nanopolish for Oxford Nanopore DNA sequencing datasets
GNU General Public License v3.0

Insufficient effect size / samples #50

Closed JWDebler closed 2 years ago

JWDebler commented 3 years ago

Hi, I know this repo isn't active anymore, but the new repo doesn't have an issue tracker. I was therefore hoping you might know what the problem is here.

I have 2 samples and ran all the steps up to 'Meth_Comp'.

But whether I use the cpg_aggregate or the interval_aggregate files as input, I get:

No valid p-Value could be computed
Results summary
Insufficient effect size: 70,230
Insufficient samples: 2,512 

and

Results summary
Insufficient effect size: 1,098,69
Insufficient samples: 31,312
Valid: 600 
Non-significant pvalue: 304
Significant pvalue: 296

respectively.

Any idea what the problem could be? Cheers.

davezing commented 3 years ago

I am having the same problem: the comparison TSV file is full of "Insufficient samples" rows. Does anyone know how to make it work? Thanks, Dave

chromosome start end n_samples pvalue adj_pvalue neg_med pos_med ambiguous_med unique_cpg_pos labels med_llr_list raw_llr_list raw_pos_list comment
chr21 5017086 5017679 1 nan nan 0 0 1 0 [1] [] [] [] Insufficient samples
chr21 5019580 5019790 1 nan nan 0 1 0 0 [0] [] [] [] Insufficient samples
chr21 5020137 5021627 1 nan nan 1 0 0 0 [0] [] [] [] Insufficient samples
chr21 5021977 5023320 1 nan nan 1 0 0 0 [0] [] [] [] Insufficient samples
chr21 5026200 5026686 1 nan nan 0 0 1 0 [0] [] [] [] Insufficient samples
chr21 5027859 5028457 1 nan nan 0 0 1 0 [0] [] [] [] Insufficient samples
chr21 5031978 5032262 1 nan nan 0 0 1 0 [0] [] [] [] Insufficient samples
chr21 5033573 5033775 1 nan nan 0 0 1 0 [0] [] [] [] Insufficient samples
chr21 5034227 5034658 1 nan nan 0 1 0 0 [0] [] [] [] Insufficient samples
chr21 5034851 5035183 1 nan nan 0 0 1 0 [0] [] [] [] Insufficient samples
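
To check how the rows of such a comparison file are distributed across outcomes, one could tally the comment column. The following is only a minimal sketch: it assumes the file is tab-separated with the header shown above, and the function name tally_comments is made up for illustration (it is not part of pycoMeth).

```python
import csv
from collections import Counter


def tally_comments(tsv_path):
    """Count how often each value of the `comment` column occurs.

    Assumes a tab-separated file whose header contains a `comment` column,
    as in the Meth_Comp output quoted above. Hypothetical helper, not a
    pycoMeth function.
    """
    counts = Counter()
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            counts[row["comment"]] += 1
    return counts
```

Calling tally_comments on the comparison file returns a Counter keyed by comment text, which makes it easy to see whether any rows at all reached the testing stage.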

MaestSi commented 2 years ago

Hi @davezing, you are probably not interested in this anymore, but I noticed that Meth_Comp has a parameter named -m:

-m MAX_MISSING, --max_missing MAX_MISSING
                        Max number of missing samples to perform the test
                        (default: 0) [int]

Since the default is 0, a site or interval gets the "Insufficient samples" flag as soon as a single sample is missing there, for example because that sample has too few reads (the minimum number of reads probably depends on the -d parameter of CpG_Aggregate).

 -d MIN_DEPTH, --min_depth MIN_DEPTH
                        Minimal number of reads covering a site to be reported
                        (default: 10) [int]
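
The way I understand these two parameters to interact can be sketched as follows. This is an assumption about the rule, not pycoMeth's actual code, and the function name has_enough_samples is invented for illustration: a sample counts as missing at a site when its read depth is below min_depth, and the test is only attempted when the number of missing samples does not exceed max_missing.

```python
def has_enough_samples(depth_per_sample, min_depth=10, max_missing=0):
    """Return True if a site would be tested under the assumed rule.

    depth_per_sample: read depth of each sample at one site or interval.
    min_depth: mirrors the -d default of CpG_Aggregate (10).
    max_missing: mirrors the -m default of Meth_Comp (0).
    Hypothetical illustration only; not taken from the pycoMeth source.
    """
    # A sample is treated as missing when it falls below the depth cutoff.
    missing = sum(1 for depth in depth_per_sample if depth < min_depth)
    return missing <= max_missing
```

Under this reading, has_enough_samples([12, 3]) fails with the defaults because the second sample is below the depth cutoff, while lowering min_depth or raising max_missing lets the site through.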

By the way, you were working with multiple samples, right?

In case you are still interested, I just wrote a Nextflow pipeline for running the whole pycoMeth workflow, including alignment and methylation detection, across multiple samples.

Best, Simone

a-slide commented 2 years ago

This version is deprecated. See Rene Snajder's fork at https://github.com/snajder-r/pycoMeth