ListerLab / HOME

DMR Identification Tool

Issue 20# Home run doesn’t end #30

Closed Papareddy closed 5 years ago

Papareddy commented 5 years ago

Hi Akanksha,

Just to be sure, is HOME compatible with single-replicate timeseries (or pairwise) analysis? I have HOME (>0.7) working well for both timeseries and pairwise analysis when I have multiple replicates, but the job never finishes for single replicates, as Nuria mentioned (with or without the -sin parameter). Do you think this has something to do with single-replicate experiments? I also see a double scalar error in these cases. We use the same HPC.

Cheers, Ranj

Akanksha2511 commented 5 years ago

Hi Ranj, yes it does work for single replicate data. In fact, it should be much quicker with single replicate data. It's tested on HPC and works fine. If you can share the files, I will be happy to look into what's causing the issue.

Cheers, Akanksha

Papareddy commented 5 years ago

Here is the header of an example file I am using, for your perusal:

1 34 - CHG 5 8
1 80 - CHH 2 15
1 84 + CHH 0 4
1 85 + CHH 1 4
1 100 + CHH 1 4
1 101 + CHH 0 4
1 102 + CHH 2 4
1 110 - CG 11 11
1 116 - CG 10 10
1 117 - CHG 2 10
1 121 + CHH 1 4
1 123 + CHG 3 4
1 125 - CHG 7 7
1 126 - CHH 0 5
1 129 - CHH 2 4
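A quick sanity check of rows like these can be sketched as follows. This is only an illustration, not part of HOME: the column meanings (chromosome, position, strand, context, methylated count, total count) are an assumption read off the sample rows, the helper name `check_rows` is hypothetical, and the third row of the demo input is deliberately invalid.

```python
import io

VALID_STRANDS = {"+", "-"}
VALID_CONTEXTS = {"CG", "CHG", "CHH"}


def check_rows(handle):
    """Return a list of (line_number, problem) tuples for malformed rows.

    Assumes six whitespace-separated columns per row:
    chromosome, position, strand, context, methylated count, total count.
    """
    problems = []
    for lineno, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 6:
            problems.append((lineno, "expected 6 columns, got %d" % len(fields)))
            continue
        _chrom, pos, strand, context, mc, total = fields
        if strand not in VALID_STRANDS:
            problems.append((lineno, "unexpected strand"))
        if context not in VALID_CONTEXTS:
            problems.append((lineno, "unexpected context"))
        if not (pos.isdigit() and mc.isdigit() and total.isdigit()):
            problems.append((lineno, "non-integer position or count"))
            continue
        if int(mc) > int(total):
            problems.append((lineno, "methylated count exceeds total count"))
    return problems


# Two rows from the sample, plus one made-up bad row (9 methylated out of 4).
demo = "1\t34\t-\tCHG\t5\t8\n1\t110\t-\tCG\t11\t11\n1\t84\t+\tCHH\t9\t4\n"
problems = check_rows(io.StringIO(demo))
print(problems)
```

An empty result only means the rows are well formed under these assumptions; it says nothing about why a run might stall.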

I don't see anything peculiar with this.

The error was basically PBS killing the job after it reached the walltime limit. For example:

Preparing the DMRs from HOME..... GOOD LUCK !
/net/gmi.oeaw.ac.at/software/mendel/29_04_2013/software/statsmodels/0.6.1-foss-2017a-Python-2.7.13/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-linux-x86_64.egg/statsmodels/stats/weightstats.py:575: RuntimeWarning: invalid value encountered in double_scalars
  zstat = value / std_diff
=>> PBS: job killed: walltime 172825 exceeded limit 172800
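The warning in this log points at the division `zstat = value / std_diff`. NumPy reports "invalid value encountered" when a floating-point operation has no defined result, for instance 0/0. Whether that is what happens inside HOME for single-replicate input is not confirmed anywhere in this thread. The sketch below only illustrates, in plain Python and with hypothetical helper names, how the standard error of a difference collapses to zero when neither group has any within-group spread, which leaves the z-statistic undefined.

```python
import math


def std_of_difference(a, b):
    """Standard error of mean(a) - mean(b), using population variances.

    Hypothetical helper for illustration; not HOME's or statsmodels' code.
    """
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    return math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def z_statistic(a, b):
    """Return the z-statistic, or NaN when the standard error is zero."""
    value = sum(a) / len(a) - sum(b) / len(b)
    std_diff = std_of_difference(a, b)
    if std_diff == 0:
        return math.nan  # explicit guard instead of dividing by zero
    return value / std_diff


# One replicate per group: no spread, so the statistic is undefined.
single = z_statistic([0.8], [0.8])

# Two replicates per group: the statistic is well defined.
multi = z_statistic([0.8, 0.6], [0.3, 0.1])

print(single, multi)
```

A NaN statistic by itself is a warning rather than a crash, so it does not explain the walltime overrun on its own.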

Papareddy commented 5 years ago

The input files were tab-separated, BTW.

Akanksha2511 commented 5 years ago

Hi, the input file format seems OK. Which context are you running it for, and how big are the files? Can you try it on another system and see how long it takes to complete? It may also be good to try small input files first. If it runs fine with multiple replicates, as you mentioned (on the same system?), there is no reason why it wouldn't run for the single-replicate dataset.
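For the small-input test suggested here, one way to cut a file down is to keep only one context and the first few matching rows. A minimal sketch, assuming the context is the fourth whitespace-separated column as in the example rows earlier in the thread; `subset_rows` is a hypothetical helper, not a HOME utility.

```python
import io
from itertools import islice


def subset_rows(src, dst, context="CG", max_rows=1000):
    """Copy up to max_rows rows whose 4th column equals context.

    Returns the number of rows written to dst.
    """
    def matching_lines():
        for line in src:
            fields = line.split()
            if len(fields) >= 4 and fields[3] == context:
                yield line

    written = 0
    for line in islice(matching_lines(), max_rows):
        dst.write(line)
        written += 1
    return written


# In-memory demo with three rows taken from the example file.
src = io.StringIO(
    "1\t34\t-\tCHG\t5\t8\n"
    "1\t110\t-\tCG\t11\t11\n"
    "1\t116\t-\tCG\t10\t10\n"
)
dst = io.StringIO()
count = subset_rows(src, dst, context="CG", max_rows=1)
print(count, repr(dst.getvalue()))
```

With real data, `src` and `dst` would be file objects opened for reading and writing, and the same subset should be taken from every sample so the positions still overlap.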

Papareddy commented 5 years ago

I am running it for CG on the Arabidopsis genome (fairly small). Yes, it was running nicely for timeseries under the same conditions with multi-replicate experiments. The only change at the moment is that all comparisons are single replicates. I will have to try a few things.

Cheers, Ranj

Akanksha2511 commented 5 years ago

Yes, try it on the small test case and see if it finishes. Sorry for the issue, but we will have to try a few things to resolve it.