philippdre / omniCLIP

omniCLIP is a CLIP-Seq peak caller
GNU General Public License v3.0

RuntimeWarning: divide by zero encountered in log #15

Closed CharlotteAnne closed 4 years ago

CharlotteAnne commented 4 years ago

Hi, I ran omniCLIP with the following command:

omniCLIP.py --annot \
.GenCodeStyle.gff.db \
--genome-dir sep_chrom \
--clip-files WT_SPO_REP1_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP2_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP3_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP4_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP1_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP2_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP3_final.Aligned.out.sorted.DEDUPLICATED.bam \
--out-dir . \
--rev_strand 0 --seed 12345 --nb-cores 10

The run failed with the output below. Could you please point me in the right direction as to how this might have happened?

setting seed
Loading gene annotation
Loading reads
Parsing the gene annotation
Processing chr01
Processing chr09
Processing chr14
Processing chr13
Processing chr05
Processing chr06
Processing chr08
Processing chr07
Processing chr10
Processing chr04
Processing chr02
Processing chr15
Processing chr16
Processing chr03
Processing chr12
Processing chr11
Saving results
Loading coverage only
Parsing the gene annotation
Processing chr01
Processing chr09
Processing chr14
Processing chr13
Processing chr05
Processing chr06
Processing chr08
Processing chr07
Processing chr10
Processing chr04
Processing chr02
Processing chr15
Processing chr16
Processing chr03
Processing chr12
Processing chr11
Saving results
Masking overlapping positions
Removing genes without CLIP coverage
Initialising the parameters

Iteration: 0
Computing most likely path
Spawning processes
Collecting results
Fitting emission parameters
Fitting emission parameters
Estimating expression parameters
Start estimation of expression parameters
Constructing GLM matrix
Estimating expression parameters: GLM matrix constrution
Fitting GLM
Estimating expression parameters: before fitting
Finished expression parameter estimation
Computing sufficient statistic for fitting md
Getting suffcient statistic
Fitting md distribution
Estimating state 0
Spawning processes
Collecting results
Estimating state 1
Spawning processes
Collecting results
Estimating state 2
Spawning processes
Collecting results
Estimating state 3
Spawning processes
Collecting results
Fitting transition parameters
Fitting transition parameters
Learning transition model
Iterating over genes
.Computing most likely path
Computing most likely path
Spawning processes
Collecting results
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/camp/apps/eb/dev/software/Python/3.8.2-GCCcore-9.3.0/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/data_parsing/tools.py", line 1149, in ParallelGetMostLikelyPathForGene
    TransistionProbabilities = np.float64(trans.PredictTransistions(Counts, TransitionParameters, NrOfStates, TransitionType))
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/stat/trans.py", line 42, in PredictTransistions
    TransistionProb = PredictTransistionsSimple(Counts, TransitionParameters, NrOfStates)
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/stat/trans.py", line 62, in PredictTransistionsSimple
    TempProb = TransitionParametersLogReg.predict_log_proba(CovMat.T).T
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/lib/python3.8/site-packages/sklearn/linear_model/_stochastic_gradient.py", line 1096, in _predict_log_proba
    return np.log(self.predict_proba(X))
RuntimeWarning: divide by zero encountered in log
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/omniCLIP.py", line 1014, in <module>
    run_omniCLIP(args)
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/omniCLIP.py", line 352, in run_omniCLIP
    CurrLogLikelihood, IterParameters, First, Paths = PerformIteration(Sequences, Background, IterParameters, NrOfStates, First, Paths, verbosity=EmissionParameters['Verbosity'])
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/omniCLIP.py", line 692, in PerformIteration
    NewPaths, LogLike = tools.ParallelGetMostLikelyPath(NewPaths, Sequences, Background, EmissionParameters, TransitionParameters, 'nonhomo', verbosity=verbosity)
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/data_parsing/tools.py", line 1053, in ParallelGetMostLikelyPath
    results = [res for res in results]
  File "/camp/apps/eb/dev/software/omniCLIP/20200702-foss-2020a/data_parsing/tools.py", line 1053, in <listcomp>
    results = [res for res in results]
  File "/camp/apps/eb/dev/software/Python/3.8.2-GCCcore-9.3.0/lib/python3.8/multiprocessing/pool.py", line 865, in next
    raise value
RuntimeWarning: divide by zero encountered in log

philippdre commented 4 years ago

Could you please run the program with pdb (the Python debugger) and check whether there is anything special about the gene where omniCLIP crashes?

python -m pdb omniCLIP.py --annot \
.GenCodeStyle.gff.db \
--genome-dir sep_chrom \
--clip-files WT_SPO_REP1_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP2_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP3_final.Aligned.out.sorted.DEDUPLICATED.bam \
--clip-files WT_SPO_REP4_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP1_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP2_final.Aligned.out.sorted.DEDUPLICATED.bam \
--bg-files NXL_WT_SPO_REP3_final.Aligned.out.sorted.DEDUPLICATED.bam \
--out-dir . \
--rev_strand 0 --seed 12345 --nb-cores 1
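
For context on the message itself: the traceback ends in `np.log(self.predict_proba(X))`, and NumPy emits "divide by zero encountered in log" whenever the input contains an exact 0. The fact that it surfaces as an exception rather than a printed warning suggests the warnings filter is set to "error" somewhere, though that is an inference from the traceback and not confirmed here. The sketch below reproduces the behaviour in isolation. The helper names `log_proba` and `raises_runtime_warning`, the example matrix, and the `eps` clipping are illustrative assumptions and not part of omniCLIP or scikit-learn.

```python
import warnings

import numpy as np


def log_proba(proba, eps=None):
    """Return log-probabilities (hypothetical helper, not omniCLIP code).

    If eps is given, probabilities are clipped to [eps, 1] first so that
    an exact 0 never reaches np.log.
    """
    proba = np.asarray(proba, dtype=np.float64)
    if eps is not None:
        proba = np.clip(proba, eps, 1.0)
    return np.log(proba)


def raises_runtime_warning(proba, eps=None):
    """Report whether log_proba fails when warnings are treated as errors."""
    with warnings.catch_warnings():
        # Turn every warning into an exception inside this block only.
        warnings.simplefilter("error")
        try:
            log_proba(proba, eps)
        except RuntimeWarning:
            return True
    return False


# Made-up probability matrix containing one exact zero.
probs = np.array([[1.0, 0.0], [0.5, 0.5]])

print(raises_runtime_warning(probs))             # zero reaches np.log
print(raises_runtime_warning(probs, eps=1e-300)) # zero is clipped away
```

Clipping only hides the symptom. If a predicted probability is exactly 0 for some gene, it is still worth checking in pdb what the covariate matrix for that gene looks like.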