zyndagj / BSMAPz

Updated and optimized fork of BSMAP
Other
22 stars 6 forks source link

methdiff.py error #3

Closed Yoshihiro-handa closed 5 years ago

Yoshihiro-handa commented 7 years ago

Hello, I research Whole-Genome Bisulfite Sequencing in Arabidopsis thaliana.

I have a question about methdiff.py. When I carried out methdiff.py, program was spewed following error.

python methdiff.py -o diff.txt -b 1 -d GCF_000001735.3_TAIR10_genomic.fna WT.txt SI.txt [methdiff] @Tue Jul 18 17:37:52 2017 reading reference file: ../test_1/GCF_000001735.3_TAIR10_genomic_1.fna ... [methdiff] @Tue Jul 18 17:37:54 2017 processing NC_000932.1 ... [methdiff] @Tue Jul 18 17:37:54 2017 processing NC_001284.2 ... Traceback (most recent call last): File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 133, in cmp_chrom(cr) File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 112, in cmp_chrom pval = get_pval(m[0], d[0], m[1], d[1]) File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 86, in get_pval l0, u0 = conf_intv(m0, d0, z0) File "/home/yhanda/software/bsmap-2.90/methdiff.py", line 81, in conf_intv span = z (p (1 - p) / d + z2 / (4 d d)) ** 0.5 ValueError: negative number cannot be raised to a fractional power

I think that this error has been attributed to python version from following site (I use Python 2.7.5). https://stackoverflow.com/questions/17747124/valueerror-negative-number-cannot-be-raised-to-a-fractional-power However, I don’t know how to rewrite the methdiff.py.

Please let me know corrective strategy.

zyndagj commented 7 years ago

Hello, I was not the original author of this software, but I will try to help.

I am fairly certain that this is not a python problem but an issue in the code itself. It looks like a negative number is being raised to the power 1/2, or (more commonly) a square root. This would yield a complex number, and is probably not the intention.

I will try to figure out why this would occur in the next week and get back to you.

Thanks, Greg Zynda

Yoshihiro-handa commented 7 years ago

Thank you very much for your advice. I have carried out without error, if I changed from 0.5 to 1/2.

Thanks, Yoshihiro

paulmenzel commented 7 years ago

@Yoshihiro-handa, please not, that this is likely incorrect. In Python dividing two integers, will result in a truncated integer. In your case:

$ python
Python 2.7.13 (default, May  3 2017, 12:33:43) 
[GCC 5.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1/2
0
>>> 3/4
0
>>> 1.0/2
0.5

So, taking something to the power of zero always results in 1. You need to fix the program error. For example, hire a programmer to improve bsmap. Your workaround could give you incorrect results.

zyndagj commented 6 years ago

I have a feeling the error occurred because you used a bin size of 1 (-b 1), but I have been unable to reproduce.

Please send me your input files if you would like an official fix because your floating point truncation is not correct.

RubioB commented 5 years ago

Hi, I'm also using methdiff.py after methratio.py and I have the same error than Yoshihiro, did you finally solve the problem ?

Thanks,

Bernadette

zyndagj commented 5 years ago

I'll try to take a look at it.

Similar to what Yoshihiro shared, can you show me how you invoked the program and the exact error you received?

It would also be helpful if I had access to your inputs, or knew how they were generated.

Thanks, Greg

RubioB commented 5 years ago

Thanks for your answer !

This is my script to run methdiff :

!/bin/bash

SBATCH -p workq

SBATCH -J methdiff

SBATCH -c 8

module load system/Python-2.7.2 module load bioinfo/samtools-1.4

python methdiff.py -o out_diff.txt -d S_lycopersicum.fasta out_ratio_WTB.txt out_ratio_WTC.txt

--> Both .txt files were those obtained with methratio.py

And this is the error : [methdiff] @Tue Aug 20 16:42:40 2019 reading reference file: S_lycopersicum.fasta ... [methdiff] @Tue Aug 20 16:42:50 2019 processing SL2.50ch00 ... [methdiff] @Tue Aug 20 16:43:19 2019 processing SL2.50ch01 ... [methdiff] @Tue Aug 20 16:45:23 2019 processing SL2.50ch02 ... [methdiff] @Tue Aug 20 16:46:32 2019 processing SL2.50ch03 ... [methdiff] @Tue Aug 20 16:47:56 2019 processing SL2.50ch04 ... [methdiff] @Tue Aug 20 16:49:24 2019 processing SL2.50ch05 ... [methdiff] @Tue Aug 20 16:50:54 2019 processing SL2.50ch06 ... [methdiff] @Tue Aug 20 16:51:58 2019 processing SL2.50ch07 ... [methdiff] @Tue Aug 20 16:53:29 2019 processing SL2.50ch08 ... [methdiff] @Tue Aug 20 16:54:57 2019 processing SL2.50ch09 ... [methdiff] @Tue Aug 20 16:56:31 2019 processing SL2.50ch10 ... [methdiff] @Tue Aug 20 16:58:01 2019 processing SL2.50ch11 ... [methdiff] @Tue Aug 20 16:59:14 2019 processing SL2.50ch12 ... Traceback (most recent call last): File "methdiff.py", line 133, in cmp_chrom(cr) File "methdiff.py", line 112, in cmp_chrom pval = get_pval(m[0], d[0], m[1], d[1]) File "methdiff.py", line 86, in get_pval l0, u0 = conf_intv(m0, d0, z0) File "methdiff.py", line 81, in conf_intv span = z (p (1 - p) / d + z2 / (4 d d)) ** 0.5 ValueError: negative number cannot be raised to a fractional power

Thanks,

Bernadette


De : Greg Zynda notifications@github.com Envoyé : mardi 20 août 2019 15:37 À : zyndagj/BSMAPz BSMAPz@noreply.github.com Cc : RubioB bernadetterubio@hotmail.com; Comment comment@noreply.github.com Objet : Re: [zyndagj/BSMAPz] methdiff.py error (#3)

I'll try to take a look at it.

Similar to what Yoshihiro shared, can you show me how you invoked the program and the exact error you received?

It would also be helpful if I had access to your inputs, or knew how they were generated.

Thanks, Greg

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/zyndagj/BSMAPz/issues/3?email_source=notifications&email_token=AMV7LR3GDHIDQLR6M6SXF5DQFQFVNA5CNFSM4DUYEFRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WW76Y#issuecomment-523071483, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AMV7LRYIMR7MVC7Y2LHPIZLQFQFVNANCNFSM4DUYEFRA.

zyndagj commented 5 years ago

Thank you for the information. I'll pull some S. lycopersicum data from NCBI and try to reproduce.

RubioB commented 5 years ago

I can send to you my .txt files but maybe I need an email adress beacuse files are very huge (?)


De : Greg Zynda notifications@github.com Envoyé : mardi 20 août 2019 15:45 À : zyndagj/BSMAPz BSMAPz@noreply.github.com Cc : RubioB bernadetterubio@hotmail.com; Comment comment@noreply.github.com Objet : Re: [zyndagj/BSMAPz] methdiff.py error (#3)

Thank you for the information. I'll pull some S. lycopersicum data from NCBI and try to reproduce.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/zyndagj/BSMAPz/issues/3?email_source=notifications&email_token=AMV7LRZAGVQ6BXYXXSHWINDQFQGPZA5CNFSM4DUYEFRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WXXLI#issuecomment-523074477, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AMV7LR6MCONSY6ZYROHFKOTQFQGPZANCNFSM4DUYEFRA.

zyndagj commented 5 years ago

I reproduced your error

(team-py2) wireless-10-146-4-21:methdiff_error gzynda$ python methdiff.py -o out.txt -d S_lycopersicum_chromosomes.2.50.fa out_ratio_WTB.txt out_ratio_WTC.txt 
[methdiff] @Tue Sep  3 13:05:52 2019    reading reference file: S_lycopersicum_chromosomes.2.50.fa ...
[methdiff] @Tue Sep  3 13:06:06 2019    processing SL2.50ch00 ...
[methdiff] @Tue Sep  3 13:06:47 2019    processing SL2.50ch01 ...
[methdiff] @Tue Sep  3 13:09:47 2019    processing SL2.50ch02 ...
[methdiff] @Tue Sep  3 13:11:24 2019    processing SL2.50ch03 ...
[methdiff] @Tue Sep  3 13:13:22 2019    processing SL2.50ch04 ...
[methdiff] @Tue Sep  3 13:15:27 2019    processing SL2.50ch05 ...
[methdiff] @Tue Sep  3 13:17:37 2019    processing SL2.50ch06 ...
[methdiff] @Tue Sep  3 13:19:08 2019    processing SL2.50ch07 ...
[methdiff] @Tue Sep  3 13:21:16 2019    processing SL2.50ch08 ...
[methdiff] @Tue Sep  3 13:23:23 2019    processing SL2.50ch09 ...
[methdiff] @Tue Sep  3 13:25:36 2019    processing SL2.50ch10 ...
[methdiff] @Tue Sep  3 13:27:39 2019    processing SL2.50ch11 ...
[methdiff] @Tue Sep  3 13:29:19 2019    processing SL2.50ch12 ...
Traceback (most recent call last):
  File "methdiff.py", line 133, in <module>
    cmp_chrom(cr)
  File "methdiff.py", line 112, in cmp_chrom
    pval = get_pval(m[0], d[0], m[1], d[1])
  File "methdiff.py", line 86, in get_pval
    l0, u0 = conf_intv(m0, d0, z0)
  File "methdiff.py", line 81, in conf_intv
    span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5
ValueError: negative number cannot be raised to a fractional power

and I believe it is because some floating point calculations are being floored to integers during the wilson score calculation.

span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5

I'll modify that code and report back.

RubioB commented 5 years ago

Thanks for your answer !

I am therefore waiting for the change !

Bernadette


De : Greg Zynda notifications@github.com Envoyé : mardi 3 septembre 2019 18:52 À : zyndagj/BSMAPz BSMAPz@noreply.github.com Cc : RubioB bernadetterubio@hotmail.com; Comment comment@noreply.github.com Objet : Re: [zyndagj/BSMAPz] methdiff.py error (#3)

I reproduced your error

(team-py2) wireless-10-146-4-21:methdiff_error gzynda$ python methdiff.py -o out.txt -d S_lycopersicum_chromosomes.2.50.fa out_ratio_WTB.txt out_ratio_WTC.txt [methdiff] @Tue Sep 3 13:05:52 2019 reading reference file: S_lycopersicum_chromosomes.2.50.fa ... [methdiff] @Tue Sep 3 13:06:06 2019 processing SL2.50ch00 ... [methdiff] @Tue Sep 3 13:06:47 2019 processing SL2.50ch01 ... [methdiff] @Tue Sep 3 13:09:47 2019 processing SL2.50ch02 ... [methdiff] @Tue Sep 3 13:11:24 2019 processing SL2.50ch03 ... [methdiff] @Tue Sep 3 13:13:22 2019 processing SL2.50ch04 ... [methdiff] @Tue Sep 3 13:15:27 2019 processing SL2.50ch05 ... [methdiff] @Tue Sep 3 13:17:37 2019 processing SL2.50ch06 ... [methdiff] @Tue Sep 3 13:19:08 2019 processing SL2.50ch07 ... [methdiff] @Tue Sep 3 13:21:16 2019 processing SL2.50ch08 ... [methdiff] @Tue Sep 3 13:23:23 2019 processing SL2.50ch09 ... [methdiff] @Tue Sep 3 13:25:36 2019 processing SL2.50ch10 ... [methdiff] @Tue Sep 3 13:27:39 2019 processing SL2.50ch11 ... [methdiff] @Tue Sep 3 13:29:19 2019 processing SL2.50ch12 ... Traceback (most recent call last): File "methdiff.py", line 133, in cmp_chrom(cr) File "methdiff.py", line 112, in cmp_chrom pval = get_pval(m[0], d[0], m[1], d[1]) File "methdiff.py", line 86, in get_pval l0, u0 = conf_intv(m0, d0, z0) File "methdiff.py", line 81, in conf_intv span = z (p (1 - p) / d + z2 / (4 d d)) ** 0.5 ValueError: negative number cannot be raised to a fractional power

and I believe it is because some floating point calculations are being floored to integers during the wilson score calculation.

span = z (p (1 - p) / d + z2 / (4 d d)) ** 0.5

I'll modify that code and report back.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/zyndagj/BSMAPz/issues/3?email_source=notifications&email_token=AMV7LR5LHRMFF2655HT7HHDQH2XATA5CNFSM4DUYEFRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5ZGJ4A#issuecomment-527590640, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AMV7LRZS6LQ2SA3ZTON5KMDQH2XATANCNFSM4DUYEFRA.

zyndagj commented 5 years ago

That change did not fix it, so I'm going to figure out what specific data values are causing the error

xiaenhua commented 5 years ago

That change did not fix it, so I'm going to figure out what specific data values are causing the error

Dear zyngaji,

Have you fixed this issue? if so, could send me a copy of the script? (xiaenhua@gmail.com)

Enhua

zyndagj commented 5 years ago

So methdiff.py calculates the methylation frequency from columns

Column Reason Description
6 Effective depth eff_CT = CT * (G / GA)
7 Methylation counts C

The effective depth can be smaller than C, so methratio.py calculates the final methylation frequency with

frequency = min(C, eff_CT) / eff_CT

but methdiff.py has been calculating it as

frequency = C / eff_CT

which was allowing the frequency to go above 1 when eff_CT was reduced below the raw CT values. This then was causing the following calculation

span = z * (p * (1 - p) / d + z2 / (4 * d * d)) ** 0.5

to take the square root of a negative number from the (1-p) part, leading to

ValueError: negative number cannot be raised to a fractional power

I am updating the methdiff.py code to use the same calculating that methratio.py does and will update you when a new release is pushed.

zyndagj commented 5 years ago

The latest release that I just pushed out fixes this issue. Please let me know if you encounter anything else!

https://github.com/zyndagj/BSMAPz/releases/tag/1.1.1

https://anaconda.org/zyndagj/bsmapz