wrpearson / fasta36

Git repository for FASTA36 sequence comparison software
Apache License 2.0
115 stars 14 forks source link

E2() not displayed #1

Closed gaboentropy closed 6 years ago

gaboentropy commented 6 years ago

Hi, The fasta36 manual states that with -z 21, 22, ... a second E-value, E2(), is calculated, but no matter what format I try and display with different '-m' options, I cannot see a second e-value anywhere (using -z 21 or -z 22, I haven't tried all of them yet). I'm a bit frustrated because otherwise, the e-values from a -z 1 and a -z 21 are very similar (vary the same as two runs of the same thing), despite the proteins been quite biased in sequence composition, which means I get only the one e-value, and who knows what happens to the second. Best, -Gabo

wrpearson commented 6 years ago

The -z 21 option is part of the testing script, and I am seeing an E2() column in the standard output. Could you check this with the current version?

gaboentropy commented 6 years ago

Hi Bill, No E2() in my stdout. Latest version (36.3.8g). I tried the standard format. I remember I used to see an E2(), but long ago. Now that I want to check it, I don't get it. I tried -z 21 and -z 22.

Mac OSX high sierra. Compiled using the OS X_86_64 makefile. Maybe I should add some option to the compiler?

wrpearson commented 6 years ago

Is it possible that you are not putting the -z 21 before the query_file and library?

All optional arguments must precede the query_file and library

../bin/fasta36 -q -z 21 ../seq/mgstm1.aa a

../bin/fasta36 -q -z 21 ../seq/mgstm1.aa a

FASTA searches a protein or DNA sequence data bank version 36.3.8g Dec, 2017 Please cite: W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: ../seq/mgstm1.aa 1>>>sp|P10649|GSTM1_MOUSE Glutathione S-transferase Mu 1; GST 1-1; GST class-mu 1; Glutathione S-transferase GT8.7; pmGT10 - 218 aa Library: PIR1 Annotated (rel. 66) 5121825 residues in 13143 sequences

Statistics: Expectation_n fit: rho(ln(x))= 7.6262+/-0.00259; mu= 4.0686+/- 0.136 mean_var=47.1934+/- 8.889, 0's: 4 Z-trim(88.7): 34 B-trim: 0 in 0/54 Lambda= 0.186695 Statistics E2: (shuffled [444]) MLE statistics: Lambda= 0.1683; K=0.00448 statistics sampled from 1826 (1834) to 1826 sequences Algorithm: FASTA (3.8 Nov 2011) [optimized] Parameters: BL50 matrix (15:-5), open/ext: -10/-2 ktup: 2, E-join: 1 (0.473), E-opt: 0.2 (0.14), width: 16 Scan time: 0.190

The best scores are: opt bits E(13143) E2() sp|P08010|GSTM2_RAT Glutathione S-transferase Mu ( 218) 1248 343.2 3e-95 3.5e-86 sp|P09211|GSTP1_HUMAN Glutathione S-transferase P ( 210) 356 103.0 6.1e-23 5.1e-21 sp|P04906|GSTP1_RAT Glutathione S-transferase P; ( 210) 344 99.7 5.7e-22 3.8e-20 sp|P00502|GSTA1_RAT Glutathione S-transferase a ( 222) 237 70.9 2.9e-13 2.7e-12 sp|P14942|GSTA4_RAT Glutathione S-transferase alp ( 222) 179 55.3 1.5e-08 4.6e-08 sp|P20432|GSTT1_DROME Glutathione S-transferase 1 ( 209) 100 34.0 0.035 0.026 sp|P11277|SPTB1_HUMAN Spectrin beta chain, ery (2137) 108 34.7 0.21 0.069

../bin/fasta36 -q -z 21 -m 8C ../seq/mgstm1.aa a

../bin/fasta36 -q -z 21 -m 8C ../seq/mgstm1.aa a

FASTA 36.3.8g Dec, 2017

Query: sp|P10649|GSTM1_MOUSE Glutathione S-transferase Mu 1; GST 1-1; GST class-mu 1; Glutathione S-transferase GT8.7; pmGT10 - 218 aa

Database: PIR1 Annotated (rel. 66)

Fields: query id, subject id, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score, eval2

16 hits found

sp|P10649|GSTM1_MOUSE gi|121719|sp|P08010|GSTM2_RAT 77.06 218 50 0 1 218 1 218 3e-95 343.2 4.6e-91 sp|P10649|GSTM1_MOUSE gi|121746|sp|P09211|GSTP1_HUMAN 32.83 198 133 13 2 208 3 204 6.1e-23 103.0 2.6e-22 sp|P10649|GSTM1_MOUSE gi|121749|sp|P04906|GSTP1_RAT 33.33 198 132 13 2 208 3 204 5.7e-22 99.7 2.2e-21 sp|P10649|GSTM1_MOUSE gi|62822551|sp|P00502|GSTA1_RAT 29.76 205 144 18 4 218 6 218 2.9e-13 70.9 4.2e-13 sp|P10649|GSTM1_MOUSE gi|121714|sp|P14942|GSTA4_RAT 27.98 193 139 18 5 207 7 207 1.5e-08 55.3 1.2e-08

gaboentropy commented 6 years ago

Ah! That was it! So much pain for such a simple fix. Oh! With -m8C is still displays it! Great job Bill. Thanks!

wrpearson commented 6 years ago

Happy to have people using the FASTA programs.

Bill Pearson