abacus-gene / paml

PAML is a program package for model fitting and phylogenetic tree reconstruction using DNA and protein sequence data. Please report only **technical issues** on this repository (e.g., compiling, programs abort or do not run at all, etc.). Problems with input data and general questions should be posted at https://groups.google.com/g/pamlsoftware?pli
GNU General Public License v3.0

error: edid 253 / 253 patterns (codeml) #40

Status: Closed (matthewglasenapp closed this issue 7 months ago)

matthewglasenapp commented 8 months ago

Hello, when running codeml (paml v4.10.7) on a MacBook Pro M1, the program quits during NSsites Model 8 with the following error:

error: edid 253 / 253 patterns 4:02(base) ~/desktop/paml$

I've included the output from running the codeml command, the control file, the multiple sequence alignment, and the tree. Any help would be greatly appreciated.

Attachments: err.txt, paml_out.txt, consensAlign.ordered.phylip.txt, consensAlign.ordered.phylip.treefile.txt, codeml.ctl.txt

sabifo4 commented 7 months ago

Hi there!

Your tree file seems to have the wrong header: it contains only one tree in Newick format, not 1,293 trees. In other words, the header of your consensAlign.ordered.phylip.treefile file should read 9 1 (9 species, 1 tree) instead of 9 1293. Nevertheless, I can see that the results for model M8 are in the paml_out.txt file you attached (you can find them from line 463 onward; I have also pasted them below):

NSsites Model 8: beta&w>1 (11 categories)

TREE #  1:  (1, ((2, (4, 5)), (6, ((7, 9), 8))), 3);   MP score: -1
lnL(ntime: 15  np: 23):  -3500.924560      +0.000000
  10..1    10..11   11..12   12..2    12..13   13..4    13..5    11..14   14..6    14..15   15..16   16..7    16..9    15..8    10..3  
 0.046279 0.011169 0.006500 0.052531 0.000630 0.038677 0.054104 0.078572 0.115860 0.229059 0.003703 0.052843 0.061416 0.088236 0.021853 2.040912 1.158380 0.819599 1.442637 0.905585 0.194517 0.258267 3.844699

Note: Branch length is defined as number of nucleotide substitutions per codon (not per nucleotide site).

tree length =  0.861431

(1: 0.046279, ((2: 0.052531, (4: 0.038677, 5: 0.054104): 0.000630): 0.006500, (6: 0.115860, ((7: 0.052843, 9: 0.061416): 0.003703, 8: 0.088236): 0.229059): 0.078572): 0.011169, 3: 0.021853);

(Sdro: 0.046279, ((Sfra: 0.052531, (Sint: 0.038677, Spur: 0.054104): 0.000630): 0.006500, (Hpul: 0.115860, ((Mfra: 0.052843, Pdep: 0.061416): 0.003703, Mnud: 0.088236): 0.229059): 0.078572): 0.011169, Spal: 0.021853);

Detailed output identifying parameters

kappa (ts/tv) =  2.04091

Frequency parameters:
   0.26204 (T)   0.18540 (C)   0.32634 (A)   0.22621 (G)
Parameters in M8 (beta&w>1):
  p0 =   0.90559  p =   0.19452 q =   0.25827
 (p1 =   0.09441) w =   3.84470

MLEs of dN/dS (w) for site classes (K=11)

p:   0.09056  0.09056  0.09056  0.09056  0.09056  0.09056  0.09056  0.09056  0.09056  0.09056  0.09441
w:   0.00000  0.00076  0.01038  0.05682  0.18875  0.43116  0.71245  0.90915  0.98676  0.99981  3.84470

dN & dS for each branch

 branch          t       N       S   dN/dS      dN      dS  N*dN  S*dS

  10..1      0.046   954.0   339.0  0.7520  0.0142  0.0189  13.5   6.4
  10..11     0.011   954.0   339.0  0.7520  0.0034  0.0046   3.3   1.5
  11..12     0.007   954.0   339.0  0.7520  0.0020  0.0027   1.9   0.9
  12..2      0.053   954.0   339.0  0.7520  0.0161  0.0214  15.4   7.3
  12..13     0.001   954.0   339.0  0.7520  0.0002  0.0003   0.2   0.1
  13..4      0.039   954.0   339.0  0.7520  0.0119  0.0158  11.3   5.3
  13..5      0.054   954.0   339.0  0.7520  0.0166  0.0221  15.8   7.5
  11..14     0.079   954.0   339.0  0.7520  0.0241  0.0321  23.0  10.9
  14..6      0.116   954.0   339.0  0.7520  0.0355  0.0473  33.9  16.0
  14..15     0.229   954.0   339.0  0.7520  0.0703  0.0934  67.0  31.7
  15..16     0.004   954.0   339.0  0.7520  0.0011  0.0015   1.1   0.5
  16..7      0.053   954.0   339.0  0.7520  0.0162  0.0216  15.5   7.3
  16..9      0.061   954.0   339.0  0.7520  0.0188  0.0251  18.0   8.5
  15..8      0.088   954.0   339.0  0.7520  0.0271  0.0360  25.8  12.2
  10..3      0.022   954.0   339.0  0.7520  0.0067  0.0089   6.4   3.0

Naive Empirical Bayes (NEB) analysis
Positively selected sites (*: P>95%; **: P>99%)
(amino acids refer to 1st sequence: Sdro)

            Pr(w>1)     post mean +- SE for w

    28 S      0.631         2.761
    42 L      0.899         3.551
   126 S      0.526         2.451
   129 Q      0.904         3.564
   139 K      0.927         3.632
   150 S      0.900         3.555
   205 P      0.934         3.652
   221 S      0.769         3.168
   248 T      0.772         3.177
   249 M      0.875         3.480
   255 M      0.500         2.362
   259 M      0.509         2.389
   279 G      0.551         2.514
   290 P      0.811         3.290
   304 L      0.983*        3.795
   305 E      0.785         3.214
   308 S      0.915         3.598
   310 I      0.998**       3.838
   320 K      0.536         2.483
   406 Q      0.532         2.457
   407 G      0.977*        3.779
   409 I      0.634         2.771
   411 L      0.912         3.588
   414 G      0.805         3.273

Bayes Empirical Bayes (BEB) analysis (Yang, Wong & Nielsen 2005. Mol. Biol. Evol. 22:1107-1118)
Positively selected sites (*: P>95%; **: P>99%)
(amino acids refer to 1st sequence: Sdro)

            Pr(w>1)     post mean +- SE for w

    24 S      0.574         2.154 +- 1.353
    28 S      0.806         2.836 +- 1.192
    42 L      0.955*        3.310 +- 0.942
   126 S      0.744         2.643 +- 1.238
   129 Q      0.955*        3.326 +- 0.972
   139 K      0.968*        3.360 +- 0.927
   142 S      0.520         1.994 +- 1.343
   150 S      0.955*        3.322 +- 0.960
   157 L      0.688         2.493 +- 1.333
   166 V      0.572         2.147 +- 1.355
   167 E      0.591         2.205 +- 1.356
   204 S      0.670         2.440 +- 1.345
   205 P      0.971*        3.375 +- 0.925
   211 D      0.634         2.331 +- 1.350
   221 S      0.883         3.091 +- 1.109
   237 K      0.505         1.951 +- 1.334
   246 G      0.544         2.066 +- 1.352
   248 T      0.885         3.097 +- 1.108
   249 M      0.941         3.278 +- 1.002
   255 M      0.689         2.501 +- 1.339
   259 M      0.696         2.521 +- 1.337
   279 G      0.726         2.614 +- 1.322
   285 Q      0.628         2.313 +- 1.352
   289 P      0.678         2.463 +- 1.341
   290 P      0.906         3.171 +- 1.080
   295 F      0.699         2.515 +- 1.304
   299 Q      0.653         2.387 +- 1.348
   302 T      0.563         2.121 +- 1.352
   303 N      0.599         2.227 +- 1.356
   304 L      0.993**       3.443 +- 0.857
   305 E      0.892         3.123 +- 1.100
   306 A      0.592         2.205 +- 1.353
   308 S      0.962*        3.346 +- 0.945
   310 I      0.999**       3.460 +- 0.841
   316 T      0.564         2.123 +- 1.353
   320 K      0.751         2.662 +- 1.231
   393 G      0.588         2.192 +- 1.353
   404 A      0.646         2.368 +- 1.351
   405 P      0.640         2.351 +- 1.353
   406 Q      0.713         2.572 +- 1.328
   407 G      0.991**       3.437 +- 0.867
   409 I      0.808         2.843 +- 1.191
   411 L      0.959*        3.341 +- 0.967
   414 G      0.902         3.160 +- 1.088
   416 S      0.598         2.225 +- 1.354
   418 Q      0.584         2.183 +- 1.426
   422 M      0.525         2.010 +- 1.344
   423 G      0.657         2.404 +- 1.350
   429 L      0.501         1.943 +- 1.340

The grid 

p0:   0.050  0.150  0.250  0.350  0.450  0.550  0.650  0.750  0.850  0.950
p :   0.100  0.300  0.500  0.700  0.900  1.100  1.300  1.500  1.700  1.900
q :   0.100  0.300  0.500  0.700  0.900  1.100  1.300  1.500  1.700  1.900
ws:   1.500  2.500  3.500  4.500  5.500  6.500  7.500  8.500  9.500 10.500

Posterior on the grid

p0:   0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.131  0.735  0.134
p :   0.073  0.173  0.237  0.244  0.173  0.071  0.020  0.006  0.002  0.001
q :   0.055  0.032  0.061  0.073  0.088  0.109  0.123  0.138  0.153  0.167
ws:   0.001  0.261  0.598  0.072  0.052  0.014  0.002  0.000  0.000  0.000
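
As a quick sanity check on the M8 output above, the dN/dS value of 0.7520 reported for every branch appears to be the proportion-weighted average of w over the 11 site classes. The minimal Python sketch below recomputes it from the p and w rows copied from the output; the function name `mean_omega` is made up for illustration and is not part of PAML.

```python
# Site-class proportions and w values copied from the "MLEs of dN/dS (w)
# for site classes (K=11)" block of the M8 output above.
p = [0.09056] * 10 + [0.09441]
w = [0.00000, 0.00076, 0.01038, 0.05682, 0.18875,
     0.43116, 0.71245, 0.90915, 0.98676, 0.99981, 3.84470]


def mean_omega(props, omegas):
    """Proportion-weighted average of w across site classes (illustrative helper)."""
    return sum(pi * wi for pi, wi in zip(props, omegas))


# The proportions should sum to roughly 1 (up to rounding in the printed output).
print(f"sum of proportions = {sum(p):.5f}")
# Compare this with the dN/dS column of the branch table.
print(f"average w          = {mean_omega(p, w):.4f}")
```

If the average agrees with the dN/dS column, the run completed and the M8 estimates were written out as expected.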

I ran CODEML on my PC with the header changed and it ran without issues. The printed messages on the terminal when running model 8 are the following:

NSsites Model 8: beta&w>1

TREE #  1
(1, ((2, (4, 5)), (6, ((7, 9), 8))), 3);   MP score: -1
initial w for M8:NSbetaw>1 reset.

ntime & nrate & np:    15     5    23
Qfactor_NS = 0.759931
Out..
lnL  = -3500.924560
2265 lfun, 27180 eigenQcodon, 373725 P(t)

BEBing (dim = 4).  This may take several minutes.
Calculating f(x_h|w): 10 categories 20 w sets.
Calculating f(X), the marginal likelihood.
        log(fX) = -3507.727807  S = -3242.366064  -261.361327
Calculating f(w|X), posterior probabilities of site classes.
        did 253 / 253 patterns   5:10
Time used:  5:10

I believe you mistook the did 253 / 253 patterns line for an error. It is only a progress message printed on the terminal to let you know how many site patterns in your input file have been processed.
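
For the tree-file header mentioned earlier, here is a minimal Python sketch that rewrites the header so the tree count matches the number of trees actually present. It assumes that the first non-empty line is a "number of species, number of trees" header and that every Newick tree ends with a semicolon. The function name `fix_tree_header` and the example string are hypothetical: the topology is taken from the codeml output above, not from the attached tree file.

```python
def fix_tree_header(text):
    """Return tree-file text whose header tree count matches the trees found.

    Assumptions (not checked against PAML itself): the first non-empty line
    is 'nspecies ntrees', and each Newick tree ends with ';'.
    """
    lines = text.strip().splitlines()
    header_fields = lines[0].split()
    body = "\n".join(lines[1:]).strip()
    n_species = int(header_fields[0])   # keep the species count as given
    n_trees = body.count(";")           # one ';' per Newick tree
    return f"{n_species} {n_trees}\n{body}\n"


# Stand-in example: a header claiming 1293 trees followed by a single tree.
example = (
    "9 1293\n"
    "(Sdro, ((Sfra, (Sint, Spur)), (Hpul, ((Mfra, Pdep), Mnud))), Spal);\n"
)
print(fix_tree_header(example).splitlines()[0])
```

Editing the header by hand works just as well for a single file; a script like this is mainly useful when many tree files share the same problem.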

Hope this helps!