blackrim / treePL

Phylogenetic penalized likelihood
https://github.com/blackrim/treePL/wiki
GNU General Public License v3.0

Failed setting feasible start rates/dates after 10 attempts. Aborting. #63

Open Puguang-Zhao opened 6 months ago

Puguang-Zhao commented 6 months ago

Hi Stephen,

We have run into a problem that has been reported in other issues, but we still don't know how to solve it in our case. Here are my files and the error output.

my treefile: TPSa-bootst100-RAxML_bipartitions.txt

my configuration file: TPSa_treefile.txt

To make uploading easier, I changed the file extensions to .txt.

And this is the treePL log:

outfile: TPSa.dated.tre
set thorough: true (MAY TAKE A WHILE)
setting opt: 1
setting optad: 1
setting optcvad: 5
setting the cv outfile to TPSa.dated_cv.tre
setting the maximum number of threads to 10
finished reading config file
using system clock for random number seed = 1709087874
tiny branch length at thatha@KAF5193752_1. setting to 0.00062460962
tiny branch length at tetsin@KAF8364721_1. setting to 0.00062460962
tiny branch length at tetsin@KAF8408920_1. setting to 0.00062460962
tiny branch length at acoramer@Acora_12G119300_1. setting to 0.00062460962
tiny branch length at acoramer@Acora_01G218700_1. setting to 0.00062460962
tiny branch length at acoramer@Acora_01G218500_1. setting to 0.00062460962
tiny branch length at acocal@ATA9_697_t1_p1. setting to 0.00062460962
tiny branch length at acoramer@Acora_01G221600_1. setting to 0.00062460962
tiny branch length at acoramer@Acora_M009700_1. setting to 0.00062460962
tiny branch length at acoramer@Acora_M009800_1. setting to 0.00062460962
tiny branch length at zizalati@GWHPBFHI019726. setting to 0.00062460962
tiny branch length at zizalati@GWHPBFHI038534. setting to 0.00062460962
tiny branch length at zizapalu@KAG8064347_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8064341_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8064354_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8064348_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8045380_1. setting to 0.00062460962
tiny branch length at triaes@Traes_5BS_1F3E56595_1. setting to 0.00062460962
tiny branch length at triaes@Traes_5AS_BC8784C1D_1. setting to 0.00062460962
tiny branch length at horvul@5HG0422690_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8082033_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8082032_1. setting to 0.00062460962
tiny branch length at zizapalu@KAG8062971_1. setting to 0.00062460962
tiny branch length at chlspi@Cspi04585. setting to 0.00062460962
tiny branch length at phobou@OF07734. setting to 0.00062460962
tiny branch length at phobou@OF17091. setting to 0.00062460962
tiny branch length at phobou@OF00003. setting to 0.00062460962
tiny branch length at phobou@OF17266. setting to 0.00062460962
tiny branch length at cinnkane@CKAN_02756100. setting to 0.00062460962
tiny branch length at cinnkane@CKAN_01723400. setting to 0.00062460962
tiny branch length at menlon@bhGene008020. setting to 0.00062460962
tiny branch length at menlon@bhGene020155. setting to 0.00062460962
tiny branch length at sholep@GKV36171_1. setting to 0.00062460962
tiny branch length at capgra@Cagra_9457s0001_1. setting to 0.00062460962
tiny branch length at caprub@Carubv10007667m. setting to 0.00062460962
tiny branch length at capgra@Cagra_2793s0006_1. setting to 0.00062460962
tiny branch length at tarhas@XP_010536315_1. setting to 0.00062460962
tiny branch length at tarhas@XP_010552108_1. setting to 0.00062460962
tiny branch length at tarhas@XP_010552109_1. setting to 0.00062460962
tiny branch length at tarhas@XP_010552110_1. setting to 0.00062460962
tiny branch length at poptri@Potri_019G045300_1. setting to 0.00062460962
tiny branch length at manesc@Manes_02G193940_1. setting to 0.00062460962
tiny branch length at avecar@GWHPABKE008969. setting to 0.00062460962
tiny branch length at triwil@XP_038697205_1. setting to 0.00062460962
tiny branch length at triwil@XP_038698076_1. setting to 0.00062460962
tiny branch length at triwil@XP_038697203_1. setting to 0.00062460962
tiny branch length at triwil@XP_038698075_1. setting to 0.00062460962
tiny branch length at procyn@KAJ4945401_1. setting to 0.00062460962
tiny branch length at macainte@XP_042478646_1. setting to 0.00062460962
tiny branch length at macainte@XP_042478645_1. setting to 0.00062460962
tiny branch length at macainte@XP_042478009_1. setting to 0.00062460962
tiny branch length at ficmic@GWHPABKV015537_1. setting to 0.00062460962
tiny branch length at carill@KAG2411278_1. setting to 0.00062460962
tiny branch length at salpur@Sapur_15ZG085700_1. setting to 0.00062460962
tiny branch length at salpur@Sapur_15WG103200_1. setting to 0.00062460962
tiny branch length at salpur@Sapur_15WG103000_1. setting to 0.00062460962
tiny branch length at salpur@Sapur_15ZG085500_1. setting to 0.00062460962
tiny branch length at ficmic@GWHPABKV025077_1. setting to 0.00062460962
tiny branch length at maldom@MD03G1214900. setting to 0.00062460962
tiny branch length at cisrot@GWHPBOWK015939. setting to 0.00062460962
tiny branch length at vitvin@VIT_218s0001g05290_1. setting to 0.00062460962
tiny branch length at vitvin@VIT_218s0001g05460_1. setting to 0.00062460962
tiny branch length at vitvin@VIT_218s0001g04110_1. setting to 0.00062460962
tiny branch length at vitvin@VIT_218s0001g05510_1. setting to 0.00062460962
tiny branch length at paeost@Pos_gene19351_mRNA_1. setting to 0.00062460962
tiny branch length at paeost@Pos_gene19352_mRNA_1. setting to 0.00062460962
tiny branch length at paeost@Pos_gene27463_mRNA_1. setting to 0.00062460962
tiny branch length at paeost@Pos_gene67214_mRNA_1. setting to 0.00062460962
tiny branch length at paeost@Pos_gene16440_mRNA_1. setting to 0.00062460962
tiny branch length at pasedu@ZX_09G0003250. setting to 0.00062460962
tiny branch length at pasedu@ZX_09G0003270. setting to 0.00062460962
tiny branch length at carill@KAG2682109_1. setting to 0.00062460962
tiny branch length at carill@KAG2682104_1. setting to 0.00062460962
tiny branch length at carill@KAG2682108_1. setting to 0.00062460962
tiny branch length at fraves@FvH4_5g35712_t1. setting to 0.00062460962
tiny branch length at pruper@Prupe_4G191000_1. setting to 0.00062460962
tiny branch length at corcit@KAF7848639_1. setting to 0.00062460962
tiny branch length at corcit@KAF8021084_1. setting to 0.00062460962
tiny branch length at corcit@KAF8026085_1. setting to 0.00062460962
tiny branch length at corcit@KAF8026086_1. setting to 0.00062460962
tiny branch length at corcit@KAF8028513_1. setting to 0.00062460962
tiny branch length at corcit@KAF8019317_1. setting to 0.00062460962
tiny branch length at corcit@KAF8019318_1. setting to 0.00062460962
tiny branch length at simchi@GWHPAASQ028832. setting to 0.00062460962
tiny branch length at simchi@GWHPAASQ033900. setting to 0.00062460962
tiny branch length at sholep@GKV50434_1. setting to 0.00062460962
tiny branch length at gosrai@Gorai_009G428400_1. setting to 0.00062460962
tiny branch length at goshir@Gohir_A04G023100_1. setting to 0.00062460962
tiny branch length at goshir@Gohir_A04G023166_1. setting to 0.00062460962
tiny branch length at goshir@Gohir_D13G147000_1. setting to 0.00062460962
tiny branch length at gosrai@Gorai_013G164800_1. setting to 0.00062460962
tiny branch length at gosrai@Gorai_013G164900_1. setting to 0.00062460962
tiny branch length at goshir@Gohir_D11G292100_2. setting to 0.00062460962
tiny branch length at gosrai@Gorai_002G080500_1. setting to 0.00062460962
tiny branch length at gosrai@Gorai_007G315300_1. setting to 0.00062460962
tiny branch length at gosrai@Gorai_007G315100_1. setting to 0.00062460962
tiny branch length at carill@KAG2677982_1. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
tiny branch length at internal node. setting to 0.00062460962
setting Angiosperms_base min: 200.9
setting Eudicots min: 80
setting Monocots min: 124
setting Angiosperms_base max: 245
setting Eudicots max: 120
setting Monocots max: 183
preorder prep
calculating character durations
setting min and max
setting up all constraints
getting feasible start dates
start rate 0.0087380041
numparams:1600
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0093328011
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0091917534
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0097291754
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0092233154
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0089157373
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0091837121
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.01063695
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0105023
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.009210068
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0092979705
initial calc: 1e+15
Failed setting feasible start rates/dates after 10 attempts. Aborting.

To rule out a problem caused by having too many sequences, I also tried another tree file: TPSc.raxml_bs_tree.txt

my configuration file: TPSc_treefile.txt

And this is the treePL log:

outfile: TPSc.dated.tre
set thorough: true (MAY TAKE A WHILE)
setting the maximum number of threads to 10
finished reading config file
using system clock for random number seed = 1709089321
tiny branch length at zizapalu@KAG8071143_1. setting to 0.0029069767
tiny branch length at zizapalu@KAG8071138_1. setting to 0.0029069767
tiny branch length at triaes@Traes_4BS_7DD12F787_1. setting to 0.0029069767
tiny branch length at triaes@Traes_4BS_659C7D2C2_1. setting to 0.0029069767
tiny branch length at zeamay@ZmLH145_K018100_1. setting to 0.0029069767
tiny branch length at spolyrhiza290@Spipo2G0068100. setting to 0.0029069767
tiny branch length at euryfero@GWHPBFHH032581. setting to 0.0029069767
tiny branch length at phypat@Pp3c20_21880V3_1. setting to 0.0029069767
tiny branch length at phypat@Pp3c7_1880V3_1. setting to 0.0029069767
tiny branch length at sphafall@Sphfalx03G011900_1. setting to 0.0029069767
tiny branch length at sphafall@Sphfalx03G014700_1. setting to 0.0029069767
tiny branch length at torgra@TG1g01311_mRNA1. setting to 0.0029069767
tiny branch length at torgra@TG1g01312_mRNA1. setting to 0.0029069767
tiny branch length at aqucoe@Aqcoe4G266100_1. setting to 0.0029069767
tiny branch length at aqucoe@Aqcoe4G266600_1. setting to 0.0029069767
tiny branch length at thatha@KAF5189308_1. setting to 0.0029069767
tiny branch length at tetsin@KAF8401694_1. setting to 0.0029069767
tiny branch length at carill@KAG2693657_1. setting to 0.0029069767
tiny branch length at carill@KAG2693656_1. setting to 0.0029069767
tiny branch length at sholep@GKU86579_1. setting to 0.0029069767
tiny branch length at goshir@Gohir_D08G075200_1. setting to 0.0029069767
tiny branch length at triwil@XP_038697212_1. setting to 0.0029069767
tiny branch length at triwil@XP_038698077_1. setting to 0.0029069767
tiny branch length at triwil@XP_038691829_1. setting to 0.0029069767
tiny branch length at capgra@Cagra_8967s0012_1. setting to 0.0029069767
tiny branch length at caprub@Carubv10003225m. setting to 0.0029069767
tiny branch length at acocal@ATA8_157_t1. setting to 0.0029069767
tiny branch length at acocal@ATA8_10_t1. setting to 0.0029069767
tiny branch length at acocal@ATA8_12_t1. setting to 0.0029069767
tiny branch length at acocal@ATA8_156_t1. setting to 0.0029069767
tiny branch length at zizapalu@KAG8058713_1. setting to 0.0029069767
tiny branch length at zizapalu@KAG8058714_1. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
tiny branch length at internal node. setting to 0.0029069767
setting Bryophytes min: 420
setting Eudicots min: 120
setting Ferns min: 396.4
setting Gymnosperms min: 300
setting Monocots min: 124
setting Bryophytes max: 520
setting Eudicots max: 100
setting Ferns max: 420.7
setting Gymnosperms max: 340
setting Monocots max: 183
preorder prep
calculating character durations
setting min and max
setting up all constraints
getting feasible start dates
start rate 0.0033390289
numparams:343
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0036162978
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0037816884
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0034654809
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0033296861
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0037392006
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.003661277
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0038417466
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0035898484
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0037175539
initial calc: 1e+15
problem initializing. trying again.
attempting to get feasible start rates/dates.
new start rate 0.0035092963
initial calc: 1e+15
Failed setting feasible start rates/dates after 10 attempts. Aborting.

We suspect the problem lies in the calibration points or the tree file. Thank you very much; I hope you can take a look and reply.

josephwb commented 6 months ago

This is almost certainly an issue with your constraints. Normally the problem is, e.g., an ancestral node that is constrained to be younger than its descendant. I have code (somewhere!) to check for these kinds of errors; I'll try to find it and post it somewhere useful, as this seems to be a common problem.
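One cheap check needs no tree at all: within a single calibration, the minimum age should not exceed the maximum age. The following is a minimal sketch, not the constraint checker mentioned above. It assumes the log format shown in this thread (`setting NAME min: X` / `setting NAME max: Y`); the function names `parse_bounds` and `inverted_bounds` are made up for illustration.

```python
import re

# Matches log lines such as "setting Eudicots min: 80" (format as pasted in this thread).
BOUND_LINE = re.compile(r"^setting (\S+) (min|max): ([0-9.eE+-]+)\s*$")


def parse_bounds(log_text):
    """Return {calibration_name: {"min": float, "max": float}} from treePL log text."""
    bounds = {}
    for line in log_text.splitlines():
        match = BOUND_LINE.match(line.strip())
        if match:
            name, kind, value = match.groups()
            bounds.setdefault(name, {})[kind] = float(value)
    return bounds


def inverted_bounds(bounds):
    """Return the names of calibrations whose minimum age exceeds their maximum age."""
    return sorted(
        name
        for name, b in bounds.items()
        if "min" in b and "max" in b and b["min"] > b["max"]
    )


# Calibration lines copied from the second (TPSc) log in this thread.
tpsc_log = """\
setting Bryophytes min: 420
setting Eudicots min: 120
setting Ferns min: 396.4
setting Gymnosperms min: 300
setting Monocots min: 124
setting Bryophytes max: 520
setting Eudicots max: 100
setting Ferns max: 420.7
setting Gymnosperms max: 340
setting Monocots max: 183
"""

print(inverted_bounds(parse_bounds(tpsc_log)))
```

Run on the TPSc log lines, this reports Eudicots, whose min (120) is larger than its max (100). That may not be the only problem, but it is worth fixing before looking at anything else.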

In other instances (and yours, I suspect) it has to do with monophyly problems. For instance, in bootstrap replicate trees you cannot guarantee that "clade X" will form an exclusive clade in any given tree. In your case, I suspect that there is a problem (either a typo or a 'rogue taxon') involved with either (or both) Eudicots or Monocots; one species in the wrong clade will mean the analysis cannot even start. If a typo is the culprit, then the solution is simple: make your mrca statements more concise. A given statement is used to identify a single node, so you need only provide 2 tip names (i.e., that share a mrca at the node of interest). Your Eudicot statement has 1152 names!
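For the typo case specifically, a quick way to narrow things down is to compare the names in each mrca statement against the tip labels actually present in the tree. The sketch below assumes config lines of the form `mrca = NAME tip1 tip2 ...` and a plain Newick tree with unquoted labels; the helper names and the tip names in the example are hypothetical. It does not test monophyly, only whether each listed name exists.

```python
import re


def tip_labels(newick):
    """Collect tip labels from a Newick string (assumes unquoted labels)."""
    # A tip label directly follows "(" or ","; internal node labels follow ")".
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def mrca_statements(config_text):
    """Yield (name, [tips]) for each 'mrca = NAME tip1 tip2 ...' line."""
    for line in config_text.splitlines():
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = value.split()
        if key == "mrca" and parts:
            yield parts[0], parts[1:]


def unknown_tips(newick, config_text):
    """Return {mrca_name: [names not found among the tree's tips]}."""
    present = tip_labels(newick)
    report = {}
    for name, listed in mrca_statements(config_text):
        missing = [tip for tip in listed if tip not in present]
        if missing:
            report[name] = missing
    return report


# Hypothetical tree and config, for illustration only.
tree = "((tipA:0.1,tipB:0.2)90:0.3,(tipC:0.1,tipD:0.1):0.2);"
config = """\
mrca = CladeX tipA tipB
mrca = CladeY tipC tipE
"""
print(unknown_tips(tree, config))
```

With only two tips per statement there are only two names to get wrong, which is part of why short mrca statements are easier to debug than a list of 1152.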

Also, is your numsites value (1601) correct? That is exactly the same as the number of tips on your tree, which seems suspicious.

Anyway, I will try to find my constraint-checker code and report back :)

josephwb commented 6 months ago

I notice that your tree is unrooted, which might be the entire problem (but still, simplify your mrca statements!). The tip labels are indecipherable to me, so I don't know how it should be rooted. If you could provide this information (either a single tip, or an outgroup clade that you know is present in the tree) then this will speed up the troubleshooting.
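A rough way to spot an unrooted tree without opening a viewer is to count the children of the outermost node in the Newick string: unrooted trees are conventionally written with three (or more) children at the top level, while a rooted binary tree has two. The sketch below relies on that convention and assumes unquoted labels with no bracketed comments; `root_degree` and `looks_unrooted` are illustrative names, not part of treePL.

```python
def root_degree(newick):
    """Count the children of the outermost node in a Newick string."""
    depth = 0
    children = 0
    for ch in newick.strip():
        if ch == "(":
            depth += 1
            if depth == 1:
                children = 1  # the first child starts right after the opening "("
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            children += 1  # each top-level comma separates two children of the root
        elif ch == ";":
            break
    return children


def looks_unrooted(newick):
    """Heuristic: a top-level node with three or more children suggests an unrooted tree."""
    return root_degree(newick) >= 3


# Hypothetical trees, for illustration only.
print(looks_unrooted("((A:1,B:1):1,C:1,D:1);"))      # basal trifurcation
print(looks_unrooted("((A:1,B:1):1,(C:1,D:1):1);"))  # bifurcating root
```

This is only a heuristic: a tree can have two children at the top level and still be rooted in the wrong place, so the outgroup information requested above is still needed.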

josephwb commented 6 months ago

My constraint-checking code requires a correctly-rooted tree, so I am not able to proceed with checking either the clade definitions or a possible incompatibility amongst constraints.

I tried rerooting myself using tips labelled as 'Angiosperms_base', but things still did not run. I suspect that there is an error in either Eudicots or Monocots, as when I comment out the constraint for one of these at a time (say, on this semi-intelligently rooted tree) then things run fine.

TiagoBelintani commented 1 month ago

This is almost certainly an issue with your constraints. Normally the problem is, e.g., an ancestral node that is constrained to be younger than its descendant. I have code (somewhere!) to check for these kinds of errors; I'll try to find it and post it somewhere useful, as this seems to be a common problem.

In other instances (and yours, I suspect) it has to do with monophyly problems. For instance, in bootstrap replicate trees you cannot guarantee that "clade X" will form an exclusive clade in any given tree. In your case, I suspect that there is a problem (either a typo or a 'rogue taxon') involved with either (or both) Eudicots or Monocots; one species in the wrong clade means the analysis cannot even start. If a typo is the culprit, then the solution is simple: make your mrca statements more concise. A given statement is used to identify a single node, so you need only provide 2 tip names (i.e., that share a mrca at the node of interest). Your Eudicot statement has 1152 names!

Also, is your numsites value (1601) correct? That is exactly the same as the number of tips on your tree, which seems suspicious.

Anyway, I will try to find my constraint-checker code and report back :)

Hi, can you share the R script with me? I have the same problem.

josephwb commented 1 month ago

@TiagoBelintani I will search for it tomorrow. Please ping me if I take too long to respond ;)

josephwb commented 1 month ago

@TiagoBelintani I found the code. It is currently part of a very specific pipeline, so I will have to edit it to be more generally useful. I hope to post it later today.

josephwb commented 1 month ago

@TiagoBelintani @Puguang-Zhao I've added the (hurriedly-edited) constraint checking R code here. It checks for redundancies (i.e., constraints that apply to the same node; often an issue with defining MRCAs) and invalid constraints (e.g., a descendant node with a larger minimum age than that of one of its calibrated ancestral nodes).

There are example data in that directory, so you will know how to format your own data to run the code. The tree contains node labels (you can view these in FigTree) which match the constraint data so you can follow what is going on / understand why decisions are reached.

The code only really checks minimum ages (not maximum ages) because these concern issues of consistency (i.e., will treePL run at all?). Maximum ages are a different beast; overlapping maximum ages do not necessarily invalidate things the way conflicting minimum ages do. If I can think of an intelligent way to deal with maximum ages, I will update the code, and put new problems in the example data.
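To illustrate the kind of minimum-age check described here, the following is a small Python sketch. It is not the R code linked above and makes no claim to match it. It assumes the calibrated nodes and their parent relationships have already been extracted from a correctly rooted tree into a plain dictionary; the node names, ages and the function `conflicting_minimums` are all hypothetical. As an extra that the description above does not mention, it also flags a descendant minimum that exceeds an ancestor's maximum.

```python
def ancestors(node, parent):
    """Yield the ancestors of `node`, nearest first (assumes `parent` has no cycles)."""
    while node in parent:
        node = parent[node]
        yield node


def conflicting_minimums(parent, min_age, max_age=None):
    """Return (descendant, ancestor, reason) tuples for inconsistent calibrations.

    parent  : {node: parent_node} for the calibrated part of a rooted tree
    min_age : {node: minimum age}
    max_age : optional {node: maximum age}
    """
    max_age = max_age or {}
    problems = []
    for node, node_min in min_age.items():
        for anc in ancestors(node, parent):
            # Case described above: descendant minimum larger than an ancestor's minimum.
            if anc in min_age and node_min > min_age[anc]:
                problems.append((node, anc, "min exceeds ancestor min"))
            # Extra check (an addition in this sketch): descendant minimum above ancestor maximum.
            if anc in max_age and node_min > max_age[anc]:
                problems.append((node, anc, "min exceeds ancestor max"))
    return problems


# Hypothetical calibrations, for illustration only.
parent = {"cladeA": "crown", "cladeB": "crown"}
min_age = {"crown": 100.0, "cladeA": 120.0, "cladeB": 80.0}
max_age = {"crown": 110.0}

for problem in conflicting_minimums(parent, min_age, max_age):
    print(problem)
```

Keeping the tree as a simple child-to-parent map keeps the check independent of any particular tree library, at the cost of having to build that map separately.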

I hope this is readable. The pipeline to which the code originally belonged involved a taxonomic database, choosing representative taxa with the most genetic coverage, etc. It seems to run with the barebones example data. If you find something missing, or something you would like added, please let me know.

Puguang-Zhao commented 1 month ago

@josephwb Thank you very much!!