arvestad / jprime

Probabilistic Inference of Molecular Evolution

Insufficient no. of discretization points #31

Open omwa227 opened 1 year ago

omwa227 commented 1 year ago

I am trying to run this program with my list of 167 species. Below are the two scripts I am running with sbatch; both give the same error message, which I will also attach.

JAR=/home/omwa227/scratch/bin/JPrIME/jprime-0.3.7.jar java -Xms256g -Xmx512g -jar jprime-0.3.7.jar Delirious Orthofinder_Species_tree_ITWORKSYAY CLOCK.a2m CLOCK_gene_to_host.txt -sm JTT -o MCMCjtt -lout

AND/OR

JAR=/home/omwa227/scratch/bin/JPrIME/jprime-0.3.7.jar java -Xms512m -Xmx1024m -jar jprime-0.3.7.jar Delirious Orthofinder_Species_tree_ITWORKSYAY CLOCK.a2m CLOCK_gene_to_host.txt -sm JTT -o MCMCjttAH -lout

Can you give me some insight as to what the problem is?

Image of error message: JPrIME_error

omwa227 commented 1 year ago

I was able to fix the above issue by adding "-maxlosses " to the end of my script. However, it gave me a new issue that I cannot seem to resolve:

"ERROR: Insufficient no. of discretization points. Try using denser discretization for 1) top edge, 2) remaining vertices.

Use option -h or --help to show usage. See .info file for more information."

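The .info file is attached below.

Image: jprime issue (https://user-images.githubusercontent.com/137554353/261424312-513ed51b-900b-4b39-9214-b758e300af79.png)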

arvestad commented 1 year ago

Hi,

Both errors are due to a lack of discretisation points: when the gene tree is reconciled to the species tree, there are not enough discretised time points on which to place the gene tree nodes. You have a fairly large dataset, for which the defaults are not well suited.

Have you found the wiki on GitHub? There are recommendations on the page https://github.com/arvestad/jprime/wiki/Using-the-DLRS-model, in particular under the "Discretization" heading. I suggest you try -dmin 20, and raise the number if needed. Larger values make for slower computation.

I would also suggest that you start with smaller versions of your dataset, and work your way up. We never had more than 30 species when we worked on this (mostly because it was hard to find large dated species trees back then).

Best regards, Lars
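
For readers who hit the same error, the advice above (start at -dmin 20 and raise the value if needed) can be sketched as a small shell loop. This is only an illustration under stated assumptions: the file names and the options -sm, -o and -lout are copied from the commands earlier in this thread, the -maxlosses option is left out because its value is not given here, the candidate values 40 and 80 are arbitrary, and it is assumed (not confirmed) that JPrIME exits with a non-zero status when it reports this error and that options may follow the input files as in the original commands.

```shell
#!/bin/sh
# Sketch: retry Delirious with increasing -dmin until one run succeeds.
# Everything except -dmin is copied from the commands posted in this thread.
# Assumption: JPrIME returns a non-zero exit status on the discretization error.

JAR="${JAR:-jprime-0.3.7.jar}"

# Print the command line for one -dmin value (first argument).
delirious_cmd() {
    echo "java -jar $JAR Delirious Orthofinder_Species_tree_ITWORKSYAY CLOCK.a2m CLOCK_gene_to_host.txt -sm JTT -o MCMCjtt -lout -dmin $1"
}

# Try each -dmin value in turn and stop at the first successful run.
# By default (DRY_RUN=1) the commands are only printed; set DRY_RUN=0 to run them.
try_dmin_values() {
    for dmin in "$@"; do
        cmd=$(delirious_cmd "$dmin")
        echo "$cmd"
        # In dry-run mode, just list the commands.
        [ "${DRY_RUN:-1}" = "1" ] && continue
        # Unquoted on purpose: the command string contains no spaces inside arguments.
        $cmd && return 0
    done
    # Dry run counts as success; a real run reaching this point means every value failed.
    [ "${DRY_RUN:-1}" = "1" ]
}

try_dmin_values 20 40 80
```

Run as-is, the script only prints the three candidate commands. With DRY_RUN=0 it launches them one after another, which on a dataset of this size is probably best done inside an sbatch job with the Java memory flags used above.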

omwa227 commented 1 year ago

Hi Lars,

Thank you very much for the advice! I will try adding -dmin 20 at the end of the script to see if this helps. While trying to get this dataset going, I ran another analysis with 10 species and did not encounter any issues. Alternatively, I can try running this with 30 species at a time to find what I am looking for.

If -dmin 20 works, I will be sure to let you know and send you the script I used, in case you would like to keep it for your records should someone else encounter a similar issue.

arvestad commented 1 year ago

Thanks, it would be great if you shared your experience.

Please note that 20 discretization points working on a smaller dataset does not guarantee that it works on the larger dataset, since there will be more gene tree nodes to place "on" the species branches.

Best, Lars
