stephaneguindon / phyml

PhyML -- Phylogenetic estimation using (Maximum) Likelihood
GNU General Public License v3.0

Unexpected number of bootstrap replicates #183

Closed Ivan-Pchelin closed 1 year ago

Ivan-Pchelin commented 1 year ago

Dear Stephane,

I tried PhyML version 3.3.3:3.3.20211231-1 on Ubuntu 22.04.2. The CPU has 6 physical cores and 12 threads. The problem is that the "-c" option does not work. Every time the program runs the analysis on 6 CPUs and calculates 102 bootstrap replicates instead of 100.

Best wishes, Ivan

Ivan-Pchelin commented 1 year ago

The computer was tested for the ability to use a defined number of threads, up to 12, with vConTACT2 0.11.3.

liamxg commented 1 year ago

@Ivan-Pchelin I have never seen this issue.

stephaneguindon commented 1 year ago

This is the behaviour expected according to our implementation. Indeed, the actual number of bootstrap replicates is the smallest multiple of the number of CPU cores that is greater than or equal to the number requested by the user. Is it an issue?
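
The rounding rule described above can be sketched as follows. This is an illustrative Python snippet, not PhyML's actual C code; the function name `actual_replicates` and its parameters are hypothetical, and it assumes the rule applies exactly as stated in this comment.

```python
def actual_replicates(requested: int, n_procs: int) -> int:
    """Smallest multiple of n_procs that is >= requested.

    Hypothetical helper illustrating the rule stated in this thread;
    it is not taken from the PhyML source.
    """
    if requested < 0 or n_procs < 1:
        raise ValueError("requested must be >= 0 and n_procs must be >= 1")
    # Integer ceiling division, then scale back up to a multiple of n_procs.
    return ((requested + n_procs - 1) // n_procs) * n_procs


print(actual_replicates(100, 6))   # 102, the count reported in this issue
print(actual_replicates(100, 4))   # 100, already a multiple of 4
print(actual_replicates(100, 12))  # 108
```

Under this rule, the requested count is honoured exactly only when it is already a multiple of the number of processes, as with 100 replicates on 4 processes.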

Ivan-Pchelin commented 1 year ago

@liamxg Thank you for bringing this to my attention. I followed the steps below and managed to use a specified number of processors.

  1. Download the code from https://github.com/stephaneguindon/phyml
  2. Install phyml-mpi as described on page 7 of the manual. Resolve any problem with mpi.h as described on that page.
  3. Use the command given on page 8 of the manual. For example, mpirun -n 4 ./phyml-mpi -i myseq -b 100.

Ivan-Pchelin commented 1 year ago

@stephaneguindon Dear Stephane, thank you for your reply. It may not be a big issue. But probably it is an inconvenience. It adds an extra thing to understand and deal with. I believe not all users have to know everything about the software.

liamxg commented 1 year ago

@stephaneguindon Do you mean that MPI can change the number of replicates?

liamxg commented 1 year ago

@Ivan-Pchelin I cannot install the MPI version. Any help?

liamxg commented 1 year ago

> This is the behaviour expected according to our implementation. Indeed, the actual number of bootstrap replicates is the smallest multiple of the number of CPU cores that is greater than or equal to the number requested by the user. Is it an issue?

This is interesting to me.

Ivan-Pchelin commented 1 year ago

@liamxg Do you work with Linux? If so, do you have the packages autoconf, automake and pkg-config installed? What exactly is the problem? The manual seems to be very good.

liamxg commented 1 year ago

@Ivan-Pchelin I use Mac.

stephaneguindon commented 1 year ago

On Mac with autotools and open-mpi (or mpich) installed, the following should work (this is pretty much the set of instructions already given in the README file): git clone git@github.com:stephaneguindon/phyml.git; cd phyml/; sh ./autogen.sh; ./configure --enable-phyml-mpi; make;

stephaneguindon commented 1 year ago

> @stephaneguindon Dear Stephane, thank you for your reply. It may not be a big issue. But probably it is an inconvenience. It adds an extra thing to understand and deal with. I believe not all users have to know everything about the software.

I agree this is not ideal. However, I could not find any good reason why someone would not want to obtain a slightly larger number of bootstrap replicates for exactly the same time of computation. I should probably amend the manual in order to explain this though.

liamxg commented 1 year ago

> On Mac with autotools and open-mpi (or mpich) installed, the following should work (this is pretty much the set of instructions already given in the README file): git clone git@github.com:stephaneguindon/phyml.git; cd phyml/; sh ./autogen.sh; ./configure --enable-phyml-mpi; make;

thanks, I will try this now.

Ivan-Pchelin commented 1 year ago

> I agree this is not ideal. However, I could not find any good reason why someone would not want to obtain a slightly larger number of bootstrap replicates for exactly the same time of computation. I should probably amend the manual in order to explain this though.

The reason is the need to disclose the number of bootstrap replicates performed. There would be an immediate question as to why there were, say, 102 replicates instead of the usual 100. If one rounds the number of replicates, there is also a difficulty. Also, if I usually take 75% bootstrap support as a reliability threshold, what should I think when I get 75 out of 102? It adds unnecessary complexity.
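
The arithmetic behind this concern can be made explicit with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that support is read as a raw count of replicates containing a branch, which then has to be converted to a percentage of the replicates actually run.

```python
import math


def support_percent(count: int, replicates: int) -> float:
    """Convert a raw bootstrap count into a percentage of replicates run."""
    return 100.0 * count / replicates


def min_count_for_threshold(threshold_percent: float, replicates: int) -> int:
    """Smallest raw count that reaches the given percentage threshold."""
    return math.ceil(threshold_percent * replicates / 100)


# A count of 75 meets a 75% threshold with 100 replicates ...
print(round(support_percent(75, 100), 1))  # 75.0
# ... but falls just short of it with 102 replicates.
print(round(support_percent(75, 102), 1))  # 73.5
# With 102 replicates, 77 supporting replicates are needed to reach 75%.
print(min_count_for_threshold(75, 102))    # 77
```

With 100 replicates the raw count and the percentage coincide; with 102 they do not, which is the extra step described above.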

liamxg commented 1 year ago

@stephaneguindon @Ivan-Pchelin error

[image not preserved]

liamxg commented 1 year ago

when I use the command make.

stephaneguindon commented 1 year ago

You need to install open-mpi or mpich on your computer.

stephaneguindon commented 1 year ago

> The reason is the need to disclose the number of bootstrap replicates performed. There would be an immediate question as to why there were, say, 102 replicates instead of the usual 100. If one rounds the number of replicates, there is also a difficulty. Also, if I usually take 75% bootstrap support as a reliability threshold, what should I think when I get 75 out of 102? It adds unnecessary complexity.

Agreed. I'll amend the code accordingly.

liamxg commented 1 year ago

This is caused by using the MPI version, right?

liamxg commented 1 year ago

> You need to install open-mpi or mpich on your computer.

Is there a link for installing open-mpi or mpich?

stephaneguindon commented 1 year ago

brew install openmpi

liamxg commented 1 year ago

@stephaneguindon thanks.

stephaneguindon commented 1 year ago

Fixed in #187