Closed: martinsimon88 closed this issue 7 years ago.
Hello,
that is strange. Are you trying the toy problem located at Spearmint/examples/moo/? If not, what is the dimensionality of your problem?
Hello and thanks for the fast response!
I am using the moo example. :)
Best regards, Martin
On my laptop it takes 10 seconds to do an iteration, so 50 should not take longer than 10 minutes. Can you check what the bottleneck is in your case? The computations done are:
Step 1 - Run EP 10 times until convergence (one run per sample of the hyper-parameters).
Step 2 - Optimize the acquisition function.
Ideally you should see 10 runs of EP to convergence on the screen; each line is an EP iteration. After running EP, the acquisition function is optimized. I do not have NLopt installed, so in my case scipy is used instead.
Hello! I uninstalled NLopt, but it still takes ca. 1 min 30 sec per iteration.
I guess most of the time is spent on Step 1. It prints the following ca. 10 times:
Computing PESM for branin_1, branin_2
1: change=1.829144 damping: 0.500000
2: change=2.487438 damping: 0.495000
3: change=0.252376 damping: 0.490050
4: change=0.045401 damping: 0.485149
5: change=0.049822 damping: 0.480298
6: change=0.034507 damping: 0.475495
7: change=0.032269 damping: 0.470740
8: change=0.039005 damping: 0.466033
9: change=0.034097 damping: 0.461372
10: change=0.029572 damping: 0.456759
11: change=0.024536 damping: 0.452191
12: change=0.020833 damping: 0.447669
13: change=0.017341 damping: 0.443192
14: change=0.014655 damping: 0.438761
15: change=0.012268 damping: 0.434373
16: change=0.010359 damping: 0.430029
17: change=0.008712 damping: 0.425729
18: change=0.007362 damping: 0.421472
19: change=0.006215 damping: 0.417257
20: change=0.005261 damping: 0.413084
21: change=0.004455 damping: 0.408953
22: change=0.003779 damping: 0.404864
23: change=0.003209 damping: 0.400815
24: change=0.002729 damping: 0.396807
25: change=0.002323 damping: 0.392839
26: change=0.001980 damping: 0.388911
27: change=0.001690 damping: 0.385022
28: change=0.001444 damping: 0.381171
29: change=0.001235 damping: 0.377360
30: change=0.001058 damping: 0.373586
31: change=0.000907 damping: 0.369850
.......................................................................................................
.......................................................................................................
Ok, my machine is an Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz, which could be slightly faster. Are you running Spearmint in a virtual machine? That may degrade performance. The only other thing I can think of is the library for linear algebra operations. Since the bottleneck of EP is matrix operations, using a better linear algebra library may give you better performance. I have OpenBLAS installed. You may try to install OpenBLAS and then recompile numpy and scipy so that they use that library for matrix operations. Here is some info about how to do that:
https://leemendelowitz.github.io/blog/installing-numpy-with-openblas.html
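A quick way to verify that numpy actually picked up OpenBLAS after recompiling is to inspect its build configuration and time a large matrix product (a minimal check, nothing Spearmint-specific):

import time
import numpy

# Shows the BLAS/LAPACK libraries numpy was built against; OpenBLAS paths
# should appear in the output if the recompilation worked.
numpy.__config__.show()

# A 1000x1000 matrix product should take well under a second with a tuned BLAS.
a = numpy.random.randn(1000, 1000)
t0 = time.time()
numpy.dot(a, a)
print("1000x1000 matmul: %.3f s" % (time.time() - t0))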
Hello, I have an Intel® Core™ i7-3720QM CPU @ 2.60GHz × 8 and my OS is installed directly on an SSD. I will try OpenBLAS, but can it be the source of such a big difference? Furthermore, it runs fast when using PES.
Martin
OpenBLAS did not speed things up; it still takes ca. 1-2 min per iteration. What else might it be?
Thanks, Martin
Ok, I would suggest some tests to check where the bottleneck is.
Can you check how much time it takes EP to converge on the first run (normally there are 10 runs of EP at each iteration of Spearmint)? You simply have to edit the file
spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py
go to line 357, to the definition of the "ep" function, and introduce these lines before the beginning of the while loop located at line 433:
import time
start = time.time()
Then, at the end of the while loop (line 495), add these lines with the same indentation as line 496:
end = time.time()
import pdb; pdb.set_trace()
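Putting the two edits together, the instrumented function would look roughly like this (a schematic sketch only; the loop body below is a stand-in for the actual EP factor updates around lines 433-495):

import time

def ep_instrumented():
    # Stand-in for the real ep() in predictive_entropy_search_multiobjective.py.
    start = time.time()              # added just before the while loop (line 433)
    change = 1.0
    while change > 1e-3:             # placeholder for the EP convergence loop
        change *= 0.8                # placeholder for the damped EP updates
    end = time.time()                # added at the end of the loop (line 495)
    import pdb; pdb.set_trace()      # type `end - start` at the (Pdb) prompt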
Run Spearmint on the toy problem. It will stop at the breakpoint after a run of EP. There, you can type
end - start
which will give you the time taken. In my case it gives this output:
dani@dani-HP-OMEN:/tmp/Spearmint-PESM/examples/moo$ python /tmp/Spearmint-PESM/spearmint/main.py .
Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/usr/local/lib/python2.7/dist-packages/IPython/kernel/__init__.py:13: ShimWarning: The IPython.kernel package has been deprecated. You should import from ipykernel or jupyter_client instead.
  "You should import from ipykernel or jupyter_client instead.", ShimWarning)
Fitting GP to 1 data for branin_1 task...
Fitting GP to 1 data for branin_2 task...
Solving Multi-objective global optimization of posterior means!
Global Optimization finished. Size of the pareto set: 1 opt. 1 observed.
Getting suggestion for branin_1, branin_2...
Computing PESM for branin_1, branin_2
1: change=0.398697 damping: 0.500000
2: change=1.001192 damping: 0.495000
3: change=0.196313 damping: 0.490050
4: change=0.130776 damping: 0.485149
5: change=0.063683 damping: 0.480298
6: change=0.030000 damping: 0.475495
7: change=0.019682 damping: 0.470740
8: change=0.023561 damping: 0.466033
9: change=0.030155 damping: 0.461372
10: change=0.036471 damping: 0.456759
11: change=0.038597 damping: 0.452191
12: change=0.038628 damping: 0.447669
13: change=0.036680 damping: 0.443192
14: change=0.033220 damping: 0.438761
15: change=0.028888 damping: 0.434373
16: change=0.024297 damping: 0.430029
17: change=0.019908 damping: 0.425729
18: change=0.015989 damping: 0.421472
19: change=0.012651 damping: 0.417257
20: change=0.009901 damping: 0.413084
21: change=0.007688 damping: 0.408953
22: change=0.005938 damping: 0.404864
23: change=0.004805 damping: 0.400815
24: change=0.004033 damping: 0.396807
25: change=0.003335 damping: 0.392839
26: change=0.002146 damping: 0.388911
27: change=0.001885 damping: 0.385022
28: change=0.001622 damping: 0.381171
29: change=0.001022 damping: 0.377360
30: change=0.001374 damping: 0.373586
31: change=0.001717 damping: 0.369850
32: change=0.001883 damping: 0.366152
33: change=0.001576 damping: 0.362490
34: change=0.001265 damping: 0.358865
35: change=0.000974 damping: 0.355277
/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py(504)ep()
-> return a
(Pdb) end - start
0.25279808044433594
(Pdb)
So you can see that EP takes only 0.25 seconds to converge on my laptop. What do you get in your case?
Hello!
I get 0.55
Martin
Ok, it is slower, but not by much. Can you now check the time taken to evaluate the acquisition function on the grid of points used to solve the optimization problem? You only need to do something similar. In particular, in the original file (that is, before the changes described above)
spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py
before line 2967, can you add the code
import time
start = time.time()
and then, after line 2968, can you add the code:
end = time.time()
if cand.shape[ 0 ] > 1:
    import pdb; pdb.set_trace()
That will stop Spearmint after the evaluation of the acquisition function on the grid of points and measure the time taken. In particular, I get this result:
dani@dani-HP-OMEN:/tmp/Spearmint-PESM/examples/moo$ python /tmp/Spearmint-PESM/spearmint/main.py .
Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/usr/local/lib/python2.7/dist-packages/IPython/kernel/__init__.py:13: ShimWarning: The IPython.kernel package has been deprecated. You should import from ipykernel or jupyter_client instead.
  "You should import from ipykernel or jupyter_client instead.", ShimWarning)
Getting suggestion for branin_1, branin_2...
Suggestion:
NAME TYPE VALUE
X float 3.803711
Y float 12.583008
Submitted job 1 for tasks(s) branin_1, branin_2 with local scheduler (process id: 4912).
Current time: 2017-04-27 14:16:22
Status: 1 pending, 0 complete.
ID(s) of pending job(s) for branin_1, branin_2: 1
Waiting for results...
Fitting GP to 1 data for branin_1 task...
Fitting GP to 1 data for branin_2 task...
Solving Multi-objective global optimization of posterior means!
Global Optimization finished. Size of the pareto set: 1 opt. 1 observed.
Getting suggestion for branin_1, branin_2...
Computing PESM for branin_1, branin_2
1: change=0.978531 damping: 0.500000
2: change=1.038956 damping: 0.495000
3: change=0.609380 damping: 0.490050
4: change=0.224851 damping: 0.485149
5: change=0.969918 damping: 0.480298
6: change=0.233555 damping: 0.475495
7: change=0.089802 damping: 0.470740
8: change=0.061536 damping: 0.466033
9: change=0.040302 damping: 0.461372
10: change=0.035127 damping: 0.456759
11: change=0.029982 damping: 0.452191
12: change=0.027289 damping: 0.447669
13: change=0.025151 damping: 0.443192
14: change=0.023600 damping: 0.438761
15: change=0.021691 damping: 0.434373
16: change=0.019590 damping: 0.430029
17: change=0.017464 damping: 0.425729
18: change=0.015408 damping: 0.421472
19: change=0.013485 damping: 0.417257
20: change=0.011728 damping: 0.413084
21: change=0.010152 damping: 0.408953
22: change=0.008756 damping: 0.404864
23: change=0.007532 damping: 0.400815
24: change=0.006467 damping: 0.396807
25: change=0.005546 damping: 0.392839
26: change=0.004753 damping: 0.388911
27: change=0.004072 damping: 0.385022
28: change=0.003489 damping: 0.381171
29: change=0.002991 damping: 0.377360
30: change=0.002565 damping: 0.373586
31: change=0.002202 damping: 0.369850
32: change=0.001891 damping: 0.366152
33: change=0.001626 damping: 0.362490
34: change=0.001400 damping: 0.358865
35: change=0.001207 damping: 0.355277
36: change=0.001041 damping: 0.351724
37: change=0.000900 damping: 0.348207
.......................................................................................................
.......................................................................................................
1.48276294016
/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py(2980)acquisition()
-> if tasks is None:
(Pdb) end - start
0.3585538864135742
(Pdb)
That is, it takes 0.35 seconds to evaluate the acquisition function on the grid, which has about 1,000 points.
By the way, there should be a tab before import pdb; pdb.set_trace(). I do not know why the indentation is wrong in the previous post.
I get 0.866 seconds.
Martin
Ok, can you check the time that it takes to optimize the acquisition function using scipy and L-BFGS? You only have to add the following lines to the file
spearmint/choosers/default_chooser.py
Before line 820, add:
import time
start = time.time()
After line 821, add:
end = time.time()
import pdb; pdb.set_trace()
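As an aside, instead of pasting the same two lines into each file, a small reusable helper (hypothetical, not part of Spearmint) would do the same job:

import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Prints the wall-clock time spent inside the `with` block.
    start = time.time()
    yield
    print("%s: %.3f s" % (label, time.time() - start))

# Usage: wrap the call to be measured, e.g. around line 820:
#     with timed("acquisition optimization"):
#         ...  # the existing optimization call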
In my case I get
. . . .
13: change=0.021149 damping: 0.443192
14: change=0.019065 damping: 0.438761
15: change=0.016963 damping: 0.434373
16: change=0.014914 damping: 0.430029
17: change=0.012975 damping: 0.425729
18: change=0.011187 damping: 0.421472
19: change=0.009573 damping: 0.417257
20: change=0.008141 damping: 0.413084
21: change=0.006889 damping: 0.408953
22: change=0.005808 damping: 0.404864
23: change=0.018193 damping: 0.400815
24: change=0.011807 damping: 0.396807
25: change=0.008542 damping: 0.392839
26: change=0.006537 damping: 0.388911
27: change=0.005136 damping: 0.385022
28: change=0.004086 damping: 0.381171
29: change=0.003271 damping: 0.377360
30: change=0.002628 damping: 0.373586
31: change=0.002117 damping: 0.369850
32: change=0.001710 damping: 0.366152
33: change=0.001383 damping: 0.362490
34: change=0.001122 damping: 0.358865
35: change=0.000912 damping: 0.355277
.......................................................................................................
.......................................................................................................
1.1683031283
Optimizing PESM with L-BFGS (numerically estimating gradients)
/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/choosers/default_chooser.py(830)compute_acquisition_function()
-> y_opt = -y_opt
(Pdb) end - start
2.172369956970215
(Pdb)
That is, it takes 2.17 seconds to optimize the acquisition function to determine the next evaluation point.
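The "Optimizing PESM with L-BFGS (numerically estimating gradients)" line above corresponds to calling scipy's L-BFGS-B routine without an analytic gradient, so each optimizer step pays for extra finite-difference evaluations of the acquisition. A self-contained illustration of that scipy pattern (a toy objective, not Spearmint's actual call):

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Toy stand-in for the (negated) acquisition function to be minimized.
def neg_acquisition(x):
    return float(np.sum((x - 0.3) ** 2))

x0 = np.array([0.5, 0.5])
# approx_grad=True makes L-BFGS estimate gradients by finite differences.
x_opt, y_opt, info = fmin_l_bfgs_b(neg_acquisition, x0, approx_grad=True,
                                   bounds=[(0.0, 1.0)] * 2)
print("x_opt = %s, acquisition = %.4f" % (x_opt, -y_opt))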
I get 3.91
Martin
Ok, for some reason it seems your computer is about two times slower than mine. However, that is strange, since your processor seems to be as good as mine. The only reason I can think of is the linear algebra library. Maybe you installed OpenBLAS but did not properly recompile numpy and scipy, so that in the end OpenBLAS is not used in practice.
I think PESM can be somewhat more expensive than PESC because, for each new point at which the acquisition function has to be evaluated, PESC updates factors that only change the marginals of the GPs (which are independent), and hence no matrix inverses have to be computed. In the case of PESM, however, you have to update several factors that introduce correlations between the function values at the new point and the function values at the Pareto-optimal points (typically there are at most 50 of those). To obtain the new updated marginals after refining the factors, PESM must invert a matrix, which is expensive.
This may explain why PESM is slower than PESC. However, they should not differ so much that the running time of one is on the order of minutes and the other on the order of hours.
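To get a rough feeling for that cost gap: updating independent marginals is O(n), while refreshing a full covariance after a correlated-factor update requires an O(n^3) inversion. The following toy comparison (my own illustration, not Spearmint code) contrasts the two for n = 50, roughly the number of Pareto points mentioned above:

import time
import numpy as np

n = 50  # roughly the maximum number of Pareto-optimal points mentioned above
marginal_vars = np.random.rand(n) + 1.0
cov = np.random.randn(n, n)
cov = cov.dot(cov.T) + n * np.eye(n)  # a well-conditioned covariance matrix

t0 = time.time()
for _ in range(1000):
    _ = marginal_vars * 0.99          # O(n): independent marginal updates
t_marginal = time.time() - t0

t0 = time.time()
for _ in range(1000):
    _ = np.linalg.inv(cov)            # O(n^3): one inverse per factor refinement
t_inverse = time.time() - t0

print("marginal updates: %.4f s, matrix inversions: %.4f s" % (t_marginal, t_inverse))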
Thanks for the explanation!
I think I might try some combination of PESC, NSGA2 and PESM to get the benefits of PESM. Maybe something like in the attached picture.
However, is there a helper function to convert results from e.g. a .dat file to the MongoDB database and vice versa?
[image: Inline image 1]
Ok. I think you will have to code that yourself. You can check how Spearmint uses MongoDB by looking, for example, at the file generate_hypervolumes.py in the examples and at main.py.
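For what it is worth, a rough sketch of reading results back with pymongo might look like the following. The database name 'spearmint', the '<experiment>.jobs' collection naming, and the 'params'/'values' fields are assumptions inferred from how main.py stores jobs; check that file for the exact schema before relying on this:

import pymongo

client = pymongo.MongoClient('localhost', 27017)
db = client['spearmint']   # assumed default database name (see main.py)
jobs = db['moo.jobs']      # assumed naming: '<experiment name>.jobs'

for job in jobs.find():
    # Field names are assumptions; inspect a document to confirm them.
    print(job.get('params'), job.get('values'))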
By the way, I think that by default OpenBLAS uses several threads to do the matrix computations. That is useful when working with huge matrices, but it degrades performance when the matrices are rather small, because of the overhead of creating and synchronizing threads. You can check whether using only a single thread improves the results. You only have to export these variables:
export OPENBLAS_NUM_THREADS=1
export GOTO_NUM_THREADS=1
export OMP_NUM_THREADS=1
export OPENBLAS_MAIN_FREE=1
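Note that these variables have to be set before numpy (and hence the BLAS library) is loaded. Exporting them in the shell before launching Spearmint works; setting them at the very top of the entry script, before the first numpy import, should too:

import os

# Must run before the first `import numpy`, since OpenBLAS reads these
# variables when the library is initialized.
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['GOTO_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['OPENBLAS_MAIN_FREE'] = '1'

import numpy  # BLAS now runs single-threaded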
Awesome! It is definitely faster after exporting the variables. It now takes my machine ca. 25 min to finish the moo example, which matches the differences we measured.
I have experienced something similar with computational physics solvers, where running problems on a single processor is faster. Thanks!
Also, thanks for the tip about generate_hypervolumes.py!
Edit: Do you think the single-thread setting will have the reverse effect when I increase the number of input variables, or when it runs for e.g. 200 iterations?
Hello,
I would like to check whether it is to be expected that PESM can take many hours to make 50 suggestions, while PES can do it in ca. 10 minutes? Or could it be one of the dependencies causing it? I had some trouble setting up PyGMO and NLopt on Ubuntu 16.04, and downgraded my system to Ubuntu 14.04, on which everything seemed to compile fine.
Thanks! Martin