worldmodels / worldmodels.github.io

World Models
Creative Commons Attribution 4.0 International

Question regarding usage of CPU over GPU for CMA-ES #13

Open kessler-frost opened 6 years ago

kessler-frost commented 6 years ago

I searched around for a while but couldn't find a clear reason for running the CMA Evolution Strategy on CPU cores, given that GPUs are well known to perform much better on matrix operations and highly parallel workloads. So my question is: why did you use CPUs instead of GPUs to train the C model when you were already using GPUs for the V and M models? Pardon me if this question seems too naive. Thanks.

worldmodels commented 6 years ago

Hi @kessler-frost

In our experiments, optimizing C means optimizing only ~1,000 parameters, so a GPU would not add significant value, while using 64 CPU cores to train C in parallel added a lot of value. In terms of cost, renting a 64-core CPU machine on Google Cloud is roughly the same price as renting a single GPU. Training C can take days or weeks, depending on how well we want it to perform, so if you want to train C with parallel GPUs, the cost will add up.

Meanwhile, V and M were trained on a GPU (a single-GPU virtual machine) in less than a day.
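
To make the point about CPU parallelism concrete, here is a minimal sketch of how one generation's candidates can be scored on separate cores. It is not the code used in the experiments: it swaps CMA-ES for a much simpler elite-averaging evolution strategy, and `rollout`, `ask`, `tell`, `train`, and `train_on_cores` are hypothetical names, with `rollout` standing in for a full environment episode. The population size of 64 (one candidate per core) is also an assumption.

```python
import random
from multiprocessing import Pool

NUM_PARAMS = 1000  # roughly the size of C mentioned above
POP_SIZE = 64      # assumption: one candidate per CPU core


def rollout(params):
    """Hypothetical stand-in for one environment episode.

    Returns a fitness score (higher is better). A real rollout would run
    the simulator with a controller built from `params`.
    """
    return -sum(p * p for p in params)


def ask(mean, sigma, rng):
    """Sample POP_SIZE candidates around the current mean."""
    return [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(POP_SIZE)]


def tell(population, fitnesses, elite_frac=0.25):
    """Return the new mean: the average of the best-scoring candidates."""
    ranked = sorted(zip(fitnesses, population),
                    key=lambda pair: pair[0], reverse=True)
    n_elite = max(1, int(len(ranked) * elite_frac))
    elites = [candidate for _, candidate in ranked[:n_elite]]
    return [sum(column) / n_elite for column in zip(*elites)]


def train(map_fn=map, generations=5, sigma=0.1, seed=0):
    """Run a simple ES; `map_fn` decides where the rollouts execute."""
    rng = random.Random(seed)
    mean = [rng.uniform(-1.0, 1.0) for _ in range(NUM_PARAMS)]
    history = [rollout(mean)]  # fitness of the mean before any update
    for _ in range(generations):
        population = ask(mean, sigma, rng)
        # The only expensive step: each candidate is scored independently,
        # so the work splits cleanly across CPU cores.
        fitnesses = list(map_fn(rollout, population))
        mean = tell(population, fitnesses)
        history.append(rollout(mean))
    return mean, history


def train_on_cores(num_workers=64, **kwargs):
    """Same loop, but with rollouts spread over a pool of worker processes."""
    with Pool(processes=num_workers) as pool:
        return train(map_fn=pool.map, **kwargs)
```

Calling `train()` runs everything serially, while `train_on_cores(num_workers=64)` hands the rollouts to 64 worker processes (on platforms that spawn rather than fork, call it from under an `if __name__ == "__main__":` guard). The update step in `tell` only touches about 1,000 numbers per candidate, which is why the per-generation math is cheap and the rollouts are the part worth parallelizing.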

kessler-frost commented 6 years ago

Thank you for the explanation!