Advice on hardware to evolve networks

tracycollins commented 4 years ago

Is your feature request related to a problem? Please describe.
Not a feature request, but I need advice on hardware for evolving networks.

Describe the solution you'd like
I'm currently using a variety of Apple machines to evolve networks: an old Mac Pro, another older Mac Pro, a Mac mini, an old Mac mini, etc. They are approaching EOL (and end of support from Apple), and I'm looking for options for the future.

Describe alternatives you've considered
I'm guessing generic Linux boxes that I'd probably build and configure myself would get me the most bang for the buck, but I've never configured a Linux box, and I'm not sure what the trade-offs for neural network evolution are in terms of number of cores, RAM, GPU(?), etc.

Should I get 1 big server, or several smaller, less powerful servers? Which brands/models are most reliable? What are you all using?

Any advice appreciated. Thanks!

christianechevarria commented 4 years ago

Here are some general hardware considerations as they relate to ML.

In general, you mostly want a lot of RAM and strong GPUs to do machine learning, because:

Runtime variables are stored in RAM, and as your networks get larger they take up more of it.
Most machine learning frameworks use GPUs to run and train networks faster, because GPUs are highly parallel: many calculations run at the same time instead of each one waiting for the previous one to finish. This really starts to pay off as your networks get wider, since more of the calculations can be batched at once.

Carrot is a different case. It is an architecture-free neuroevolution library, so CPU speed is at a premium (for now): there is no defined layer scheme (you can add neurons anywhere, so the "layers" would constantly be changing), and the library currently does not take advantage of GPU processing. This may change in the future, because in theory you could still do GPU batching by building a dependency map of the nodes in the network. @luiscarbonell and I devised an algorithm to do so, but it has not been implemented or tested yet, so for now CPUs are what we've got. Plenty of RAM helps anywhere you keep many variables in memory, and that is generally the case in ML.
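The dependency-map idea can be illustrated with a small sketch. This is not the algorithm mentioned above (which is not described in this thread), and it does not use carrot's API. It is a plain topological layering, assuming an acyclic network given as a hypothetical list of (source, target) edges. Nodes that land in the same level do not depend on each other, so they are candidates for being evaluated as one batch.

```python
from collections import defaultdict


def dependency_levels(num_nodes, edges):
    """Group the nodes of an acyclic graph into dependency levels.

    Every node in a level depends only on nodes in earlier levels, so all
    nodes in one level could in principle be evaluated together.
    """
    successors = defaultdict(list)
    indegree = [0] * num_nodes
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start with the nodes that have no incoming connections (the inputs).
    level = [n for n in range(num_nodes) if indegree[n] == 0]
    levels = []
    visited = 0
    while level:
        levels.append(level)
        visited += len(level)
        next_level = []
        for node in level:
            for nxt in successors[node]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:  # all of its inputs are now computed
                    next_level.append(nxt)
        level = sorted(next_level)

    if visited != num_nodes:
        # Recurrent connections would need separate handling.
        raise ValueError("graph has a cycle")
    return levels


# Hypothetical network: inputs 0 and 1, hidden nodes 2 and 3, output 4,
# plus a skip connection from 0 straight to 4.
example_edges = [(0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (0, 4)]
example_levels = dependency_levels(5, example_edges)
```

Adding a neuron anywhere only changes the edge list, so the levels can be recomputed after each mutation. Whether that recomputation is cheap enough to pay off on a GPU is exactly the part that remains untested.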

Also, a side note: when shopping for GPUs, pay special attention to the cache capacity of the hardware. There is a real physical component to where values are stored in the machine and how far a program has to go to fetch the values it needs (this leads into things like bus speed and width, which is mostly a motherboard concern). Caching close to where the processing happens means you don't need to go out to fetch a value, which saves time (and energy) and makes execution more efficient.

Edit: The caching point also applies to CPUs, where you want to avoid going through the memory controller to reach RAM. Whenever you can save physical travel distance (especially when communication is serial, i.e. one set of bits is sent, then the next, and so on), execution will be faster.

raimannma commented 4 years ago

My current setup is:

CPU: AMD Ryzen 9 3900X
RAM: 32GB DDR4-3200
GPU: Quadro P2000
SSD: 960GB Corsair Force Series MP510

For carrot this is very good, because everything runs on the CPU and all 24 of my threads are 100% utilized. I have occasionally had problems with RAM, but only when I used population sizes over 4000.
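A rough way to see why large populations put pressure on RAM is to multiply a per-network footprint by the population size. The sketch below is a back-of-envelope estimate only: the function, its parameters, and every number in the example are hypothetical placeholders, not measurements of carrot. Real per-node and per-connection sizes depend on the runtime and would need to be measured.

```python
def estimate_population_bytes(population_size, nodes_per_network,
                              connections_per_network,
                              bytes_per_node, bytes_per_connection):
    """Estimate memory for a population as population size x per-network size.

    All byte sizes are supplied by the caller; measure them on your own
    workload rather than trusting any default.
    """
    per_network = (nodes_per_network * bytes_per_node
                   + connections_per_network * bytes_per_connection)
    return population_size * per_network


# Placeholder numbers for illustration only (not measured values).
example_bytes = estimate_population_bytes(
    population_size=4000,
    nodes_per_network=500,
    connections_per_network=5000,
    bytes_per_node=100,
    bytes_per_connection=100,
)
example_gib = example_bytes / 2**30
```

The estimate grows linearly with population size, so doubling the population doubles the footprint. If each worker holds its own copy of the population, that cost is multiplied again.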

You should have at least an 8-core CPU, because creating web workers takes a long time. With fewer than 8 cores it may be better to turn off multithreading.

I also want to mention that there are first experiments with Rust and WebAssembly to get better multithreading.
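The rule of thumb above can be written down as a small helper. This is a sketch, not carrot's API: the function name and the threshold constant are hypothetical, and the value 8 is simply the figure from this comment. Note that `os.cpu_count()` reports logical CPUs (threads), which may be higher than the number of physical cores.

```python
import os

# Threshold taken from the comment above; tune it for your own workload.
MIN_CORES_FOR_WORKERS = 8


def choose_worker_count(logical_cores=None, min_cores=MIN_CORES_FOR_WORKERS):
    """Return how many workers to start, or 1 to stay single-threaded.

    Below the threshold the worker start-up cost may outweigh the speed-up,
    so the helper falls back to a single thread.
    """
    if logical_cores is None:
        logical_cores = os.cpu_count() or 1  # cpu_count() can return None
    if logical_cores < min_cores:
        return 1
    return logical_cores


workers_on_small_machine = choose_worker_count(logical_cores=4)
workers_on_large_machine = choose_worker_count(logical_cores=24)
```

The break-even point depends on how long each evolution run lasts, so timing a short run both ways on the actual machine is a better guide than the fixed threshold.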

christianechevarria commented 4 years ago

Should I get 1 big server, or several smaller, less powerful servers? Which brands/models are most reliable? What are you all using?

Any advice appreciated. Thanks!

Ah! I just realized you had asked for some specific recommendations.

In my opinion this is all about cost-to-performance ratio. If you can get the same performance from, say, a Beowulf cluster (https://en.wikipedia.org/wiki/Beowulf_cluster) of inexpensive computers, and you have the desire and time to build one, then that is what I would recommend.
As far as GPUs go, NVIDIA takes the cake for me personally. They have some very interesting offerings at reasonable price points, at least in the US.
I'm not running anything especially noteworthy at the moment, but I am thinking about building a proper setup later.
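The cost-to-performance comparison can be made concrete with a little arithmetic. The sketch below is illustrative only: the class, its fields, and all prices and throughput figures are hypothetical placeholders, and the `scaling_efficiency` factor is an assumption standing in for the coordination overhead a cluster adds. Throughput should come from timing your own workload on each kind of machine.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    unit_price: float           # price per machine, in any one currency
    units: int                  # number of machines
    throughput_per_unit: float  # e.g. generations per hour, measured by you
    scaling_efficiency: float = 1.0  # assumed fraction kept when clustering

    @property
    def total_cost(self):
        return self.unit_price * self.units

    @property
    def total_throughput(self):
        return self.throughput_per_unit * self.units * self.scaling_efficiency

    @property
    def throughput_per_cost(self):
        return self.total_throughput / self.total_cost


def best_value(options):
    """Pick the option with the highest throughput per unit of cost."""
    return max(options, key=lambda o: o.throughput_per_cost)


# Placeholder figures for illustration only.
big_server = Option("one big server", unit_price=4000, units=1,
                    throughput_per_unit=100)
small_cluster = Option("four small boxes", unit_price=900, units=4,
                       throughput_per_unit=30, scaling_efficiency=0.85)
winner = best_value([big_server, small_cluster])
```

With these made-up numbers the cluster comes out slightly ahead, but the result flips easily if the scaling efficiency drops. Setup and maintenance time is not captured at all.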

tracycollins commented 4 years ago

Thanks for all the suggestions!

If/when I start this build, I'll let you know how it goes.

(In reply by email to Christian Echevarria's comment of Mon, Aug 10, 2020, 6:16 PM.)

-- Tracy Collins tc@threeCeeMedia.com threeCeeMedia.com

liquidcarrot / carrot

Advice on hardware to evolve networks #243