CoderDojoTC / ai-racing-league

Shared code for the AI Racing League
Configure the GPU Server #5

Open dmccreary opened 5 years ago

dmccreary commented 5 years ago

Our sites may not have good Internet access, so we need to be able to set up and configure a local GPU cluster to train our models.

We want someone to document how to set up a server with GPUs and how students can then use it to train their models.

parkererickson commented 5 years ago

Jon and I were talking about this briefly. I'm not sure if the hardware would have enough throughput, but if we had like 8-10 Docker containers up with TensorFlow and Jupyter Notebooks on them, we could have everybody access their own container and go from there.
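To make the container idea concrete, here is a rough sketch that only builds the `docker run` command lines, one per student, without running anything. The image tag `tensorflow/tensorflow:latest-gpu-jupyter`, the `student1`... container names, and the host port range starting at 8888 are all assumptions for illustration, not something we have tested on the loaner hardware.

```python
# Assumed image: TensorFlow's GPU image that bundles Jupyter (listens on 8888).
IMAGE = "tensorflow/tensorflow:latest-gpu-jupyter"


def container_commands(count=10, base_port=8888):
    """Build (but do not run) one `docker run` command per student container.

    Each container gets its own host port so students can reach their
    own Jupyter server at http://<server>:<host_port>.
    """
    commands = []
    for i in range(1, count + 1):
        host_port = base_port + i - 1
        commands.append([
            "docker", "run", "-d",        # run detached in the background
            "--gpus", "all",              # expose the host GPUs (Docker 19.03+)
            "--name", f"student{i}",      # hypothetical container name
            "-p", f"{host_port}:8888",    # host port -> Jupyter port in container
            IMAGE,
        ])
    return commands


if __name__ == "__main__":
    for cmd in container_commands():
        print(" ".join(cmd))
```

Whether the GPUs can actually serve that many containers at once is still the open question; the sketch only shows how the port and name bookkeeping might look.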

dmccreary commented 5 years ago

We now have a loaner server that we need to set up and configure. It has two large Nvidia GPUs on it.

We need someone to get TensorFlow and the DonkeyCar software running on it and configure it to allow people to load their images and run the training program. They then need to be able to transfer their files to their car.

See the documentation page here: http://docs.donkeycar.com/guide/train_autopilot/
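For the transfer step, one possible approach is to run `scp` from the car and pull the trained model off the server. The sketch below only builds that command; the account name, server hostname, and model path are hypothetical placeholders, and the DonkeyCar docs may recommend a different tool.

```python
def scp_model_command(account, server, model_path, dest_dir="."):
    """Build (but do not run) an scp command that copies one model file
    from the GPU server into dest_dir on the machine running the command."""
    return ["scp", f"{account}@{server}:{model_path}", dest_dir]


if __name__ == "__main__":
    # Hypothetical values: replace with the real account, host, and path.
    cmd = scp_model_command("student", "gpu-server", "models/pilot.h5")
    print(" ".join(cmd))
```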

dmccreary commented 5 years ago

Neal Kelly and I got our GPU server working last weekend. The good news is we trained some models with 10K images in under 5 minutes!! That server really rocks!

We installed Python, TensorFlow, and Jupyter Notebooks, and yesterday I finally got the SSH system working. Next, I will set up 10 accounts (one for each car) called arl1, arl2, arl3, and so on, and assign each car to one account. Students can then SSH in and train their models. We still need to figure out what type of virtual environment will work.
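A minimal sketch of the account naming scheme, assuming a Linux server where accounts are created with `useradd`. The helper names are made up for illustration, and the script only prints the commands so they can be reviewed before anyone runs them.

```python
def account_names(num_cars=10, prefix="arl"):
    """Return one account name per car: arl1, arl2, ..., arl10."""
    return [f"{prefix}{i}" for i in range(1, num_cars + 1)]


def useradd_commands(names):
    """Build (but do not run) one `useradd` command per account.

    -m creates the home directory; -s sets the login shell.
    """
    return [["sudo", "useradd", "-m", "-s", "/bin/bash", name] for name in names]


if __name__ == "__main__":
    for cmd in useradd_commands(account_names()):
        print(" ".join(cmd))
```

Passwords or SSH keys would still need to be set for each account separately.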

parkererickson commented 5 years ago

Very cool! I usually just don't use a virtual environment (irresponsible, I know), but DonkeyCar uses Miniconda. Here are their install instructions: http://docs.donkeycar.com/guide/host_pc/setup_ubuntu/