DifferentiableUniverseInitiative / IDRIS-hackathon

Repository for hosting material and discussions for the 2021 IDRIS GPU hackathon
MIT License

Adding Getting Started material #1

Closed: EiffL closed this 3 years ago

EiffL commented 3 years ago

I am starting to add scripts and examples, and to document the procedure for getting set up on Jean-Zay, in this file: https://github.com/DifferentiableUniverseInitiative/IDRIS-hackathon/blob/main/GETTING_STARTED.md

@kimchitsigai feel free to add or suggest anything you think would be useful to document here to help people get started on the machine and/or with Horovod.

kimchitsigai commented 3 years ago

Myriam just told me that the TF 2.4.1 module with NCCL 2.8 and CUDA 10.2 should be ready by the beginning of next week (a compilation takes 5 hours :-( ). François, I’ll go through the installation steps that you’ve described on Monday. CUDA 11 should be available on JZ by the end of next week. If there is enough time before Day 1, I’ll go through the installation steps again with CUDA 11. We’ll probably use srun and not horovodrun, but that’s not a big deal to modify.

Have a nice weekend!

EiffL commented 3 years ago

Thank you @kimchitsigai! I got a notification from Myriam that the NCCL environment had finished cooking. I just tried to compile against it and it seems to work nicely :-) I've slightly updated the instructions in GETTING_STARTED.md.

EiffL commented 3 years ago

I think the getting started materials are looking good. Now we just need to figure out how to collect profiling information correctly, and we'll cover this in #2.