Thanks very much, worked flawlessly!
This was unbelievably helpful! <3
Thanks for your detailed explanation. For me, it's the same procedure I followed for using a GPU with Pytorch or tensorflow in a Jupyter notebook.
Just one question: how do I make sure the fastai model is using the GPU in a Jupyter notebook? Is it enough if `print(torch.cuda.is_available())` returns `True`? Should I add tensors to the GPU device like we do in Pytorch? If yes, how do we do it in fastai?
Cheers!
Prelude
I did exactly what the first chapter of the book says not to do: getting my new Win 11 machine to run the Jupyter notebooks locally. It was quite a journey, but, in retrospect, I can summarize what I did in a few simple steps. It all comes down to finding the right installers up-front and updating the `environment.yml` file as needed. I wanted to share it here to perhaps motivate some minor improvements to the repo, so cloud-hosting and Ubuntu through WSL aren't deemed absolutely necessary, and local setup isn't seen as super scary or a waste of effort.

After all that, I will say I was debating how to share this information. I don't really do blogs anymore, so hopefully this is the right medium and someone else will stumble on this and find it helpful.
Prerequisites
Here are some things you will need in order to successfully set up a Windows 10/11 machine:
VS Code & Git for Windows
Start by installing VS Code. Google it, follow the installation instructions. Easy. If it prompts you to install Git for Windows, allow it to guide you through the steps; otherwise, you can install Git for Windows from their website: https://gitforwindows.org/.
Cloning the repository
On this repository's home page, find the green `Code` button and click it. It will show you a URL for the repository. In the directory where you want to create the repository, run `git clone <url>`. For example, `git clone https://github.com/fastai/fastbook.git`. I actually recommend first creating a fork of the repository and then cloning your fork instead. Also star the repository while you're at it. ;)

`cd` into the directory that gets created and run `code .`. VS Code should open and you should see a directory full of `.ipynb` files. These are the Jupyter notebooks; the rest of this guide describes how to get them working locally.
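Putting those steps together (swap in your fork's URL if you made one):

```
git clone https://github.com/fastai/fastbook.git
cd fastbook
code .
```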
NVIDIA CUDA
You'll learn from reading the book that fastai utilizes Pytorch under the hood. In order for Pytorch to utilize your NVIDIA GPU, you need to install some software called CUDA. Getting the project to run locally isn't even worth it if you can't get this working.
Unfortunately, there are a lot of versions of CUDA, and specific versions of Pytorch only work with specific versions of CUDA. While researching this, I found this super useful webpage provided by Pytorch that guides you through finding compatible versions: https://pytorch.org/get-started/locally/
For example, on my local machine, my version of Pytorch was 2.1.0. If you open the `environment.yml` file in the repository, you'll see the only expectation is `pytorch>=1.6`, so it could be something completely different in the future.

You can install NVIDIA's CUDA software by Googling for `NVIDIA CUDA <version>`. You need to make sure it's one of the versions of CUDA that Pytorch's guide told you to use. The installation involves a hefty 3GB download and takes several minutes. It also wants you to install a bunch of extra crap, like PhysX, Nsight, drivers, blah, blah, blah. I opted to do a custom install and un-selected everything but the CUDA stuff. After a very long install, you can confirm CUDA is installed by running `nvcc --version`. You might need to close and reopen your terminal for it to find the executable. Make sure the version matches what you need.
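For reference, the output of `nvcc --version` looks something like this (the exact build details will differ; the release number is the part that matters):

```
> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
...
Cuda compilation tools, release 12.1, V12.1.105
```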
Anaconda or miniconda
I recommend installing miniconda via scoop (or perhaps chocolatey). The setup for scoop is super simple. Visit the website for more details: https://scoop.sh/
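If you go the scoop route, the commands look something like this, run from PowerShell (the miniconda package lives in scoop's extras bucket at the time of writing; double-check the names on scoop's site):

```powershell
# Install scoop itself (instructions from https://scoop.sh/)
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression

# Install miniconda through scoop
scoop bucket add extras
scoop install extras/miniconda3
```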
Creating your environment
The guide on pytorch's website told me I needed to use CUDA 11.8 or 12.1. It even provided the following command line:
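The exact command depends on the selections you make on that page, but for conda on Windows with CUDA 11.8 it was along these lines:

```
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
```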
We're not actually going to run this. Instead, we're going to translate it, adding these dependencies to our `environment.yml` file. For example, this is what my `environment.yml` looked like afterward:
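A sketch of the relevant parts (the repo's base file will vary over time; `pytorch-cuda`, `torchvision`, and `torchaudio` come from the command line above, and `nb_conda_kernels` comes up again later):

```yaml
name: fastbook
channels:
  - fastai
  - pytorch
  - nvidia
  - defaults
dependencies:
  - python
  - pytorch=2.1.0
  - pytorch-cuda=11.8   # or 12.1, matching the CUDA toolkit you installed
  - torchvision
  - torchaudio
  - nb_conda_kernels
  - fastai
  # ...plus everything else the repo's environment.yml already listed
```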
You will see I changed pytorch to version 2.1.0 explicitly. I also added the dependencies listed in the command line above; the `-c` arguments were translated into additional "channels". You will want to make similar changes to your `environment.yml` file before proceeding.

Conda makes it easy to create "environments". An environment lets you install versions of Python and libraries that are independent of other environments. Otherwise, you are installing everything globally, and any time a dependency was updated, it would impact every other environment on your machine.
From inside your source directory for this project (or its fork), you should be able to run `conda env create --file .\environment.yml`. This command takes a while to run, but it will install all of your dependencies and then `pip install` even more dependencies. Once complete, it will give you the command lines to run to activate and deactivate the environment. For example:
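Conda's closing message looks something like this (assuming the environment is named fastbook, which comes from the `name:` field in `environment.yml`):

```
# To activate this environment, use
#
#     $ conda activate fastbook
#
# To deactivate an active environment, use
#
#     $ conda deactivate
```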
Testing your environment
Activate your environment. Once the environment is running, you will see the terminal prompt change. Something like:
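Assuming the environment is named fastbook and you cloned into C:\src\fastbook (both placeholders):

```
(fastbook) C:\src\fastbook>
```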
Open the Python REPL by running `python`. Take note of the Python version. On my machine, the environment is using 3.11.5. We'll use this information later to make sure Jupyter Notebooks is running the correct kernel.
Then run `import torch` to make sure it is finding Pytorch (this can take several seconds to initialize). Then run `torch.__version__`. This should match the version you used on the Pytorch guide to figure out which version of CUDA to install. Next, run `torch.cuda.is_available()`. If everything was installed correctly, this should eventually return `True`. If not, there's probably an incompatibility between your Pytorch and CUDA versions, you missed something in your `environment.yml`, or your environment isn't activated correctly.
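Putting those checks together, a successful session looks something like this (the version numbers are examples and will reflect your own install):

```python
>>> import torch
>>> torch.__version__
'2.1.0'
>>> torch.cuda.is_available()
True
```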
Verifying Jupyter Notebooks
The last step is to make sure Jupyter Notebooks uses your environment correctly. Earlier I mentioned needing to add `nb_conda_kernels` to the `environment.yml`. This dependency makes sure Jupyter picks up your dependencies.

Exit Python (type `exit()`) or open a separate terminal inside your repository directory. Type `jupyter notebook .\01_intro.ipynb`. After a few seconds, your browser should open and jupyter notebooks will load. Inside the first code cell (or just insert a new code cell above it), add this code:
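The cell just needs to repeat the checks from the REPL; something along these lines works:

```python
import sys
import torch

# Confirm the notebook is running the environment's Python and can see the GPU
print(sys.version)
print(torch.__version__)
print(torch.cuda.is_available())
```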
Run the cell by typing `SHIFT+ENTER`. You should see the same information printed out that you saw from the Python REPL.

Wrapping up
So hopefully this guide worked flawlessly for you on the first try. Otherwise, I hope it gives you some more places to continue debugging your issues.
Overall, there's a lot to learn if you're new to Python or coding in general. This exercise helped me become comfortable with conda and jupyter, that's for sure. Getting CUDA working locally is a huge improvement over the CPU implementation. Afterward, I saw operations completing in seconds that were taking over 10 minutes to complete before. I also saw fewer glitches in the code cells, which makes reading the book more enjoyable and interactive.
As mentioned in the book, if you still can't get the book to run locally on your machine, find an alternative and don't get hung up on it. It's easy to lose focus on what's important, and as these technologies become more familiar and part of your toolbox some of these issues will be easier for you to address in the future.