microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

fastai support #171

Open esistgut opened 3 years ago

esistgut commented 3 years ago

The fastai library is the foundation of the very popular fast.ai courses and is built on top of PyTorch. Supporting it could be very useful to a lot of students.

fdwr commented 2 years ago

@Adele101

Adele101 commented 2 years ago

Hi @esistgut,

Thank you for the great request. We will consider this for an upcoming release of pytorch-directml and will update this thread when more information is available.

xjdeng commented 1 year ago

I've tried to get DirectML working with fastai, and it seems to run without any errors. But for some reason I don't see my GPU being used, and the training speed and behavior are the same as with the CPU.

I've posted about this here, but I'll mirror my comments in this thread as well:

Here's how to reproduce on a Windows machine. Yes, I know I'm using pip, but DirectML doesn't seem to play well with Conda for some reason. And yes, you'll eventually need to uninstall the default PyTorch and replace it with pytorch-directml.

conda create -n directml-test python=3.8 ipython
conda activate directml-test
pip install fastai torchvision==0.9.0
pip uninstall torch
pip install pytorch-directml
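Before going further, a quick check (my addition, not part of the original steps) that the pip swap actually took: in Python, print the torch version. The exact string depends on the pytorch-directml build, so treat the comment as indicative, not definitive.

import torch
# The stock wheel reports a plain version like "1.8.0"; the
# pytorch-directml fork's version string differs (build-dependent).
print(torch.__version__)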

Now open up Python and run the following:

import torch

# Move two small tensors to the DirectML device and add them there.
tensor1 = torch.tensor([1]).to("dml")
tensor2 = torch.tensor([2]).to("dml")
dml_algebra = tensor1 + tensor2
dml_algebra.item()  # .item() copies the result back to the CPU

You shouldn't get an error if DirectML is set up correctly.
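To double-check that those tensors actually live on the DirectML device, printing their .device attribute is a quick diagnostic (my addition, plain PyTorch; the "dml" string assumes the setup above):

import torch

t = torch.tensor([1]).to("dml")
print(t.device)             # should report the DirectML device, not "cpu"
result = (t + t).to("cpu")  # copying back to the CPU forces the add to complete
print(result)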

Now run the following:

from fastai.vision.all import *

def catsanddogs(mydevice="dml"):
    path = untar_data(URLs.PETS)
    files = get_image_files(path/"images")
    # In the Oxford-IIIT Pets dataset, cat image filenames start with an uppercase letter.
    def label_func(f): return f[0].isupper()
    # Set fastai's default device before building the DataLoaders so everything picks it up.
    defaults.device = torch.device(mydevice)
    dls = ImageDataLoaders.from_name_func(path, files, label_func, item_tfms=Resize(224), num_workers=0)
    learn = cnn_learner(dls, resnet34, metrics=error_rate)
    learn.fine_tune(1)
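One way to find out where the model weights actually end up (my addition, not part of the original repro): have catsanddogs return the Learner, then inspect its devices. Checking the device of a model's parameters is standard PyTorch; dls.device is fastai's own attribute.

# Hypothetical diagnostic: add "return learn" to the end of catsanddogs first.
learn = catsanddogs("dml")
print(learn.dls.device)                       # device fastai thinks it's using
print(next(learn.model.parameters()).device)  # device the weights actually live on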

Invoke the following, then open your Windows Task Manager and watch your CPU and memory usage patterns:

catsanddogs('cpu')

Afterwards, run the following to do the exact same thing, but on the DirectML device (note the device string is 'dml', not 'gpu'):

catsanddogs('dml')

At least for me, I see the exact same pattern of CPU and memory usage, as well as the same training speed; it's as if it's using the CPU instead of the DirectML GPU. I've confirmed the exact same behavior on two different computers! I'm not sure if your experience is the same; maybe I'm not using the right commands?
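For what it's worth, a device-level microbenchmark can help separate "DirectML isn't working at all" from "fastai isn't using it". This is a minimal sketch (my addition), assuming the same 'dml' device string and that matmul is supported by the pytorch-directml build:

import time
import torch

def bench(device, n=2048, reps=10):
    # Create on the CPU, then move, since the dml backend may not
    # support every factory function directly.
    a = torch.randn(n, n).to(device)
    b = torch.randn(n, n).to(device)
    start = time.perf_counter()
    for _ in range(reps):
        c = a @ b
    c.to("cpu")  # copying back forces the work to finish before the timer stops
    return time.perf_counter() - start

print("cpu:", bench("cpu"))
print("dml:", bench("dml"))

If the dml timing isn't clearly faster than the CPU, the problem is below fastai; if it is faster, the slowdown is somewhere in how fastai places the model and batches.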