cj-mills / christianjmills

My personal blog
https://christianjmills.com/

posts/arc-a770-testing/part-2/ #37

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

Christian Mills - Testing Intel’s Arc A770 GPU for Deep Learning Pt. 2

This post covers my experience training image classification models with Intel’s PyTorch extension on the Arc A770 GPU.

https://christianjmills.com/posts/arc-a770-testing/part-2/

JorgeLM97 commented 1 year ago

Hello! You have probably seen this question before, but would you recommend this card for someone who wants to learn ML? I already have programming experience using C/C++ at work and want to learn more about deep learning. My budget is really tight, so a 3090/4080/4090 is not an option, and Nvidia's other GPUs have really small VRAM.

JorgeLM97 commented 1 year ago

Also, thanks for sharing these posts! I had been looking forward to a Part 2 for a while!

cj-mills commented 1 year ago

Hi @JorgeLM97!

I believe the Arc cards are now valid options for getting started with ML, with some caveats.

  1. Intel's PyTorch extension currently supports Windows only via WSL. As stated in this post, I don't recommend WSL for this use case due to the performance hit and other headaches of using WSL for deep learning projects. If a native Ubuntu installation is not feasible, this might be a problem.
  2. I'll do my best to make the tutorial for getting started with Intel's PyTorch extension as easy to follow as possible. However, the setup process is more involved than using an Nvidia card.
  3. Until PyTorch has built-in support for Arc GPUs, libraries that build on top of PyTorch might take a while to add support for Intel GPUs. That does not necessarily mean you can't use those libraries, but it might take extra work. One example that comes to mind is the fastai library used in the Practical Deep Learning for Coders course. That course is my go-to recommendation for anyone who wants to get started with deep learning.
  4. I have not had time to do extensive testing, and there may be some use cases or model architectures that are not optimized yet.

If those caveats are not deal-breakers, I would seriously consider the Arc GPUs. I hope to have my "getting started" tutorial up before the weekend (Pacific Time), so you can wait for that to see what's involved.
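To make the compatibility caveat concrete, here is a minimal device-selection sketch. The fallback order and the `pick_device` helper are my own illustration; it assumes the `xpu` backend that `intel_extension_for_pytorch` registers on top of PyTorch, and degrades gracefully when neither library is installed:

```python
def pick_device() -> str:
    """Return the best available device string: 'xpu', 'cuda', or 'cpu'.

    Hypothetical helper for illustration -- prefers Intel's XPU backend
    (Arc GPUs) when intel_extension_for_pytorch is installed, then CUDA,
    then falls back to the CPU.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all
    try:
        # Importing the extension registers the 'xpu' device with PyTorch.
        import intel_extension_for_pytorch  # noqa: F401
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            return "xpu"
    except ImportError:
        pass
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

With a device string in hand, the usual `tensor.to(device)` / `model.to(device)` calls work unchanged, which is what keeps most plain-PyTorch code portable across the three backends.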

Danyal-sab commented 1 year ago

Many thanks! I was in doubt for a while about whether to buy this card. Thanks again.

cj-mills commented 1 year ago

For anyone considering purchasing an Arc GPU, check the "Supported Hardware Configurations" section of Intel's Quick Start Guide to see if your computer meets the requirements.

https://www.intel.com/content/www/us/en/support/articles/000091128/graphics.html

cccrick commented 1 year ago

May I know the CPU version you are using on the A770 platform? I'm concerned about the scheduling of the Efficient-cores (E-cores) in 13th Gen Intel® Core™ on Ubuntu.

cj-mills commented 1 year ago

@cccrick I'm still using the same i7-11700K CPU that I used in Part 1. I believe there have been patches for 13th gen CPUs on Linux, but I don't have one to test personally.

Danyal-sab commented 1 year ago

Hi @cccrick, I am using a 13th Gen CPU, and the answer is yes: kernel 6.0 and above support the newer CPUs. Depending on your Linux version, you may need to upgrade the kernel, and then you are going to be fine.
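A quick way to check the kernel-version requirement mentioned above. The helper name and the 6.0 cutoff come from the comment, not from any official tool, so treat this as an illustrative sketch:

```python
import platform
import re


def kernel_at_least_6(release: str) -> bool:
    """Check whether a Linux kernel release string is version 6.0 or newer.

    Illustrative helper based on the comment above: kernel 6.0+ is said to
    handle scheduling for 13th-gen hybrid (P-core/E-core) Intel CPUs.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (6, 0)


# On a Linux box, platform.release() returns something like "6.2.0-39-generic".
print(kernel_at_least_6(platform.release()))
```

The same information is available from the command line via `uname -r`.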

laduran commented 7 months ago

I was able to train a ResNet50 classifier on the CIFAR10 dataset using the Intel Extension for PyTorch under Windows, directly on my Arc A750 GPU. I have an article about setting up the same for CPU training here:

https://medium.com/@louisduran/training-resnet50-using-intel-pytorch-extensions-cpu-def7d412609f

I did the Arc GPU training with the file below, which I pulled from Intel's GitHub repo for the PyTorch extension: ./dev/Python/pytorch_tutorial/gpu_single_instance_training_bf16.py

Long story short: training on the Arc A750 took roughly 3 m 15 s, while CPU training took about 46 minutes! Based on this, I do not recommend doing heavy training on your CPU if you have a supported GPU.
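For a rough sense of scale, the two reported times work out to about a 14x speedup for the GPU (a back-of-the-envelope calculation from the numbers above, not a controlled benchmark):

```python
# Times reported above: Arc A750 GPU ~3 min 15 s, CPU ~46 min.
gpu_seconds = 3 * 60 + 15   # 195 s
cpu_seconds = 46 * 60       # 2760 s

speedup = cpu_seconds / gpu_seconds
print(f"GPU was roughly {speedup:.1f}x faster")  # -> GPU was roughly 14.2x faster
```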