Closed: prashpal closed this issue 4 years ago
Sorry you are having this issue.
It should work by default. What Mac hardware are you using?
I am using a 2016 13 inch MacBook Pro with Intel Iris Graphics 550.
Below are the steps I followed:
If I skip step 3, then unit tests run training on CPU and the tests are passing. But, with MPS forced, the test crashes.
Unfortunately, the GPU acceleration for activity classification (and object detection) requires a discrete GPU, not the Intel Iris chipset. Image classification uses a different framework (via CoreML) to leverage GPU resources.
We should probably update our documentation to clarify these requirements, especially since they differ across toolkits.
Thanks for the clarification. Are there plans to enable activity classification with Intel graphics since MPS can support it?
Some testing with our current MPS implementation using Intel graphics did not reveal performance improvements over our MXNet (CPU only) implementation. We do plan to do some more work on activity classification, so we can certainly revisit this question after we've iterated on the implementation some.
For future reference: _mps_utils.use_mps()
is an internal function that checks the user config plus relevant hardware availability; it is not a user-facing API.
The supported way to request GPU usage through the API would be
tc.config.set_num_gpus(1)
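As a sketch of what that call means, the mock below reproduces only the set/get contract as I understand it (0 = CPU only, -1 = use all available GPUs, n > 0 = use up to n GPUs). It is a hypothetical stand-in for illustration, not Turi Create's actual config module:

```python
# Simplified stand-in for tc.config.set_num_gpus / tc.config.get_num_gpus.
# Semantics assumed here: 0 = CPU only, -1 = all available GPUs,
# n > 0 = use up to n GPUs. Not Turi Create's real implementation.

_num_gpus = -1  # assumed default: use all available GPUs


def set_num_gpus(n):
    """Record how many GPUs toolkits are allowed to use."""
    global _num_gpus
    if not isinstance(n, int) or n < -1:
        raise ValueError("num_gpus must be an integer >= -1")
    _num_gpus = n


def get_num_gpus():
    """Return the current GPU-count setting."""
    return _num_gpus


# The pattern suggested above: request one GPU.
set_num_gpus(1)
print(get_num_gpus())  # → 1

# Force CPU-only execution instead.
set_num_gpus(0)
print(get_num_gpus())  # → 0
```

The point is that `set_num_gpus(0)` is the sanctioned way to force the CPU path, and `set_num_gpus(1)` the sanctioned way to request a GPU, rather than patching internal helpers.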
Ok, thank you @nickjong and @igiloh. Please keep me updated when support for Intel graphics is available. Even if we do not see a performance improvement with Intel graphics, it will be good to have the option of using it.
Hi @prashpal,
If you're building TC from source, you can try modifying has_fast_mps_support()
in _mps_utils.py to always return True. On macOS 10.14+, it would then use the Intel GPU.
Yes, I am forcing use_mps() to return True in _mps_utils.py so that Intel graphics is used. use_mps() seems to check two things: has_fast_mps_support() and _tc_config.get_num_gpus() != 0.
With the above change, the test seems to go through Intel graphics, but the validation test crashes. So I wanted to check whether the tests have been verified to work with Intel graphics.
Is my understanding correct?
I have not verified any tests for the Intel graphics MPS code path, since this code path is not currently supported.
Image classification has two phases: feature extraction using a neural network and logistic regression based on the extracted features. The logistic regression currently always runs on CPU. The feature extraction is the same for both training and inference, and always uses CoreML, which should use GPU or CPU, as available.
Activity classification training and inference both use either MPS (on Macs with AMD GPUs) or MXNet (using GPU or CPU, as available).
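The two-phase image-classification design described above can be illustrated with a toy sketch: a fixed, frozen feature extractor (standing in for the CoreML neural network) followed by a logistic regression trained on the extracted features, which is the part that always runs on CPU. The extractor and data here are synthetic and purely illustrative, not Turi Create internals:

```python
# Toy sketch of the two-phase pipeline: frozen feature extraction, then
# CPU-side logistic regression on the extracted features.
import numpy as np

rng = np.random.default_rng(0)


def extract_features(x):
    """Frozen, nonlinear feature extractor (stand-in for the neural net):
    concatenated positive and negative ReLU parts of the input."""
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)], axis=1)


# Synthetic "images": label depends on the sign of the first input dimension.
X = rng.normal(size=(200, 8))
y = (X[:, 0] > 0).astype(float)

# Phase 1: feature extraction (would be CoreML, GPU or CPU as available).
feats = extract_features(X)

# Phase 2: logistic regression on the features, via plain gradient descent.
w = np.zeros(feats.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # sigmoid probabilities
    w -= 0.5 * (feats.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

pred = (1.0 / (1.0 + np.exp(-(feats @ w + b))) > 0.5).astype(float)
print("train accuracy:", np.mean(pred == y))
```

The design choice mirrors the description: only the cheap linear phase needs training per task, so running it on CPU is not the bottleneck, while the expensive extraction is shared between training and inference.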
Thanks for clarifying @nickjong
We should probably just go ahead and use the Intel GPU anyway, since this is less confusing. We need to verify that this works end-to-end, though.
Ok, thanks for the update.
Hi @nickjong , I wanted to check if we have any updates on this.
Sorry, nothing concrete to report yet, although activity classification is something we're actively investigating now.
We currently expect/hope to support Skylake Intel GPUs and later, in June.
Hi,
I'm curious whether there is any public documentation on CoreML's device-selection heuristic. With the addition of 10.15's CoreML preferredMetalDevice API on MLModelConfiguration, I imagined it would be possible to force which MTLDevice an MLModel / Vision request runs on.
In my testing with integrated, discrete, and eGPU devices, it appears only the eGPU consistently runs the CoreML model. My CoreML model is a pipeline model consisting of a MobileNet classifier with multiple outputs (multi-head classifiers attached to a custom feature extractor).
I'm curious to understand device-selection preference for a few reasons:
a) I'd like to ensure my MLModel is fed CIImages backed by textures local to the device inference will be run on, to limit PCI transfers and keep things local.
b) My model is actually fed frames of video, and WWDC '19 / 10.15 introduce VideoToolbox and AVFoundation APIs to help force particular video encoders and decoders onto specific GPUs.
In theory, if all works well, I should be able to specify the same MTLDevice for video decode, preprocessing, CoreML/Vision inference, and subsequent encoding, keeping all IOSurface-backed pixel buffers and textures resident on the same GPU.
Apple has a Pro Apps WWDC video suggesting this is the path forward for fast multi-GPU support and Afterburner decoder support.
Does CoreML *actually* honor suggested device placement?
I am running a Retina MacBook Pro 2018 with a Vega 20 GPU, and trying various methods to get the Vega 20 to light up:
- disabling automatic graphics switching
- disabling automatic graphics switching / setting NSSupportsAutomaticGraphicsSwitching to False
- disabling automatic graphics switching / setting NSSupportsAutomaticGraphicsSwitching to True
- enabling automatic graphics switching / setting NSSupportsAutomaticGraphicsSwitching to False
- enabling automatic graphics switching / setting NSSupportsAutomaticGraphicsSwitching to True
- having a full battery and being plugged into my Apple power adapter
- having a full battery and being plugged into my eGPU
I can only occasionally get the Vega 20 to 'light up', but I can consistently have CoreML run on the eGPU (Radeon 580).
I can inspect the CoreML model and see that its MLModelConfiguration has preferredMetalDevice set to the Vega 20, but Instruments, Xcode, and Activity Monitor all report no GPU usage on the Vega 20, and in fact sometimes no GPU usage at all (not even on the integrated GPU).
Any insight would be most helpful.
Apologies if this is not the best repository to post my query to.
Thanks in advance.
@vade - this isn't the right place to ask this question. I suggest reporting the issue here: https://developer.apple.com/bug-reporting/
I hear you @TobyRoseman. However, having these conversations in the open, rather than behind closed feedback, is helpful to other developers who have similar questions, and it leaves a breadcrumb trail to answers. I'm sure you understand!
But yes, I've asked there and on Stack Overflow as well. Appreciate the response!
On macOS 10.14, activity classification works fine on the CPU. I tried to force it to use MPS by making use_mps() return True in _mps_utils.py, but the test case just crashes.
Image classification seems to be using MPS, but I am not sure why activity classification is not.
Has activity classification been verified to work with MPS? If so, could you share the steps to get it working?