facebookresearch / playtorch

PlayTorch is a framework for rapidly creating mobile AI experiences.
https://playtorch.dev/
MIT License

Does this package utilize the Apple Neural Engine (ANE)? #81

Open vgoklani opened 2 years ago

vgoklani commented 2 years ago

Hello,

Thanks for creating this project! I have a few questions about mobile performance and, more specifically, mobile hardware utilization.

  1. Does this package utilize the hardware features of the Apple Neural Engine (ANE)?
  2. Is this running on the mobile GPU or CPU?
  3. How should we monitor CPU/GPU/memory usage on the device?
  4. How would the performance metrics differ if we instead exported a basic model to ONNX and then used Core ML? Obviously this would depend on the particular model type, but I'm just trying to get a rough number. A sketch of the kind of export pipeline I have in mind is below.
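
For concreteness, the baseline I have in mind is roughly the following. This is just a sketch: the model, input shape, and file names are placeholders, and recent coremltools versions can also convert a TorchScripted model directly instead of going through ONNX.

```python
import torch
import torchvision
import coremltools as ct  # Apple's model conversion toolkit

# Placeholder "basic model" -- MobileNet V2 as an example.
model = torchvision.models.mobilenet_v2(pretrained=True).eval()
example = torch.randn(1, 3, 224, 224)

# Option A: export to ONNX.
torch.onnx.export(model, example, "mobilenet_v2.onnx", opset_version=13)

# Option B: convert a traced TorchScript model to Core ML directly.
# Core ML then schedules the model on CPU, GPU, or the ANE at runtime.
traced = torch.jit.trace(model, example)
mlmodel = ct.convert(traced, inputs=[ct.TensorType(shape=(1, 3, 224, 224))])
mlmodel.save("mobilenet_v2.mlmodel")
```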

Thanks!

cc: @raedle

raedle commented 2 years ago

@vgoklani, great questions!

  1. PlayTorch uses PyTorch Mobile 1.12 as the runtime. Specifically, it uses the Lite Interpreter runtime, which can load TorchScripted models and perform inference. The Lite Interpreter runtime has limited support for GPU (Vulkan/Metal) and NPU (CoreML/NNAPI).
  2. PlayTorch ships CPU-only PyTorch Mobile binaries. We are open to exploring GPU and NPU options if TorchScripted models for GPU/NPU become available (currently the selection is very limited, e.g., MobileNet V2). A minimal export sketch for the Lite Interpreter follows this list.
  3. PlayTorch builds on top of React Native, which itself builds for Android and iOS. If you build your own React Native app with the PlayTorch SDK (react-native-pytorch-core), it's possible to debug and profile the app. Basically, when the app is built in debug mode, any profiler/instrument used for Android/iOS debugging/profiling (e.g., Android Studio Profiler, Xcode Instruments) can be used. Note: if you use the pre-built PlayTorch app, debugging and profiling are not supported.
  4. Can you elaborate on the particular performance metrics you're after? As mentioned above, PlayTorch uses the PyTorch Mobile Lite Interpreter runtime for ML inference. On top of the Lite Interpreter's inference time, there is a small overhead for the interop between JavaScript and C++; however, this overhead is minimal and negligible compared to the inference time itself. If you are interested in the details, it's recommended to check out the JavaScript Interface (JSI), which is used to interface between JavaScript and C++. The jsi.h header is a good starting point.
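
For reference, here is a minimal sketch of how a model is typically prepared for the Lite Interpreter runtime that PlayTorch loads. This is standard PyTorch Mobile export code rather than anything PlayTorch-specific, and the model and file names are just placeholders:

```python
import torch
import torchvision
from torch.utils.mobile_optimizer import optimize_for_mobile

# Placeholder model -- MobileNet V2 is one of the few models with the
# (limited) GPU/NPU TorchScript support mentioned above.
model = torchvision.models.mobilenet_v2(pretrained=True).eval()

# TorchScript the model so the Lite Interpreter can load it.
scripted = torch.jit.script(model)

# Mobile-specific optimizations (CPU by default; the backend argument
# can target the limited Vulkan/Metal support mentioned in 1., but only
# for the ops/models those backends cover).
optimized = optimize_for_mobile(scripted)

# Save in the Lite Interpreter format that the PlayTorch SDK loads.
optimized._save_for_lite_interpreter("mobilenet_v2.ptl")
```

The resulting .ptl file is what the PlayTorch SDK loads for on-device inference.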

Hope this helps