aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost-effective, natively integrated into PyTorch and TensorFlow, and integrated with your favorite AWS services.
https://aws.amazon.com/machine-learning/neuron/

[Brainstorming] Computation Placement on NeuronCore Engines #941

Closed: JigaoLuo closed this issue 2 months ago

JigaoLuo commented 3 months ago

Hi,

I’m curious about the possibility of programming different computations on the various engines of a NeuronCore, which consists of Tensor, Vector, Scalar, and GPSIMD engines.

My interest was sparked by reading the Custom Operators API documentation, which mentions that Custom Operator C++ code and data are placed specifically on the GPSIMD engine for computation (see this figure: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/_images/ncorev2_gpsimd_memory.png). This made me wonder if similar capabilities exist for the other three engines: the Tensor, Vector, and Scalar engines.
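For reference, below is a minimal sketch of what such a Custom Operator looks like, loosely following the element-wise examples in the Custom Operators API documentation. The header path, the `NEURON_LIBRARY` registration, and the surrounding build flow are my assumptions from reading the docs, so please treat this only as an illustration of "C++ code that ends up running on the GPSIMD engine", not as a verified implementation:

```cpp
// Illustrative Custom Operator kernel: per the docs, this C++ body is
// compiled for and executed on the GPSIMD engine of the NeuronCore.
#include <torch/torch.h>
#include "torchneuron/register.h"  // assumed registration header from the Custom Operators docs

// Element-wise negation over a 1-D float tensor.
torch::Tensor tensor_negate_compute(const torch::Tensor& t_in) {
  torch::Tensor t_out = torch::zeros(t_in.sizes(), torch::kFloat);

  auto in_acc  = t_in.accessor<float, 1>();
  auto out_acc = t_out.accessor<float, 1>();
  for (int64_t i = 0; i < t_in.numel(); ++i) {
    out_acc[i] = -in_acc[i];  // executes on GPSIMD, not on the Tensor/Vector/Scalar engines
  }
  return t_out;
}

// Registration sketch so the framework can invoke the op (e.g. as my_ops::tensor_negate).
// The exact macro placement and build steps may differ; see the Custom Operators docs.
NEURON_LIBRARY(my_ops, m) {
  m.def("tensor_negate", &tensor_negate_compute, "tensor_negate");
}
```

As far as I can tell, nothing in this flow lets me pin the computation (or parts of it) to the Tensor, Vector, or Scalar engines, which is what prompts the question below.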

If this is possible, at which abstraction level can the placement onto engines be programmed? Is it at the ML-framework level, the XLA level, the HLO-IR level, or even lower, at the neuron-compiler level?

(I guess not at the ML-framework level, because computations in frameworks are not fully aware of the execution device.)

mrnikwaws commented 3 months ago

Hi @JigaoLuo,

It seems you are asking for a way to control how a model behaves on the different engines of a NeuronCore. The current Custom Operators allow code to run on-chip (on the GPSIMD engine), but they cannot be used to control the other hardware engines. Is that what you are looking for?

We are aware that our customers would like lower-level control over the execution of model segments / kernels, and we are actively working on a good solution. Please keep an eye on upcoming releases.

Let me know if my understanding above is incorrect. Otherwise I'll plan to close this ticket.

JigaoLuo commented 3 months ago

Hi @mrnikwaws, thanks for answering.

I am asking whether it is possible to program the Tensor, Vector, and Scalar engines directly. For example: declaring a kernel function to be computed on the Tensor Engine of a NeuronCore.

mrnikwaws commented 2 months ago

Hi @JigaoLuo,

It is not currently possible to do this, but we are actively working on giving customers finer-grained control. I'm going to close this issue, but please keep an eye on upcoming releases.