webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Device selection with MLDevicePreference and MLPowerPreference #169

Closed anssiko closed 2 years ago

anssiko commented 3 years ago

For follow-up discussion from 2021/05/13 call branching from https://github.com/webmachinelearning/webnn/pull/162:

Add a device preference to the context options in addition to the power preference. This change allows the app to have more control over which type of execution device is used. It enables effective interop with WASM (#156).

anssiko commented 2 years ago

Related discussion from our TPAC meeting regarding fingerprinting surface impact of exposing adapter name in WebGPU/GL: https://www.w3.org/2021/10/26-webmachinelearning-minutes.html#t07

The key use case for graphics APIs seems to be working around GPU driver bugs. That is different from the WebNN use case of optimizing the user experience, power, and performance.

If we consider expanding the current WebNN device selection with more devices in the future, we should revisit the latest WebGPU/GL discussion and adopt any best practices from there.

anssiko commented 2 years ago

This issue was revisited in https://www.w3.org/2021/11/18-webmachinelearning-minutes.html#t04.

Summary: A smart automatic device selection would require workload analysis that is costly and even native platform APIs don’t support that currently. The MLPowerPreference hint allows implementations to make more informed decisions on whether to use purpose-built AI accelerators.

I think investigation into power-performance tradeoffs and requirements of key use cases could help figure out what changes if any are needed to the current hints:

enum MLDevicePreference {
  "default",
  "gpu",
  "cpu"
};

enum MLPowerPreference {
  // Let the user agent select the most suitable behavior.
  "default",

  // Prioritizes execution speed over power consumption.
  "high-performance",

  // Prioritizes power consumption over other considerations such as execution speed.
  "low-power"
};
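
For illustration, here is a rough sketch of how a page might pick these hints per use case. The helper function and the use-case names are invented here for the example; only the enum values and the `navigator.ml.createContext()` entry point come from the spec draft:

```javascript
// Hypothetical helper (not part of the spec): map an application use case
// to the MLDevicePreference / MLPowerPreference hints defined above.
function selectContextOptions(useCase) {
  switch (useCase) {
    case "background-task":   // e.g. photo library indexing overnight
      return { devicePreference: "default", powerPreference: "low-power" };
    case "realtime-video":    // e.g. live camera segmentation
      return { devicePreference: "gpu", powerPreference: "high-performance" };
    default:                  // no strong preference: let the UA decide
      return { devicePreference: "default", powerPreference: "default" };
  }
}

// In a WebNN-capable browser the options would be passed along, e.g.:
//   const context = await navigator.ml.createContext(
//       selectContextOptions("realtime-video"));
```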

One rough mapping of hints to use cases might be:

Currently we do not have explicit hints for the latter two. Fastest response could be implied by the "cpu" selection.

Another way is to look at a workload as a function of time; then we have the following buckets:

To avoid premature optimization, we'd benefit from testing/prototyping with AI accelerators, so let's keep this issue open and collect feedback from implementations.

Meanwhile, feel free to chime in with your suggestions.

anssiko commented 2 years ago

Adding a datapoint: WebCodecs' approach with VideoEncoderConfig.hardwareAcceleration:

enum HardwareAcceleration {
  "no-preference",
  "prefer-hardware",
  "prefer-software",
};

anssiko commented 2 years ago

Renamed this issue to be broader than the AI accelerator device selection.

See also https://github.com/webmachinelearning/model-loader/issues/30 for the Model Loader API "default" vs. "auto" discussion.

anssiko commented 2 years ago

Looking at other domains for solutions to similar problems, videoElement.canPlayType() answers the question: "Can I play video of type X given codec parameters Y?"

The canPlayType(type) method must return the empty string if type is a type that the user agent knows it cannot render or is the type "application/octet-stream"; it must return "probably" if the user agent is confident that the type represents a media resource that it can render if used in with this audio or video element; and it must return "maybe" otherwise. Implementers are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" for a type that allows the codecs parameter if that parameter is not present.

I should note this API may not have the greatest interop story, but it's probably worth looking into for inspiration. There's no perfect solution to this type of problem AFAICT.
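
To make the analogy concrete, a WebNN equivalent of that tri-state answer might look something like the toy sketch below. The `canCreateContext` function and its support table are invented for illustration, not a proposed API; only the "" / "maybe" / "probably" return values mirror `HTMLMediaElement.canPlayType()`:

```javascript
// Toy illustration: a canPlayType()-style capability query for WebNN
// device preferences. The function and its table are hypothetical.
function canCreateContext(devicePreference) {
  const support = new Map([
    ["cpu", "probably"],  // UA is confident it can honor this preference
    ["gpu", "maybe"],     // plausible, but not confirmed
  ]);
  // The empty string means the UA knows it cannot honor the preference.
  return support.get(devicePreference) ?? "";
}
```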

kenchris commented 2 years ago

Attempt to use single words in enums if possible

enum MLPowerPreference {
  "default",
  "high-performance",
  "low-power"
};

Likewise, the above could very well use "performance" and "efficiency" instead. Also, I think it is more common to use "auto" (up to the platform) than "default". "auto" also better reflects that the behavior might change depending on system settings, whereas "default" sounds like a fixed value:

enum MLPowerPreference {
  "auto",
  "performance",
  "efficiency"
};

Given @anssiko's comment above, it could add "latency" as well.

wchao1115 commented 2 years ago

The device preference has now been changed to a normative device type enum while the MLPowerPreference remains a hint.

kenchris commented 2 years ago

I would still avoid using hyphens in the enums if possible and just use "performance" and "efficiency"

anssiko commented 2 years ago

Thanks @kenchris for paying attention to details.

@wchao1115 @huningxin your thoughts on the naming suggestion https://github.com/webmachinelearning/webnn/issues/169#issuecomment-1145022693?

Personally I don't have a strong preference either way. My general recommendation is to pick names that stand the test of time. Other considerations for naming are intelligibility, consistency, and symmetry. You also want to minimize conceptual weight.

Once the enum values naming discussion has settled, I think this issue can be closed.

kenchris commented 2 years ago

There is a precedent for enum values mostly using hyphens as a kind of namespacing, like "new-client", "existing-client-retain", "existing-client-navigate" or "landscape", "landscape-primary", "portrait", "portrait-secondary".

I also don't expect we will ever see values like "high-power" and "low-performance". In this case, they are mostly exclusive opposites, and that can be expressed with single words: "performance" and "efficiency". Even in CPU design we talk about "performance cores" and "efficiency cores", not "high-performance cores" and "low-power cores", so I think there is a precedent here.

huningxin commented 2 years ago

"performance" and "efficiency" sounds good, thanks @kenchris !

However, I suppose the current naming convention comes from the GPU APIs' design; for example, WebGPU defines GPUPowerPreference:

enum GPUPowerPreference {
    "low-power",
    "high-performance"
};

and WebGL defines WebGLPowerPreference:

enum WebGLPowerPreference { "default", "low-power", "high-performance" };

Should we keep aligned?
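
One concrete upside of the current alignment: because MLPowerPreference reuses the WebGPU/WebGL spelling, a non-"default" value can be handed to WebGPU's `requestAdapter()` unchanged. The pass-through helper below is hypothetical; only `GPUPowerPreference` and `GPURequestAdapterOptions.powerPreference` come from the WebGPU spec:

```javascript
// Hypothetical pass-through from an MLPowerPreference value to a
// GPUPowerPreference value, illustrating the naming alignment.
function toGPUPowerPreference(mlPowerPreference) {
  // GPUPowerPreference has no "default" member, so omit the hint instead.
  return mlPowerPreference === "default" ? undefined : mlPowerPreference;
}

// e.g. navigator.gpu.requestAdapter({
//   powerPreference: toGPUPowerPreference("low-power")
// });
```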

kenchris commented 2 years ago

As WebGPU isn't shipped yet, could you file an issue to see if they might consider the change as well? WebGL is in many ways inconsistent with the rest of the web, so I don't think we need to align with that.

anssiko commented 2 years ago

@kenchris as the proposer, I'd suggest you do that so the credit goes to the right person and you can follow up as needed. I suspect the established WebGL legacy has influenced the WebGPU naming (consistency, symmetry). That said, sometimes it is appropriate to cut ties with the past when it comes to API design and start anew.

anssiko commented 2 years ago

The functional and privacy-related aspects of this issue have been addressed by making the device preference a normative MLDeviceType enum, while MLPowerPreference remains a hint.

As for the enum naming, MLPowerPreference currently aligns with the WebGPU GPUPowerPreference naming convention. If the WebGPU API changes its naming as was proposed, WebNN will align with it.

Based on my assessment this issue can be closed.

kenchris commented 1 year ago

Maybe instead we want to choose the device type given the below:

  "auto",
  "throughput",
  "latency",
  "efficiency"
};