Closed fdwr closed 5 months ago
FYI @mingmingtasd.
@fdwr thank you for updating the PR. It now reflects the WG's latest discussion.
@huningxin @inexorabletash can you review the latest?
The Fallback topic warrants its own (non-blocking) issue. I'd propose using @fdwr's overview https://github.com/webmachinelearning/webnn/pull/696#discussion_r1642201778 as a starting point for that discussion. @fdwr please feel free to open a separate issue, copy your fallback overview there, and cross-link to #623.
This change only adds the `MLDeviceType` enum (currently implemented in Chromium-based browsers for early testing) and does not yet add quantize/dequantize operators. Those operators are valuable but technically orthogonal: they are also useful for GPU, and the enum alone is still useful for non-quantized models using float16 weights.
Of the 4 proposals from https://github.com/webmachinelearning/webnn/issues/623, we're starting with the simplest, #1 (simple enum with system-decided fallback). If additional experience warrants more complexity, we'll re-evaluate the other options later.
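For readers who want to see how the enum might be consumed, here is a minimal TypeScript sketch. It assumes the enum values are "cpu", "gpu", and "npu" and that the device type is passed through a `deviceType` option; `ContextOptionsSketch`, `isDeviceType`, and `contextOptionsFor` are hypothetical names made up for illustration and are not part of the spec or of this PR.

```typescript
// Assumed enum values; verify against the current spec text before relying on them.
type MLDeviceType = "cpu" | "gpu" | "npu";

// Hypothetical, trimmed-down shape of the options a page would pass when
// creating a context. Only the device type is modeled here.
interface ContextOptionsSketch {
  deviceType: MLDeviceType;
}

// Hypothetical type guard: true only for the three assumed enum values.
function isDeviceType(value: string): value is MLDeviceType {
  return value === "cpu" || value === "gpu" || value === "npu";
}

// Hypothetical helper: turn an untrusted string (for example a URL parameter)
// into context options, defaulting to "cpu" for anything unrecognized.
function contextOptionsFor(requested: string): ContextOptionsSketch {
  return { deviceType: isDeviceType(requested) ? requested : "cpu" };
}
```

In a browser with WebNN enabled, the result would presumably be handed to context creation, e.g. `navigator.ml.createContext(contextOptionsFor("npu"))`. Under proposal #1, what actually runs on the NPU is then up to the system.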
Wording recommendations for device fallback are welcome. Neural processing units are novel compared with more generic compute devices like CPUs and GPUs, given their limited operator support and inability to execute the entire graph at once.
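To make the fallback concern concrete, here is a rough TypeScript sketch of one way a graph could be split when an NPU supports only some operators. This is purely illustrative and is not the fallback behavior proposed here, which is system-decided and unspecified. `partitionByNpuSupport`, the `Partition` shape, and the idea of a linear operator list are all assumptions made for the example.

```typescript
// Where a run of operators would execute in this toy model.
type Device = "npu" | "fallback";

// A contiguous run of operators assigned to one device.
interface Partition {
  device: Device;
  ops: string[];
}

// Hypothetical helper: walk a linear list of operator names and group
// consecutive operators by whether the NPU is assumed to support them.
// Real graphs are not linear and real implementations may choose differently.
function partitionByNpuSupport(
  ops: readonly string[],
  npuSupported: readonly string[]
): Partition[] {
  const partitions: Partition[] = [];
  for (const op of ops) {
    const device: Device = npuSupported.indexOf(op) >= 0 ? "npu" : "fallback";
    const last = partitions[partitions.length - 1];
    if (last !== undefined && last.device === device) {
      // Same device as the previous operator: extend the current run.
      last.ops.push(op);
    } else {
      // Device changed (or first operator): start a new run.
      partitions.push({ device, ops: [op] });
    }
  }
  return partitions;
}
```

For example, if a hypothetical NPU handled `conv2d`, `relu`, and `matmul` but not `gatherND`, the list `["conv2d", "relu", "gatherND", "matmul"]` would split into three runs: NPU, fallback, NPU. Each device switch is a boundary where data would have to move between devices, which is part of why the fallback wording deserves its own discussion.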