philloooo opened this issue 3 months ago
Thank you for the implementation findings. Let me digest those fields again (ironically, I have a short-term memory when it comes to LSTM's details 😅).
@shiyi9801, do you think this would have also helped your implementation? I recall you doing some concatenation manipulation here https://chromium-review.googlesource.com/c/chromium/src/+/5320673 and here https://chromium-review.googlesource.com/c/chromium/src/+/5339174.
No, it actually does the opposite for the DML backend. If the weights/biases are passed separately for each direction, then the DML backend has to do another concatenation to combine the forward and backward weights/biases. (The previous concatenation was to combine the bias and recurrent bias.) It looks like CoreML prefers separate operands for each direction while DML prefers them as a whole...
https://learn.microsoft.com/en-us/windows/win32/api/directml/ns-directml-dml_lstm_operator_desc
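To illustrate the extra step (placeholder types and helpers, not the actual Chromium code): if WebNN handed the backend one operand per direction, the DML path would need to rebuild the single stacked weight tensor that DML_LSTM_OPERATOR_DESC expects.

```ts
// Hypothetical sketch: per-direction weights arriving from WebNN would
// have to be stitched back together, since DML's LSTM takes one combined
// WeightTensor covering all directions. `Tensor` and `concat` are
// placeholders for the backend's internal graph-building types.
interface Tensor { shape: number[]; }
declare function concat(inputs: Tensor[], axis: number): Tensor;

function combineDirectionsForDml(
    forwardWeight: Tensor,     // e.g. [1, 4 * hiddenSize, inputSize]
    backwardWeight: Tensor     // e.g. [1, 4 * hiddenSize, inputSize]
): Tensor {
  // Rebuild [numDirections, 4 * hiddenSize, inputSize] along axis 0,
  // i.e. the extra concatenation described above.
  return concat([forwardWeight, backwardWeight], /*axis=*/ 0);
}
```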
/cc @miaobin for GRU
Separate weights & biases might help simplify the emulation code, e.g. the Chromium TFLite backend, which needs to slice each tensor out of the combined one when using it.
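For comparison, a rough sketch of the slicing that the combined layout forces on an emulating backend today (again with placeholder types; the real Chromium TFLite code differs):

```ts
// Hypothetical sketch: with the current combined layout, an emulating
// backend slices each direction's weights out of the stacked
// [numDirections, 4 * hiddenSize, inputSize] tensor before use.
// `Tensor` and `slice` are placeholders for backend internals.
interface Tensor { shape: number[]; }
declare function slice(input: Tensor, starts: number[], sizes: number[]): Tensor;

function splitDirections(combinedWeight: Tensor,
                         hiddenSize: number,
                         inputSize: number): Tensor[] {
  const [numDirections] = combinedWeight.shape;
  const perDirection: Tensor[] = [];
  for (let d = 0; d < numDirections; ++d) {
    perDirection.push(
        slice(combinedWeight, [d, 0, 0], [1, 4 * hiddenSize, inputSize]));
  }
  return perDirection;  // passing separate operands would skip this step
}
```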
/cc @fujunwei
Hi! @huningxin @fdwr, what do you think about this now? This would unblock the CoreML implementation for lstm and gru by avoiding the need for constant folding or decomposition.
This is feedback from trying to implement gru/lstm on CoreML, driven by https://github.com/webmachinelearning/webnn/issues/689. In the current WebNN design, the biases and weights for the forward and backward directions are stacked together into single tensors when the operator is bidirectional; similarly, the activations are passed as an array instead of as distinct, separately named parameters (sketch below).
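For concreteness, here is roughly what a bidirectional lstm() call looks like today (shapes and option names per my reading of the current spec; illustrative, not normative):

```ts
// Rough sketch of the current WebNN surface for a bidirectional LSTM:
// forward and backward weights are stacked into single operands, and the
// activations are one array rather than named parameters.
// Assumed to be in scope: an MLGraphBuilder and the tensor data.
declare const builder: any;
declare const input: any;
declare const weightData: Float32Array, recurrentWeightData: Float32Array;
declare const steps: number, hiddenSize: number, inputSize: number;

const weight = builder.constant(
    {dataType: 'float32', shape: [/*directions*/ 2, 4 * hiddenSize, inputSize]},
    weightData);
const recurrentWeight = builder.constant(
    {dataType: 'float32', shape: [2, 4 * hiddenSize, hiddenSize]},
    recurrentWeightData);
const outputs = builder.lstm(input, weight, recurrentWeight, steps, hiddenSize, {
  direction: 'both',
  // One array covering all gates, rather than named parameters.
  activations: ['sigmoid', 'tanh', 'tanh'],
});
```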
I think it's more explicit and cleaner to follow CoreML's design, which takes the weights/biases as separate operands per direction and the activations as distinct named parameters:

- recurrent_activation
- cell_activation
- activation

What do you think? (A rough sketch is below.)
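A strawman of what the CoreML-style shape could look like (the per-direction and per-activation option names below are made up for discussion; none of them exist in WebNN today):

```ts
// Strawman sketch: one operand per direction and named activation options
// instead of an array. Discussion material, not spec text.
declare const builder: any, input: any;
declare const steps: number, hiddenSize: number;
declare const forwardWeight: any, backwardWeight: any;
declare const forwardRecurrentWeight: any, backwardRecurrentWeight: any;

const outputs = builder.lstm(input, forwardWeight, forwardRecurrentWeight,
                             steps, hiddenSize, {
  direction: 'both',
  backwardWeight,                  // separate operand per direction
  backwardRecurrentWeight,
  recurrentActivation: 'sigmoid',  // distinct parameters, matching
  cellActivation: 'tanh',          // CoreML's recurrent_activation /
  activation: 'tanh',              // cell_activation / activation
});
```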
This also helps unblock the lstm/gru implementation on CoreML by removing its dependency on the outcome of the MLConstantOperand discussion.
@fdwr @huningxin