AlexanderLutsenko / nobuco

Pytorch to Keras/Tensorflow/TFLite conversion made intuitive
MIT License

multihead attention - no converter found #65

Closed · johndpope closed this issue 1 month ago

johndpope commented 1 month ago

❌ Validation exception on node 'MultiheadAttention':
PyTorch op: MultiheadAttention(
  (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
)
Keras op: ChangeOrderingLayer(func=<function converter_MultiheadAttention.<locals>.func at 0x75b6b7e9f380>)
Input args: ('Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)')
Input kwargs: {}
Output tensors: ['Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[1, 4096, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

❌ Validation exception on node 'MultiheadAttentionModel':
PyTorch op: MultiheadAttentionModel(
  (multihead_attn): MultiheadAttention(
    (out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
  )
)
Keras op: <nobuco.layers.container.TransientContainer object at 0x75b6b7d08290>
Input args: ('Tensor(shape=[1, 128, 4096], dtype=torch.float32)',)
Input kwargs: {}
Output tensors: ['Tensor(shape=[1, 128, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

[Nobuco] Converting (DONE): |████████████████████████████████████████████████████████████████████████████████| 26/26 ops [00:00]
Legend:
  Green — conversion successful
  Yellow — conversion imprecise
  Red — conversion failed
  Red — no converter found
  Bold — conversion applied directly

johndpope commented 1 month ago

Sample code is in https://github.com/AlexanderLutsenko/nobuco/pull/64

Is this a known issue, or just something that's slipped by because nobody has needed it?
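
The actual sample lives in PR #64 and isn't reproduced here, but a minimal repro matching the shapes in the logs above might look like the following sketch. embed_dim=128 and the permutes follow the logged tensor shapes; num_heads=8 is an assumption.

```python
import torch
import torch.nn as nn
import nobuco

# Hypothetical minimal reproduction (the real sample is in PR #64).
class MultiheadAttentionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.multihead_attn = nn.MultiheadAttention(embed_dim=128, num_heads=8)

    def forward(self, x):
        # (1, 128, 4096) -> (4096, 1, 128): nn.MultiheadAttention
        # defaults to (seq, batch, embed) layout
        x = x.permute(2, 0, 1)
        out, _ = self.multihead_attn(x, x, x)
        return out.permute(1, 2, 0)  # back to (1, 128, 4096)

x = torch.rand(1, 128, 4096)
keras_model = nobuco.pytorch_to_keras(MultiheadAttentionModel().eval(), args=[x])
```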

johndpope commented 1 month ago

Basically, I crafted some Keras/Torch classes; it's mostly working now: https://github.com/johndpope/IMF/blob/feat/tensorflow-cips/tf-export2.py
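
Roughly, that workaround amounts to transferring nn.MultiheadAttention weights into a tf.keras.layers.MultiHeadAttention by hand. Below is a minimal sketch of such a transfer, assuming bias=True, equal q/k/v dimensions, and num_heads=8 (the linked script is the authoritative version; the weight-layout mapping is the part most likely to need adjustment across Keras versions):

```python
import numpy as np
import tensorflow as tf
import torch.nn as nn

embed_dim, num_heads = 128, 8            # embed_dim from the log; num_heads assumed
head_dim = embed_dim // num_heads

torch_mha = nn.MultiheadAttention(embed_dim, num_heads)
keras_mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=head_dim)

# Keras builds MultiHeadAttention lazily: run a dummy forward pass first
# so its 8 weight tensors actually exist before set_weights().
dummy = np.zeros((1, 4, embed_dim), dtype=np.float32)
keras_mha(dummy, dummy, dummy)

# PyTorch packs the q/k/v projections into one (3*embed, embed) matrix;
# Keras keeps per-head kernels of shape (embed, num_heads, head_dim).
w_q, w_k, w_v = torch_mha.in_proj_weight.detach().numpy().reshape(3, embed_dim, embed_dim)
b_q, b_k, b_v = torch_mha.in_proj_bias.detach().numpy().reshape(3, embed_dim)
w_o = torch_mha.out_proj.weight.detach().numpy()
b_o = torch_mha.out_proj.bias.detach().numpy()

split = lambda w: w.T.reshape(embed_dim, num_heads, head_dim)
keras_mha.set_weights([
    split(w_q), b_q.reshape(num_heads, head_dim),
    split(w_k), b_k.reshape(num_heads, head_dim),
    split(w_v), b_v.reshape(num_heads, head_dim),
    w_o.T.reshape(num_heads, head_dim, embed_dim), b_o,
])
```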

AlexanderLutsenko commented 1 month ago

> Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights.

Ah, I see. Some Keras layers do not initialize their weights until the first forward pass. If that's the case here, the initialization needs to happen inside that specific node's converter. I'll take a look at it later.
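
That lazy-build behavior is easy to confirm in isolation (the num_heads/key_dim values below are arbitrary):

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=16)
print(len(layer.get_weights()))  # 0: nothing built yet, hence "expecting 0 weights"

x = np.zeros((1, 4, 128), dtype=np.float32)
layer(x, x, x)                   # first call builds the q/k/v/out sublayers
print(len(layer.get_weights()))  # 8: four kernels and four biases
```

So a fixed converter would presumably need to run the Keras layer on dummy inputs (or otherwise build it) before calling set_weights().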