vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Support tensor math expressions in indexing language #27822

Open sviatoslavp opened 1 year ago

sviatoslavp commented 1 year ago

Hey! I need to derive a weighted-average tensor at indexing time from several tensors (each computed from a field with an ONNX model), e.g.:

field emb1 type tensor<float>(x[512]) {
    indexing: summary
}
field emb2 type tensor<float>(x[512]) {
    indexing: summary
}
field emb3 type tensor<float>(x[512]) {
    indexing: summary
}
field emb_v1 type tensor<float>(x[512]) {
    indexing: (input emb1 * 0.10) + (input emb2 * 0.20) + (input emb3 * 0.7) | attribute | summary | index
...

However, the indexing language does not currently support tensor arithmetic.

It would be great to have the following operations supported:

  1. Tensor and scalar multiplication e.g. input emb1 * 0.10
  2. Sum of tensors e.g. input emb1 + input emb2
  3. Linear combination e.g. (input emb1 * 0.10) + (input emb2 * 0.20)

I don't have a use case for concatenation, but it may be useful to have as well, e.g. input emb1 . input emb2

This is not a show-stopper, as I can calculate the linear combination outside Vespa and feed the result, but native support would make the solution more elegant.
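For anyone who needs the workaround in the meantime, here is a minimal sketch of computing the linear combination outside Vespa and building a feed operation. It uses plain Python lists; the namespace `mynamespace`, the document type `mydoc`, and the helper names are hypothetical, and the `{"values": [...]}` form for an indexed tensor is an assumption that should be checked against the document JSON format for your Vespa version.

```python
import json
from typing import Dict, List, Sequence


def linear_combination(tensors: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Return sum(weights[k] * tensors[k]) computed element-wise."""
    if not tensors or len(tensors) != len(weights):
        raise ValueError("need one weight per tensor")
    dim = len(tensors[0])
    if any(len(t) != dim for t in tensors):
        raise ValueError("all tensors must have the same length")
    return [sum(w * t[i] for t, w in zip(tensors, weights)) for i in range(dim)]


def build_put_operation(doc_id: str,
                        emb1: Sequence[float],
                        emb2: Sequence[float],
                        emb3: Sequence[float]) -> Dict:
    """Build a feed operation carrying the precomputed emb_v1 field.

    The weights mirror the example in this issue (0.10, 0.20, 0.7).
    The document id scheme below is hypothetical.
    """
    emb_v1 = linear_combination([emb1, emb2, emb3], [0.10, 0.20, 0.7])
    return {
        "put": f"id:mynamespace:mydoc::{doc_id}",
        "fields": {
            "emb1": {"values": list(emb1)},
            "emb2": {"values": list(emb2)},
            "emb3": {"values": list(emb3)},
            "emb_v1": {"values": emb_v1},
        },
    }


if __name__ == "__main__":
    # Tiny 2-dimensional vectors stand in for the 512-dimensional embeddings.
    operation = build_put_operation("doc1", [1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
    print(json.dumps(operation, indent=2))
```

With this approach `emb_v1` becomes a plain fed field (for example `indexing: attribute | summary | index`), and the embeddings have to be computed client-side as well.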

andreer commented 8 months ago

Another use case for this is dimensionality reduction, such as slicing (for MRL embeddings) or random projection, both of which can be expressed in tensor math.
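To make the two reductions concrete, here is a rough sketch in plain Python of what they compute. The function names are hypothetical and this is not Vespa code: slicing keeps the first `dim` components (re-normalizing afterwards is a common choice, assumed here rather than required), and random projection multiplies the embedding by a fixed Gaussian matrix.

```python
import math
import random
from typing import List, Sequence


def slice_embedding(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components, as done for MRL-style embeddings."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    return list(embedding[:dim])


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def random_projection_matrix(in_dim: int, out_dim: int,
                             seed: int = 0) -> List[List[float]]:
    """Build a fixed out_dim x in_dim Gaussian matrix scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)  # fixed seed so every document uses the same matrix
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(embedding: Sequence[float],
            matrix: Sequence[Sequence[float]]) -> List[float]:
    """Matrix-vector product: one output component per matrix row."""
    if any(len(row) != len(embedding) for row in matrix):
        raise ValueError("matrix width must match embedding length")
    return [sum(m * x for m, x in zip(row, embedding)) for row in matrix]


if __name__ == "__main__":
    emb = [0.5, -0.25, 0.75, 0.1, -0.6, 0.3, 0.2, -0.4]
    print(l2_normalize(slice_embedding(emb, 4)))
    print(project(emb, random_projection_matrix(in_dim=8, out_dim=3)))
```

Both are linear maps over the `x` dimension, which is why they would fit naturally alongside the scalar multiplication and tensor sum requested above.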