Closed tomusher closed 7 months ago
You say that you might want to split differently for different indexes, but then you move the responsibility onto the data object rather than the vector index class. Is where this lives now a long-term solution? I don't fully understand where the most useful location for this is, so I may be being naive here.
As part of https://github.com/wagtail/wagtail-vector-index/pull/54 that logic now lives in the DocumentConverter. Still not entirely sure that's the most logical place but at least it's somewhat composable.
Previously, the splitting behaviour was defined through settings on an embedding backend. If you wanted to split differently for different indexes, you needed to create multiple embedding backends.
As the only thing that needs to be aware of how content is split is the VectorIndexable object itself, the logic has now moved there and any customisations can be made on the indexed type directly.
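
To illustrate the idea of customising splitting on the indexed type rather than on the backend, here is a minimal sketch. It does not use the real wagtail-vector-index API: `IndexableSketch`, `ShortChunkPage`, `EmbeddingBackendSketch`, `split_by_length` and the `chunk_size` attribute are all hypothetical names made up for this example, and the fixed-width splitter is only a stand-in for whatever splitting the package actually does.

```python
from dataclasses import dataclass
from typing import ClassVar


def split_by_length(text: str, chunk_size: int) -> list[str]:
    """Naive fixed-width splitter, used here only for illustration."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


@dataclass
class IndexableSketch:
    """Hypothetical stand-in for an indexed type (not the real VectorIndexable)."""

    body: str
    # Splitting is configured on the indexed type, not on the backend.
    chunk_size: ClassVar[int] = 20

    def split_text(self) -> list[str]:
        return split_by_length(self.body, self.chunk_size)


@dataclass
class ShortChunkPage(IndexableSketch):
    """A type that wants smaller chunks simply overrides the setting."""

    chunk_size: ClassVar[int] = 8


class EmbeddingBackendSketch:
    """Hypothetical backend: it only embeds chunks and knows nothing about splitting."""

    def embed(self, chunks: list[str]) -> list[list[float]]:
        # Placeholder "embedding": one number per chunk (its length).
        return [[float(len(chunk))] for chunk in chunks]


backend = EmbeddingBackendSketch()
text = "abcdefghijklmnopqrstuvwxyz"

default_chunks = IndexableSketch(text).split_text()
short_chunks = ShortChunkPage(text).split_text()
short_vectors = backend.embed(short_chunks)
```

Under these assumptions, a single backend instance serves both types, and each type decides its own chunking by overriding one attribute or the `split_text` method.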