Closed choccccy closed 3 months ago
from #79:
It's been annoying me for a while that the metadata fields for our vector DB are each a bit different:
The ones in pgvector_data_doc use:
tags TEXT[] -- tags associated with the chunk
And pgvector_message uses:
metadata JSON -- additional metadata of the message
I think this second pattern is the one we should be using, so in effect it would just be an update to ogbujipt.embedding.pgvector_data_doc.
This change, from lists of strings to (JSON) dictionaries, is the whole branch.
It does come with the loss of tag-based searching; users are expected to implement that themselves instead.
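Since tag-based search is now left to the user, one way to recover it is to keep tags under a key in the metadata dictionary and filter search results client-side. A minimal sketch, assuming each result row is a mapping with a `metadata` dict and that tags live under a hypothetical `tags` key; neither that row shape nor the helper name `filter_by_tags` comes from ogbujipt:

```python
from typing import Any, Iterable, Iterator, Mapping


def filter_by_tags(
    rows: Iterable[Mapping[str, Any]],
    wanted: Iterable[str],
    match_all: bool = False,
    key: str = 'tags',
) -> Iterator[Mapping[str, Any]]:
    '''
    Yield only the rows whose metadata carries the wanted tags.

    Hypothetical helper: assumes each row has a 'metadata' dict (or None)
    and that tags are stored as a list of strings under `key`.
    '''
    wanted_set = set(wanted)
    for row in rows:
        meta = row.get('metadata') or {}
        row_tags = set(meta.get(key) or [])
        if match_all:
            hit = wanted_set <= row_tags  # every wanted tag must be present
        else:
            hit = bool(wanted_set & row_tags)  # any one wanted tag is enough
        if hit:
            yield row


# Illustrative rows only; real search results may be shaped differently
example_rows = [
    {'content': 'first chunk', 'metadata': {'tags': ['docs', 'intro']}},
    {'content': 'second chunk', 'metadata': {'tags': ['docs']}},
    {'content': 'third chunk', 'metadata': None},
]
intro_rows = list(filter_by_tags(example_rows, ['intro']))
```

If filtering in the database is preferred, PostgreSQL's jsonb containment operator (`@>`) is one option, for example `metadata::jsonb @> '{"tags": ["docs"]}'`, though how well that fits this table's JSON column and indexing is untested here.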
@choccccy I'm good to land this, though I still want to discuss whether we want to do any sort of migration, or just warn people about the need to rebuild their vDBs, and leave them to it.
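If a migration is attempted rather than a rebuild, the per-row conversion itself is small. A sketch, assuming old tags are simply carried over under a hypothetical `tags` key in the new metadata; `tags_to_metadata` is an illustrative name, not part of ogbujipt:

```python
import json
from typing import Any, Dict, Iterable, Optional


def tags_to_metadata(tags: Optional[Iterable[str]], key: str = 'tags') -> Dict[str, Any]:
    '''
    Convert an old-style list of tag strings (the TEXT[] column)
    into a new-style metadata dictionary (the JSON column).
    The `key` name is an assumption, not an established convention.
    '''
    return {key: list(tags or [])}


# Serialized form, as it might be written to a JSON column
old_tags = ['docs', 'intro']
new_metadata_json = json.dumps(tags_to_metadata(old_tags))
```

Whether this runs as part of a migration script or is just documented for users who rebuild their own vDBs is the open question above.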