Open bonesmoses opened 5 days ago
Thanks @bonesmoses. I agree with the assessment and the potential solution.
The only somewhat involved change I can think of would be to make vectorize.rag() multi-column aware for retrieval. I think it would be acceptable to concatenate the columns used in vectorize.table(), since that is how it prepares the text before it is transformed into embeddings. For example, columns => ARRAY['product_name', 'description'] would end up as something like "map symbolic representation of a place" during a vectorize.rag() call on a multi-column vectorize.table(). These changes will need to be made somewhere near here.
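
As a rough illustration of the concatenation idea, the sketch below uses PostgreSQL's built-in concat_ws to join two column values. The products data and the single-space separator are assumptions made for the example; the separator that vectorize.table() actually uses should be checked against the source.

```sql
-- Hypothetical sample data; the table and row are illustrative only.
WITH products (product_name, description) AS (
    VALUES ('map', 'symbolic representation of a place')
)
-- concat_ws joins its arguments with the given separator and skips NULLs.
-- A single space is an assumed separator, not a confirmed one.
SELECT concat_ws(' ', product_name, description) AS rag_input
FROM products;
-- rag_input: map symbolic representation of a place
```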
The SQL function call for `vectorize.table` looks like this:

While `vectorize.init_rag` looks like this:

Note the differences:

- `agent_name` rather than `job_name`
- `table_name` rather than `table`
- `unique_record_id` rather than `primary_key`
- `column` as a single `TEXT` rather than `columns` as a `TEXT` array.

Similar differences exist between the `search` and `rag` functions:

- `agent_name` rather than `job_name`
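
To make the differences listed above concrete, here is a sketch of the two calls side by side. Only the parameter names come from this issue; the products table, the product_id key, and the job and agent names are hypothetical, and these are not the original snippets. It also assumes the remaining parameters (transformer, schedule, and so on) have usable defaults, which may depend on the installed version.

```sql
-- vectorize.table: job_name / "table" / primary_key / columns (TEXT array)
SELECT vectorize.table(
    job_name    => 'product_search',   -- hypothetical job name
    "table"     => 'products',         -- hypothetical table; quoted because TABLE is a keyword
    primary_key => 'product_id',       -- hypothetical key column
    columns     => ARRAY['product_name', 'description']
);

-- vectorize.init_rag: agent_name / table_name / unique_record_id / "column" (single TEXT)
SELECT vectorize.init_rag(
    agent_name       => 'product_chat',  -- hypothetical agent name
    table_name       => 'products',
    unique_record_id => 'product_id',
    "column"         => 'description'    -- quoted because COLUMN is a keyword
);
```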

There's also the naming scheme inconsistency between the two methods in general, i.e. `vectorize.table` rather than `vectorize.init_table`.

Initial investigation suggests `init_rag` is essentially a wrapper for `table`. One potential solution is to deprecate `init_rag` entirely, as the `rag` function should work with any vectorized table with a corresponding project / agent / job name. Additionally, since the `init_rag` method came later and probably contains the "intended" nomenclature, back-porting this to the `table` call may be justified.
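
If the proposal holds, the flow without `init_rag` might look like the sketch below: vectorize a table with `vectorize.table`, then pass the same name to `vectorize.rag` as `agent_name`. This is untested and rests on the assumption that `rag` accepts any existing job name. The `query` parameter name, the table, and all values are hypothetical. A single column is used so the sketch does not depend on multi-column retrieval.

```sql
-- Step 1: vectorize the table directly, without init_rag.
SELECT vectorize.table(
    job_name    => 'product_chat',     -- hypothetical name, reused below
    "table"     => 'products',         -- hypothetical table
    primary_key => 'product_id',       -- hypothetical key column
    columns     => ARRAY['description']
);

-- Step 2: call rag with the job name passed as agent_name.
-- The query parameter name is an assumption; check the installed signature.
SELECT vectorize.rag(
    agent_name => 'product_chat',
    query      => 'What is a map?'
);
```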