tl-its-umich-edu / annoto-gai

This is the GitHub project for the Annoto GAI work

LangChain Implementation referencing incorrect sources #40

Closed takposha closed 3 months ago

takposha commented 4 months ago

The LangChain implementation works correctly when run on one file at a time. However, when a single script loops over multiple files, the Chroma vector database that stores the transcript contents does not appear to reset between iterations. Instead of creating a new database for each transcript, it appends to the existing one. As a result, the generated questions cite incorrect sources and can belong to the wrong transcript.

The fix is simple enough: assign a unique database name to each transcript so that each transcript gets its own separate database.
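A minimal sketch of how such a unique name could be derived, assuming the script has a file path for each transcript. The helper `collection_name_for` is hypothetical and not taken from this repository. It builds a name from the file stem plus a short hash of the full path, so two transcripts with the same file name in different folders still get different names. The character and length limits are chosen conservatively and should be checked against Chroma's own naming rules.

```python
import hashlib
import re
from pathlib import Path


def collection_name_for(transcript_path: str) -> str:
    """Return a stable, per-transcript name (hypothetical helper)."""
    path = Path(transcript_path)
    # Keep only lowercase letters and digits from the file stem, joined by hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", path.stem.lower()).strip("-")[:40].strip("-")
    # A short hash of the whole path separates files that share a stem.
    digest = hashlib.sha1(str(path).encode("utf-8")).hexdigest()[:8]
    return f"t-{slug}-{digest}" if slug else f"t-{digest}"


if __name__ == "__main__":
    # Example paths are placeholders, not files from this project.
    for transcript in ["transcripts/week1.txt", "transcripts/week2.txt"]:
        name = collection_name_for(transcript)
        # Assumption: the name would then be passed to the vector store, e.g.
        # Chroma.from_documents(docs, embeddings, collection_name=name)
        print(transcript, "->", name)
```

Because the name depends only on the path, rerunning the script on the same transcript reuses the same name, so existing data for that transcript may still need to be cleared before re-indexing.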

pushyamig commented 3 months ago

@takposha How do I test this use case?