Use multiple PDF files from the insurance domain for chat with PDF.
Change the title as suggested in the mail: "Teradata Enterprise Vector Store: vectorizing PDFs".
For chunking the PDF text, can you use in-database STO (Script Table Operator) with Python?
Use Hugging Face (HF) models to create embeddings via the BYOM approach (parallel CPU inferencing).
Use a third-party LLM (OpenAI/Bedrock/Gemini) for the final answer.
You will also have to use the HF model to convert the question into embeddings.
Also add a visualization (embeddings projected to 2D) to show the chunk selected for each question. I think a scatter plot that shows all chunks, the question, and the selected chunk would work well.
Store the PDFs in an object store or in a Vantage table (pointing to the object store).
No need to add a chat UI. Provide predefined questions in a dropdown, and have it answer based on the selected question.
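
To make the chunking and chunk-selection steps above concrete, here is a minimal plain-Python sketch. It is only an illustration under stated assumptions: the function names (chunk_text, cosine_similarity, top_k_chunks), the character-based chunk size, and the overlap are all hypothetical choices, not something specified in the mail. In the real flow the chunking logic would run inside the in-database script, and the vectors would come from the HF embedding model rather than being passed in by hand.

```python
import math


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    chunk_size and overlap are illustrative defaults (assumptions).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question_vec, chunk_vecs, k=1):
    """Return the indices of the k chunk vectors most similar to the question."""
    scored = [(cosine_similarity(question_vec, vec), idx)
              for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


if __name__ == "__main__":
    # Toy stand-ins: real vectors would come from the HF embedding model.
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
    print(top_k_chunks([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]], k=2))
```

The indices returned by top_k_chunks are the ones that would be highlighted as the selected chunks in the scatter plot, and the matching chunk texts are what would be passed to the third-party LLM as context.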
New changes: