pinecone-io / pinecone-vercel-starter

Pinecone + Vercel AI SDK Starter
https://pinecone-vercel-example.vercel.app

Enhanced Document Tracking: Multi-Website Support and Continuous Relevancy Insights #17

Open HarounAns opened 11 months ago

HarounAns commented 11 months ago

Problem

In the existing design, the fetched documents are visible only right after the index has been freshly seeded. This is restrictive: it covers one website at a time, which makes comprehensive tracking and management difficult, and there is no way to determine the relevancy of documents outside this immediate post-seeding phase. The new design supports multiple websites simultaneously and surfaces any document deemed relevant, rather than only the most recently seeded ones.

Solution

Implemented a relevantDocs section that shows the documents fetched across various websites. This makes retrieval more transparent: the user can see which documents are being retrieved, regardless of recent crawler activity.

https://github.com/pinecone-io/pinecone-vercel-starter/assets/20847319/b07ca793-c304-449b-bcce-64991d3b0e7b

^ Notice how the fetched documents come from different websites!
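
To illustrate the idea behind the relevantDocs section, here is a minimal sketch of grouping retrieved documents by the website they came from. It is not the code in this PR: the `RetrievedDoc` shape, its field names, and the `groupBySite` helper are all hypothetical, and it assumes each retrieved match carries the URL of the page it was crawled from.

```typescript
// Hypothetical shape of a retrieved match. The field names are assumptions
// for illustration, not the starter's actual types.
interface RetrievedDoc {
  id: string;
  score: number; // similarity score returned by the vector query
  url: string;   // page the chunk was crawled from
  text: string;
}

// Group retrieved documents by source website (hostname), so a UI section
// can show which sites contributed to the current answer.
function groupBySite(docs: RetrievedDoc[]): Map<string, RetrievedDoc[]> {
  const bySite = new Map<string, RetrievedDoc[]>();
  for (const doc of docs) {
    const site = new URL(doc.url).hostname;
    const bucket = bySite.get(site) ?? [];
    bucket.push(doc);
    bySite.set(site, bucket);
  }
  // Within each site, list the most relevant documents first.
  bySite.forEach((bucket) => bucket.sort((a, b) => b.score - a.score));
  return bySite;
}

// Example input with made-up documents from two different websites.
const sampleDocs: RetrievedDoc[] = [
  { id: "a", score: 0.71, url: "https://example.com/pricing", text: "..." },
  { id: "b", score: 0.88, url: "https://example.org/docs/intro", text: "..." },
  { id: "c", score: 0.93, url: "https://example.com/about", text: "..." },
];

const grouped = groupBySite(sampleDocs);
grouped.forEach((docs, site) => {
  console.log(site, docs.map((d) => `${d.id} (${d.score})`));
});
```

Grouping by hostname at display time keeps the view independent of which site was crawled most recently, which is the property the Solution describes.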

Type of Change

Test Plan

  1. Navigate to the new relevantDocs section.
  2. Ensure that the fetched documents across different websites are being displayed.
  3. Test fetching documents after both loading and not loading the index via the crawler.
  4. Confirm that the displayed documents in the relevantDocs section are consistent with the expected results based on the recent actions with the crawler.

HarounAns commented 11 months ago

@rschwabco please let me know your thoughts

HarounAns commented 7 months ago

@rschwabco do you feel this PR has any value? If not, I can close it.