pathwaycom / pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
https://pathway.com

Introducing Pathway Guru on Gurubase.io #72

Closed by kursataktas 2 weeks ago

kursataktas commented 1 month ago

Hello team,

I'm the maintainer of Anteon. We have created Gurubase.io with the mission of building a centralized, open-source tool-focused knowledge base. Essentially, each "guru" is equipped with custom knowledge to answer user questions based on collected data related to that tool.

I wanted to let you know that I've manually added the Pathway Guru to Gurubase. Pathway Guru uses data from this repo and from the docs to answer questions with the help of an LLM.

In this PR, I showcased the "Pathway Guru" badge, which highlights that Pathway now has an AI assistant available to help users with their questions. Please let me know your thoughts on this contribution.

Additionally, if you want me to disable Pathway Guru in Gurubase, just let me know; that's totally fine.

CLAassistant commented 1 month ago

CLA assistant check
All committers have signed the CLA.

dxtrous commented 1 month ago

Hey @kursataktas, kudos on the gurubase project. We have tested the "Pathway guru" and the first results are pretty encouraging. Could you please clarify the following:

  1. How / when does your system update when the connected codebases & documentation change?
  2. Who would be expected to maintain the Pathway guru - would any input from our team be needed?
  3. What is the policy for using gurubase outputs? Are the links to generated articles permanent (frozen, kept up to date)? Can the generated content be used freely (respecting the original copyrights)?

kursataktas commented 1 month ago

Thanks for the review, @dxtrous. Here are my answers:

  1. How / when does your system update when the connected codebases & documentation change?
  2. Who would be expected to maintain the Pathway guru - would any input from our team be needed?

We are currently developing an admin panel where maintainers can view and edit the sources their "guru" uses to generate answers. We will also introduce a change detection feature, which will refresh the sources whenever any are modified or a new version is released. Until we release these features, I manually maintain the Pathway Guru by monitoring this repository for updates. I’d be happy to notify you once the panel is ready for your use, in case you'd like to take over maintenance.

  3. What is the policy for using gurubase outputs? Are the links to generated articles permanent (frozen, kept up to date)? Can the generated content be used freely (respecting the original copyrights)?

The generated articles follow the licenses of their sources. For more details, please refer to section 2 of our Terms of Use. At present, generated content remains unchanged, but we will regenerate it once the change detection system is fully implemented. We are also considering another approach where no generated content is stored at all, but this is still under discussion.

As many of my responses reference ongoing development rather than completed features, I completely understand if you prefer to wait before adding Pathway Guru to the README. I also appreciate your patience, as we released Gurubase only three weeks ago and it's still in its early days.

dxtrous commented 2 weeks ago

Hey @kursataktas, let's move forward with this, why not. We do monitor outbound links occasionally, but please keep us posted on any breaking changes. Good luck with your project!

kursataktas commented 2 weeks ago

Hey @dxtrous

I noticed this commit removed the badge. Was that intentional, or was it a mistake?

dxtrous commented 2 weeks ago

Hey @kursataktas, definitely not intentional. It looks like a glitch in our Copybara setup for syncing with internal repos; we will investigate. Thanks.

kursataktas commented 13 hours ago

Hi @dxtrous,

I’d like to update you on the release of the Maintainer Panel feature on Gurubase. With this panel, you can add, remove, or update data sources, as well as change the logo and more. You can find the details here.

In the near future, I’m planning to include analytics insights in this panel, such as the number of questions asked, the most frequently asked ones, and more. I’ll be sure to update this thread once it’s available. However, in case I miss it, I highly recommend joining our Discord channel to stay updated.

If you’d prefer that I don’t update this thread anymore, please let me know.