video-db / Director

Director is an open source framework for creating AI agents to manage and interact with your media library.
https://director.videodb.io

View transcript agent #70

Open ashish-spext opened 1 week ago

ashish-spext commented 1 week ago

Confirm this is a new agent request

Describe the agent

This agent simply gets the transcript of the video and adds it to the context for other agents.

Additional Context

No response

sarfarazsiddiquii commented 1 week ago

Hi @ashish-spext, is this issue open for contributions? If so, can you assign it to me? I’d like to work on it.

ashish-spext commented 1 week ago

Sure @sarfarazsiddiquii

For v1, let's implement a basic agent with:

  1. Spoken Index Check:

    • The agent indexes spoken words if the index does not exist.
  2. Default Mode - Text Transcription:

    • It should send the transcription text as TextContent.
  3. Timestamp Mode (Optional):

    • If the user requests transcription with timestamps, it should group the transcript into chunks of a specified duration (default 2 minutes).
      • It should format the grouped text with timestamps and send it as TextContent.
  4. Transcript Context for LLMs:

    • The agent should return the transcript dictionary in its response.
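
The timestamp mode in point 3 could be sketched roughly as below. This is only an illustration, not the agent's actual code: it assumes the transcript is a list of dicts with `start` and `end` (in seconds) and `text` keys, and the helper names `group_transcript`, `format_grouped`, and `format_time` are hypothetical.

```python
from typing import Dict, List


def format_time(seconds: float) -> str:
    """Render a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def group_transcript(entries: List[Dict], chunk_seconds: int = 120) -> List[Dict]:
    """Group transcript entries into fixed windows keyed by their start time.

    Assumes each entry looks like {"start": float, "end": float, "text": str}.
    Windows that contain no text are omitted from the result.
    """
    buckets: Dict[int, List[str]] = {}
    for entry in entries:
        text = entry["text"].strip()
        if not text:
            continue  # skip empty entries
        index = int(entry["start"] // chunk_seconds)
        buckets.setdefault(index, []).append(text)
    return [
        {
            "start": index * chunk_seconds,
            "end": (index + 1) * chunk_seconds,
            "text": " ".join(words),
        }
        for index, words in sorted(buckets.items())
    ]


def format_grouped(groups: List[Dict]) -> str:
    """Format grouped chunks as '[MM:SS - MM:SS] text', one chunk per line."""
    return "\n".join(
        f"[{format_time(g['start'])} - {format_time(g['end'])}] {g['text']}"
        for g in groups
    )
```

The formatted string is what would be sent as TextContent in timestamp mode, while the ungrouped entries could still be returned as-is for point 4.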

sarfarazsiddiquii commented 6 days ago

Hey @ashish-spext, I made a pull request regarding this issue. Let me know if any changes are required. Also, I would love to contribute to other open issues as well.