identify persons in audio

NicodemPL commented 1 month ago

I tried pyannote-audio myself and it works pretty well. It runs fully locally, but it is Python-based: https://github.com/pyannote/pyannote-audio

Highly recommended. I tried several systems (including paid ones), and this one is really efficient and delivers good quality.

On top of this, once you have separate audio streams (microphone and display), this is even more promising for better-quality meeting notes. I've developed a Python tool that combines Whisper transcription and pyannote diarization to create a comprehensive meeting transcript. It transcribes the audio, identifies the speakers, and merges the results, laying the groundwork for an AI-assisted prompt that generates good notes. I still have some issues on my side, but it's basically working and 100% local. So this is definitely doable. And it beats Rewind.ai / Limitless for sure :) Locally.
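To illustrate the merging step described above, here is a minimal sketch of one common way to combine the two outputs: label each transcript segment with the speaker whose diarization turns overlap it the most. This is not the actual tool mentioned in the comment. `Segment`, `Turn`, and `assign_speakers` are hypothetical names, and the inputs are plain timestamps of the kind a transcriber and a diarizer typically produce, so no Whisper or pyannote call is made here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of speech (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Turn:
    """One diarization turn: who spoke and when (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        totals = {}  # speaker -> total overlapping seconds with this segment
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg.text))
    return labeled


if __name__ == "__main__":
    # Made-up example data, only to show the call shape.
    segments = [
        Segment(0.0, 2.0, "Hi all"),
        Segment(2.0, 5.0, "Thanks for joining"),
    ]
    turns = [
        Turn(0.0, 2.2, "SPEAKER_00"),
        Turn(2.2, 6.0, "SPEAKER_01"),
    ]
    for speaker, text in assign_speakers(segments, turns):
        print(f"{speaker}: {text}")
```

Segments that overlap no turn fall back to the `unknown` label rather than being dropped, so the transcript stays complete even where diarization has gaps.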

louis030195 commented 1 month ago

/bounty 100

definition of done:

screenpipe-audio has some code that identifies speakers (I guess after the transcription?)
this is sent to screenpipe-server, which inserts the speakers into the DB
the speakers are then returned in DB queries & the API

rules:

use rust, local
do not use too much compute; screenpipe must still be usable on normal consumer hardware (eventually using the GPU/NPU, if that's possible...)
ideally a separate file in screenpipe-audio
works on all OSes

algora-pbc[bot] commented 1 month ago

💎 $100 bounty • Screenpi.pe

Steps to solve:

Start working: Comment /attempt #306 with your implementation plan
Submit work: Create a pull request including /claim #306 in the PR body to claim the bounty
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!

