BitcoinDesign / Guide

A free, open-source community resource for designers, developers and others working on non-custodial bitcoin products.
https://bitcoin.design/guide/

Evaluate video transcripts #1062

Open GBKS opened 10 months ago

GBKS commented 10 months ago

Andreas proposed creating transcripts of our videos via btctranscripts.com. Video content is currently not searchable, but we have a lot of interesting conversations. Transcripts could unlock this content, and also open it up for localization and use in tools like ChatBTC.

According to YouTube, our most viewed videos are the ones for the Mastering the Lightning Network reading group. I also think that the Learning bitcoin & design calls are really worthwhile content that stands the test of time and is helpful for many people learning this tech. So I proposed to Andreas that we do a trial run with those videos.

The process is that the videos are run through automated transcription software. Then, anyone is invited to review them on btctranscripts.com. We can also add them to the respective pages on the website as supporting material.

GBKS commented 10 months ago

Did a test using Whisper on the recording of our latest jam session. It was easy to set up, but took quite a few hours to run through the video. It generated several files, basically the same content (text and timestamps) in different formats. It captured the language really well. What it does not do is speaker identification, so you can't tell from the transcript who is saying what. Investigating some other solutions for that...

bitcoindesign_2024-01-08T14_05_48.032Z.txt
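To illustrate the "same content in different formats" point, here is a minimal sketch that renders one list of timed segments as plain text and as SRT subtitles. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field, which is roughly the shape Whisper's Python API returns. The function names and the sample data are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_text(segments: list[dict]) -> str:
    """Plain-text rendering: one segment per line, no timestamps."""
    return "\n".join(seg["text"].strip() for seg in segments)


def segments_to_srt(segments: list[dict]) -> str:
    """SRT rendering: numbered blocks with a start --> end time range."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Hypothetical sample data, not taken from the actual recording.
segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello everyone."},
    {"start": 2.5, "end": 5.0, "text": " Welcome to the call."},
]
print(segments_to_text(segments))
print(segments_to_srt(segments))
```

Neither rendering carries speaker information, which matches the limitation described above.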

kouloumos commented 9 months ago

We've built a tool for this job! You can find it at https://github.com/bitcointranscripts/tstbtc. That's the tool we are using to generate the AI transcripts for https://review.btctranscripts.com.

tstbtc supports whisper, but whisper is not good with diarization. At some point we plan to integrate whisper-diarization, but for now we are using deepgram for transcribing content.
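For context on what diarization adds: a common generic approach is to run a separate diarization step that outputs speaker turns with time ranges, then label each transcript segment with the speaker whose turn overlaps it the most. The sketch below shows only that merging step. It is not how tstbtc, whisper-diarization or Deepgram implement it, and the data shapes and names are assumptions for illustration.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time ranges (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: list[dict], turns: list[dict], unknown: str = "Unknown") -> list[dict]:
    """Label each transcript segment with the speaker turn it overlaps most.

    segments: dicts with "start", "end", "text" (assumed shape)
    turns:    dicts with "start", "end", "speaker" (assumed shape)
    """
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        labelled.append({**seg, "speaker": best_speaker})
    return labelled


# Hypothetical sample data.
segments = [
    {"start": 0.0, "end": 4.0, "text": "Welcome to the reading group."},
    {"start": 4.0, "end": 9.0, "text": "Thanks, happy to be here."},
    {"start": 20.0, "end": 22.0, "text": "(no diarization turn covers this)"},
]
turns = [
    {"start": 0.0, "end": 5.0, "speaker": "Speaker 0"},
    {"start": 5.0, "end": 10.0, "speaker": "Speaker 1"},
]
for seg in assign_speakers(segments, turns):
    print(f"{seg['speaker']}: {seg['text']}")
```

Segments that no turn covers stay labelled "Unknown", so a reviewer can spot and fix them by hand.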

GBKS commented 9 months ago

Nice! I did give whisper-diarization a try, but could not get it to work (messy dependency issues).

The cost for these paid services seems really low. For me personally, it might be more efficient to just pay rather than investing lots of time into getting a custom setup going. But looks like you are building a complete pipeline there, which is really cool.

mouxdesign commented 9 months ago

Adding in a transcript that OtterAI produced for one of the UX research calls. It still has some mistakes with the terminology, but I can edit those. I have a yearly subscription, as it is handy for recording user interviews etc.

UX Research Call #34_ Etta Wallet

I am happy to do the UX research calls and proofread the transcripts.

kouloumos commented 9 months ago

Hey @mouxdesign, I'm currently preparing some of the Bitcoin Design calls to be added to the queue. That means they will show up on review.btctranscripts.com, where users can review and edit them and then submit them for evaluation. I should have something up within the next week, if not today. I'm making some improvements to the postprocessing of the AI-generated transcripts, which is the reason for the slight delay.

Currently, I'm the primary person managing transcript additions to the queue, utilizing tstbtc for transcription and then pushing them to the bitcointranscripts repo. Our goal is to decentralize this process, making it less reliant on any single individual.

So I'm keen to explore how we can establish a pipeline to facilitate the integration of OtterAI recordings into our review queue. Although OtterAI lacks an API, necessitating manual export, I could develop a script that converts these exports into the markdown format supported by bitcointranscripts. This could be a promising first step towards automation. I'll delve into Otter's export options and propose a viable solution soon.
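As a rough idea of what such a conversion script could look like, here is a minimal sketch. Both formats in it are assumptions, not confirmed specifications: the input is assumed to be a plain-text export where each turn starts with a `Speaker Name  M:SS` header line, and the output is assumed to be markdown with a small YAML front matter followed by `Speaker: HH:MM:SS` turns. The actual Otter export options and the exact fields bitcointranscripts expects would need to be checked first.

```python
import re

# Assumed header line of a text export: "Speaker Name  1:23" or "Speaker Name  1:02:03".
# Limitation: a spoken line that happens to end in a time ("see you at 10:30")
# would be misread as a header.
HEADER = re.compile(r"^(?P<speaker>.+?)\s+(?P<time>\d{1,2}:\d{2}(?::\d{2})?)$")


def normalise_time(stamp: str) -> str:
    """Pad M:SS or H:MM:SS to HH:MM:SS."""
    parts = [int(p) for p in stamp.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def parse_export(text: str) -> list[dict]:
    """Split the export into turns with speaker, time and joined text."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            turns.append({"speaker": match["speaker"],
                          "time": normalise_time(match["time"]),
                          "text": ""})
        elif turns:
            # Continuation line: append to the current speaker's turn.
            turns[-1]["text"] = (turns[-1]["text"] + " " + line).strip()
    return turns


def to_markdown(title: str, turns: list[dict]) -> str:
    """Render turns as markdown with an assumed front matter layout.

    The title is not escaped, so it should not contain double quotes.
    """
    speakers = list(dict.fromkeys(t["speaker"] for t in turns))  # unique, in order
    front = ["---", f'title: "{title}"', "speakers:"]
    front += [f"  - {name}" for name in speakers]
    front.append("---")
    body = [f"{t['speaker']}: {t['time']}\n\n{t['text']}" for t in turns]
    return "\n".join(front) + "\n\n" + "\n\n".join(body) + "\n"


# Hypothetical export content.
export = (
    "Alice  0:05\nWelcome to the call.\nToday we look at a wallet.\n\n"
    "Bob  1:02:03\nThanks for having me.\n"
)
print(to_markdown("Example call", parse_export(export)))
```

Keeping parsing and rendering separate means only `parse_export` would need to change if the export format turns out to be different.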

FYI: In Otter, you can click on the speaker names and assign names to them. Otter will replace the names on all the segments. On subsequent transcripts that have the same speakers, it will replace the names automatically.

Also, I really like how this transcript is shown on the page that you shared. Eventually we want to achieve a similar user experience both for reviewers on review.btctranscripts.com and for readers on btctranscripts.com.