Conversational transcription with Strapi, Whisper and GPT

gitChimp88 commented 1 month ago

What is your article idea?

Description -

A three-part series on how to build a transcription app that listens in on video calls via speaker output, transcribes the audio, and then provides context and answers to questions. I would like to start from the beginning and go through to deployment.

Part 1: Introduction and setup -

Introduction to the Project: Introduce the project and its objectives. Explain the importance of transcription apps in enhancing communication.

Technology Overview: Discuss the technologies I'll use, such as Whisper, ChatGPT, and Next.js for the frontend, and how Strapi will integrate with this project.

Set up project: Start setting up the project with a tutorial on how to get started with Strapi and Next.js.

Part 2: Implementation and development -

Capturing audio: Tutorial walkthrough on how to capture audio using the relevant APIs or libraries.

Transcribing: Using Whisper to transcribe captured audio.

Building the Frontend: Walk readers through building the frontend using Next.js. Design the UI for displaying the transcribed text and any additional features.

Save captured audio history: Use the Strapi API to save transcribed audio so users can look over their history.

Part 3: Advanced Features and Deployment -

Adding Context and Answering Questions: Enhance the transcription app by adding features like context analysis and question-answering capabilities. Discuss the implementation of these features.

Testing and Debugging: Cover testing methodologies and best practices for ensuring the reliability and accuracy of the app. Address common debugging issues.

Deployment: Guide readers through deploying the app to a hosting platform, such as Heroku for Strapi and Netlify for the frontend.

Conclusion and Next Steps: Summarise the series and its key learnings. Provide resources for further learning and invite readers to explore additional features or improvements for the app.

What are the objectives of your article?

To show users how they can integrate cutting-edge technologies such as Whisper and GPT with Strapi to build useful applications. Also, to show a project from start to finish, including some testing and debugging, through to deployment, showing how to deploy Strapi to Heroku and a frontend to Netlify.

What is your expertise as a developer or writer?

Intermediate

What type of post is this?

Tutorial

Terms & Conditions

[X] I have read the Write for the Community program guidelines.

gitChimp88 commented 1 month ago

@Theodore-Kelechukwu-Onyejiaku, Okay, it can be done as a three-part series. The articles will just be a little longer and contain multiple topics. Let me know what you think of the outline.

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Thank you @gitChimp88 ,

Please proceed!

gitChimp88 commented 1 month ago

@Theodore-Kelechukwu-Onyejiaku Part 1 is ready - https://hackmd.io/103bOOVRRtKS4rIF17dTaw?view

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi @gitChimp88 ,

Thanks for your contribution. This is too short for a part series.

gitChimp88 commented 1 month ago

Hey @Theodore-Kelechukwu-Onyejiaku, let me know roughly how many words each part of the series should be and I'll change the outline to accommodate that.

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi Mike,

You can check out some of our blog series to see how they are. These ones might help:

https://strapi.io/blog/how-to-build-a-notion-clone-with-strapi-v4-and-next-js-part-1-of-2

https://strapi.io/blog/epic-next-js-14-tutorial-part-8-search-and-pagination-in-next-js

https://strapi.io/blog/building-an-ecommerce-website-with-jekyll-strapi-snipcart-and-tailwind-css-2-of-5-1

gitChimp88 commented 1 month ago

Hey @Theodore-Kelechukwu-Onyejiaku Okay I included some more set up with Strapi to lengthen the first part, let me know if that's okay.

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi @gitChimp88 ,

Thank you! The new content is great! But wouldn't it be great if the reader had something they could start to play with? Also, I noticed you are deploying to Heroku. Is there a reason for that? Have you tried Strapi Cloud?

gitChimp88 commented 1 month ago

@Theodore-Kelechukwu-Onyejiaku I see what you mean. Maybe I can change the structure and start with the frontend implementation, transcribing audio with Whisper? I can also use Strapi Cloud for the deployment, which probably makes more sense.

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi @gitChimp88 ,

Excellent! Thank you!

gitChimp88 commented 1 month ago

@Theodore-Kelechukwu-Onyejiaku Okay the first part is ready to be reviewed!

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Thank you @gitChimp88! If I need anything from you I will let you know.

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi @gitChimp88 ,

Could you please provide a GIF for Part 1? Thank you!

This is also the goal of Part 1: "Walk readers through building the frontend using Next.js. Design the UI for displaying the transcribed text and any additional features."

Since this part was supposed to cover the UI of the app, is there no dashboard? Are there no other pages or fancy UI components? Please let me know.

I would suggest creating the whole UI or intro to the interfaces users can begin with along with the GIF or a demo showing the initial transcribed audio.

Please let me know if you are willing to implement this. And please, if you have any suggestions let me know.

Thanks for your contribution!

gitChimp88 commented 1 month ago

Hey @Theodore-Kelechukwu-Onyejiaku, Okay, I can implement the dashboard and the transcribed text UI along with a GIF. What software do you use to create GIFs?

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Hi @gitChimp88 , thank you!

Currently, there is no specified one. But personally, I use https://ezgif.com/.

gitChimp88 commented 1 month ago

Hi @Theodore-Kelechukwu-Onyejiaku, Okay, I think I have the first part finished now. HackMD wouldn't let me upload a GIF, but here it is. Let me know if that's okay - CPT2406051658-1209x575

Theodore-Kelechukwu-Onyejiaku commented 1 month ago

Thank you @gitChimp88 , you are doing great!

Meanwhile, I would like to let you know that before we start publishing this series, we would like the whole series to be complete.

Please let me know if you have any questions. Thank you for your contributions once again!

gitChimp88 commented 4 weeks ago

@Theodore-Kelechukwu-Onyejiaku No worries I will finish it and get in touch when it's ready to be reviewed.

gitChimp88 commented 3 weeks ago

@Theodore-Kelechukwu-Onyejiaku Here's part two of the tutorial - https://hackmd.io/-4qPQ5nwQwOgve2P9O4nHA?both. I just want to check I'm on the right track. I plan for part three to include the connection to ChatGPT with analysis etc., some refactoring (I thought it might be cool to show the thought process on this), a little testing, and then deployment to Strapi Cloud. What do you think?

Theodore-Kelechukwu-Onyejiaku commented 3 weeks ago

Hi @gitChimp88, thanks a lot!

I just skimmed Part 2. I noticed you didn't include images to help your audience follow along with the tutorial.

What is the difference between Part 1 and Part 3? This was the outline for Part 3: "Enhance the transcription app by adding features like context analysis and question-answering capabilities. Discuss the implementation of these features."

gitChimp88 commented 2 weeks ago

@Theodore-Kelechukwu-Onyejiaku, I will add images to Part 2 to help users follow along. Part 1 includes the transcription and the UI, but the analysis and question-answering capabilities are hard-coded to show the UI: the backend connection to ChatGPT hasn't been established, and it isn't connected to the frontend. Part 3 will also show some refactoring, testing, and deployment of the backend and frontend. Please confirm if this is okay.

Theodore-Kelechukwu-Onyejiaku commented 1 week ago

Hi @gitChimp88 .

That is ok. Thanks!

I just recalled that it would be great if you showed the audience a demo of what they will be building. Perhaps an embedded video or GIF would do.

gitChimp88 commented 1 week ago

Hey @Theodore-Kelechukwu-Onyejiaku!

Here's the third and final part - https://hackmd.io/AYSEgKu7QVOf2YOGOVtz2g

I have added Google Drive links to GIFs in the articles (the files were too big to upload to HackMD).

Let me know what you think!
