Saik0s / Whisperboard

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.
GNU General Public License v3.0
751 stars 78 forks

Suggestions #5

Open ahhyeah opened 1 year ago

ahhyeah commented 1 year ago

Great app, I would pay for this! A couple of ideas:
1. Allow importing existing voice recordings from the native iPhone app.
2. Ability to see the transcription, and to copy and paste from it, while it's still recording.
3. Allow saving and importing files from both the native Files app and Dropbox.

Thanks for putting this together!

Saik0s commented 1 year ago

Hi @ahhyeah, thanks for the feedback!

New version is in progress and it will include at least importing and exporting audio. The realtime feature is still an open question, will do my best.

I am also working on a new UI. I am trying to decide what information/buttons to put in each card on the "main screen", and what to put into the new "recording details screen". Do you think the "main screen" should show the transcribed text, or should the text be in the "recording details screen"? Or should the "main screen" show only several lines and the details screen show the whole text?

ahhyeah commented 1 year ago

I like the idea of just showing a few lines of text as a preview of a recording. That would be very helpful. I think copy and share are nice to have on the main screen. And ability to edit title. I assume the refresh icon runs the transcribe again? Based on current model selected? That could probably be in the detail screen. I like how you have a lot of it now. It's simple, clean.

Another thought is to add a tiny description to each model. I'm pretty techie and follow AI news, and I don't know the difference between all of the models. I just chose the larger one assuming it's more accurate. Thoughts?

If you're looking for more ideas, I will keep sharing. Not sure if your long-term goal is to charge or not, but some features could be an "in-app purchase" for a pro version. So, another feature could be that when you play back audio, the corresponding word would be highlighted, or stand out somehow (color change?) as it's being played back. Sorry, I'm not a coder, so that seems really difficult. Just throwing out ideas!
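For readers wondering how playback highlighting might work: if the transcriber can supply word-level timings, finding the word to highlight is a sorted lookup. The sketch below is only an illustration in Python. The timing data and the name `word_index_at` are hypothetical, and nothing here reflects how Whisperboard is actually built.

```python
from bisect import bisect_right

# Hypothetical word-level timings: (start_seconds, end_seconds, word).
# Assumes the list is sorted by start time and words do not overlap.
WORDS = [
    (0.00, 0.40, "Great"),
    (0.40, 0.75, "app"),
    (1.10, 1.30, "I"),
    (1.30, 1.60, "would"),
    (1.60, 1.90, "pay"),
]


def word_index_at(words, position):
    """Return the index of the word spoken at `position` seconds, or None."""
    starts = [start for start, _, _ in words]
    # Last word whose start time is <= position.
    i = bisect_right(starts, position) - 1
    if i < 0:
        return None  # playback has not reached the first word yet
    _, end, _ = words[i]
    # Inside a pause between words (or past the last word) nothing is highlighted.
    return i if position < end else None
```

A player would call `word_index_at` on each playback tick and restyle the returned word; in a real app the start times would be computed once rather than on every call.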

Another: the ability to record when the app is closed, and then interact with the Dynamic Island like the Voice Memos app does. Another: using a ChatGPT-type AI to summarize the voice recording and create a title, along with using the current location.

Saik0s commented 1 year ago

Hey, thanks for the feedback and suggestions! I will definitely think about allowing title edits on the main screen. And yes, the refresh icon re-runs the transcription with the current model.

Adding descriptions for Whisper models is a good idea, and I appreciate the feature suggestions you mentioned. Some might be challenging to implement, but I will consider them for future updates.

Feel free to keep sharing ideas, they are super helpful. Thanks again for your support!

Mario03482 commented 1 year ago

You could add a widget for the Lock Screen to easily access the app

Saik0s commented 1 year ago

A new version is now available on the App Store which has a lot of improvements including importing and sharing audio files.

mhauken commented 1 year ago

@Saik0s Just tried it! Works great, but I miss the option to select the app from the Share sheet. From Voice Memos you now have to save the recording to Files and then open the file in WhisperBoard.

Mario03482 commented 1 year ago

@mhauken maybe you have to add the app Whisperboard to the share sheet.

Did you try whisperboard with a long audio? I would like to know how it behaves with long audios, cuz I can't test it at the moment.

mhauken commented 1 year ago

No. I can't find it in the share sheet (or when you tap edit there).

It seems to work perfectly for long audio as well. I tried adding a 40min audio and it seems to work flawlessly.🙌

Mario03482 commented 1 year ago

> No. I can't find it in the share sheet (or when you tap edit there).
>
> It seems to work perfectly for long audio as well. I tried adding a 40min audio and it seems to work flawlessly.🙌

Did you test with the large model? How much time did it take?

ahhyeah commented 1 year ago

I tried an 8-minute file by exporting from the Voice Memos app to Files and then from Files to Whisperboard, and it got stuck on "transcribing". I have the larger language model. I let it sit for 20-30 minutes and there was no progress. I can try again.

Update: I opened the app back up and the transcription was there... Weird 🤷🏻‍♂️

Saik0s commented 1 year ago

There is indeed an issue with properly displaying the current state of transcription. Going to fix it asap.

flapee commented 1 year ago

I utilize GPT-4 to refine my transcripts, which significantly enhances them. Additionally, I employ GPT for translating the original content into English (trust me, you don't want to use Whisper for English translations).

Ideally, one would receive real-time transcription, correction, and translation. As a side note, when working with non-English languages, it's essential to use the medium model at a minimum.

Werner602 commented 1 year ago

Hi, I have tried two different models and it does not seem to work. I don't have the ability to record. I have looked in my iPad's Settings app and the Whisperboard app does not appear there. Running iOS 16.2...

I have used Whisper in Python on my Mac previously, and it is really great.

Mario03482 commented 1 year ago

> I utilize GPT-4 to refine my transcripts, which significantly enhances them.
>
> Additionally, I employ GPT for translating the original content into English (trust me, you don't want to use Whisper for English translations).
>
> Ideally, one would receive real-time transcription, correction, and translation.
>
> As a side note, when working with non-English languages, it's essential to use the medium model at a minimum.

Do you use a specific prompt on ChatGPT?

flapee commented 1 year ago

I will give you a whisper transcript of a recording.
There might be some typos, mistakes in transcription.
Fix the transcription errors,
and then translate to english:

Working on highlighting the changes made by GPT, but ATM some are omitted.
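One possible way to surface every change made in the correction step is a word-level diff between the raw transcript and the corrected text. The sketch below uses Python's standard `difflib`; the function name `highlight_changes` and the `[-…-]` / `{+…+}` markers are assumptions chosen for illustration, not the approach flapee describes. It only makes sense for same-language correction, not for comparing the original against a translation.

```python
import difflib


def highlight_changes(original, corrected):
    """Mark removed words as [-word-] and added words as {+word+}."""
    a, b = original.split(), corrected.split()
    # autojunk=False so frequent words are never skipped when matching.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i2 > i1:  # words present only in the raw transcript
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j2 > j1:  # words present only in the corrected text
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


print(highlight_changes("the quick brown fox", "the quick red fox"))
# prints: the quick [-brown-] {+red+} fox
```

Because the diff is computed locally from the two texts, no change is left out, whether or not the model reports its own edits.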

Saik0s commented 1 year ago

Hi @Werner602, I released a new update, can you try it to see if the problem is gone?

flapee commented 1 year ago

What about an SRT/subtitle output format, with at least rough timestamps at the chunk level?
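As a rough illustration of what chunk-level SRT output involves: each subtitle cue is a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text, with a blank line between cues. The Python sketch below assumes chunks arrive as `(start_seconds, end_seconds, text)` tuples; that shape and the function names are hypothetical and do not describe Whisperboard's internals.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(chunks):
    """Render (start_seconds, end_seconds, text) chunks as SRT text."""
    blocks = []
    for number, (start, end, text) in enumerate(chunks, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{time_range}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 2.5, " Hello "), (2.5, 5.0, "world")]))
```

Chunk-level timing like this is coarse, but it is enough for a subtitle file that most players will load.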