DevEmperor / Dictate

A powerful Whisper AI keyboard for reliable speech transcription
https://play.google.com/store/apps/details?id=net.devemperor.dictate
Apache License 2.0
12 stars 4 forks source link

Rewrite the dictation into properly written sentences. #2

Closed cryptoniooo closed 1 month ago

cryptoniooo commented 2 months ago

This is a great tool but it comes with a problem. You can dictate but if you use this with many other people there should be an option to plug it to OpenAI and have OpenAI revise the text and either put it as conversational, sarcastic, funny, summarize it, shorten it because it's not the same speaking as when we write and with that the tool can be used for day-to-day texting as other people would see and it would not make sense to text them exactly like we speak. I think this could make the tool the ultimate dictation tool in the market if this is fixed.

So basically we just need an option to turn on or turn off if we want to use this filter. It could be the text to write filter and then we turn it on and we decide what filter do we want. Do we want it to be summarized, make it more clear, rewrite, many other options that we can tell OpenAI what to do with what we just dictated and then immediately just paste it after we dictate without me having to right now dictate and then paste it to chatGBD and then paste it back to the chat with people as it does not make sense to dictate and just paste it. Like for example this whole text I've dictated it and certain parts do not make sense but the problem is you don't always have the time to go to chatGBD and edit so I think making it frictionless it could be really a revolutionary tool.

DevEmperor commented 2 months ago

Thank you very much for your feedback and this great idea. In fact, it has been on my to-do list for some time now to be able to edit the recognised text with various prompts after recording. That would be pretty much in line with your idea. My idea was that you could define different prompts yourself in the settings, where you could then translate, summarise or rephrase in a different style directly after recording, for example. Then you could also define a prompt that instructs ChatGPT to put the spoken text into a meaningful grammatical context. I will definitely be adding this feature in the near future. Stay tuned. :)

cryptoniooo commented 2 months ago

Hey, I was wondering, is there any way we could speed up this feature? Maybe we can align incetives here? This is actually super useful...

Sent from Proton Mail Android

-------- Original Message -------- On 7/11/24 1:10 PM, Jannis Zahn wrote:

Thank you very much for your feedback and this great idea. In fact, it has been on my to-do list for some time now to be able to edit the recognised text with various prompts after recording. That would be pretty much in line with your idea. My idea was that you could define different prompts yourself in the settings, where you could then translate, summarise or rephrase in a different style directly after recording, for example. Then you could also define a prompt that instructs ChatGPT to put the spoken text into a meaningful grammatical context. I will definitely be adding this feature in the near future. Stay tuned. :)

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

DevEmperor commented 2 months ago

Hey, I also noticed how extremely amazing this feature would be, and trust me, I am currently putting all of my free time into this project, and specifically into this feature. The next update will be massive. I will keep you updated. :)

cryptoniooo commented 2 months ago

I just saw you sent an update and I was wondering what are the instructions for the prompt like I told it to make it sound professional but I don't I don't think he's working I'm using just the open AI APY do I need something else

Sent from Proton Mail Android

-------- Original Message -------- On 7/14/24 9:38 AM, Jannis Zahn wrote:

Hey, I also noticed how extremely amazing this feature would be, and trust me, I am currently putting all of my free time into this project, and specifically into this feature. The next update will be massive. I will keep you updated. :)

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

DevEmperor commented 2 months ago

Hey, sorry for the misunderstanding at this point. The update to version 1.5 does not yet include the rewording feature you suggested. The style promp in the settings allows you to guide the model in a certain direction. For example, you can enter a random text where everything is lowercase and without punctuation, and Whisper will no longer add punctuation or capital letters in all future recordings. Here you can read more about this feature.

As mentioned, I am working very hard on the feature you suggested, which will be much bigger than what was included in the update to version 1.5. I am about 40% through the work, but I am making very good progress. Until then, I can only ask for your patience. I will notify you in this thread as soon as I am finished, even before the update is released on Google Play. :)

cryptoniooo commented 2 months ago

Okay, sounds good. I got confused for a second. No worries. Also, if you want to put on the roadmap or might be interesting for you to think about it now, and I don't even know if this is possible, but this is pushing it to the limit in the sense of making it really cool. You know how there is custom GPTs that we upload our information. So for example, I upload all my text messages from the past four years into a custom GPT PDF, train the model on it and then imagine what it's like if I could plug it in here. And then I can just, I speak and then OpenAI just spits out the text as I've read in the past years. And then I can do the same with any books, authors, or any really type of content. And then it styles it on that way that could open so many unlimited possibilities. I could even just speak and it would recite poems. I think it could be really cool, but I feel like that's pushing it because I don't know how well be the integrations for a custom GPT and that's more complex. And I think it's already pretty cool Already working on it.. So yeah, that was just an idea. I hope you're having a good day.

Sent from Proton Mail Android

-------- Original Message -------- On 7/16/24 8:06 AM, Jannis Zahn wrote:

Hey, sorry for the misunderstanding at this point. The update to version 1.5 does not yet include the rewording feature you suggested. The style promp in the settings allows you to guide the model in a certain direction. For example, you can enter a random text where everything is lowercase and without punctuation, and Whisper will no longer add punctuation or capital letters in all future recordings. Here you can read more about this feature.

As mentioned, I am working very hard on the feature you suggested, which will be much bigger than what was included in the update to version 1.5. I am about 40% through the work, but I am making very good progress. Until then, I can only ask for your patience. I will notify you in this thread as soon as I am finished, even before the update is released on Google Play. :)

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

DevEmperor commented 1 month ago

Hey again, as you can see in these three commits, I have finally added this amazing feature. I've tested it a bit with some prompts (also the one you suggested) and this is really changing a lot. The possibilities are endless, and I want to thank you for this great idea. :)

I will release the update now, and within two to three days, you will be able to enjoy this too.

DevEmperor commented 1 month ago

Sadly, your idea with the custom GPTs is not possible, because the OpenAI API does not offer any interface for interacting with these GPT models. We can only use the public GPT-4 or GPT-3.5 Models. :)

cryptoniooo commented 1 month ago

Can you send me the apk directly my email so I can try it?

-------- Original Message -------- On 7/23/24 9:15 AM, Jannis Zahn wrote:

Hey again, as you can see in these three commits, I have finally added this amazing feature. I've tested it a bit with some prompts (also the one you suggested) and this is really changing a lot. The possibilities are endless, and I want to thank you for this great idea. :)

I will release the update now, and within two to three days, you will be able to enjoy this too.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

DevEmperor commented 1 month ago

Can you give me your email address? Then I will gladly send you the APK file.

cryptoniooo commented 1 month ago

@.***

On Jul 23, 2024, at 19:21, Jannis Zahn @.***> wrote:

Can you give me your email address? Then I will gladly send you the APK file.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

DevEmperor commented 1 month ago

Your email address has been obscured by GitHub for security reasons. I am simply sending you a one-time link to download. https://workupload.com/file/aSz36nmA2c3

cryptoniooo commented 1 month ago

Cool, I got it to work, but is it possible to not have to click the button to make it work and just automatically output and go through the prompt?

-------- Original Message -------- On 7/23/24 8:26 PM, Jannis Zahn wrote:

Your email address has been obscured by GitHub for security reasons. I am simply sending you a one-time link to download. https://workupload.com/file/aSz36nmA2c3

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>