C-Loftus / talon-ai-tools

Query LLMs and AI tools with voice commands
MIT License
39 stars 14 forks source link

Ai dictation mode #48

Closed C-Loftus closed 3 days ago

C-Loftus commented 3 months ago
C-Loftus commented 3 months ago

@jaresty Opinions on this? I want to try and make it so that we can pass a lot of context to the model and that we can use Talon for the baseline speech to text and then we can still get the more specific formatting we want on stuff by using the model to fix up things like proper nouns and/or punctuation.

C-Loftus commented 3 months ago

We can also create model select or something similar to select a range in an editable text box by passing all the context to the model having it return the range, so we wouldn't need to highlight it. I think there is a ton of potential with accessibility APIs in general, but unfortunately this does mean some OS or beta/public release talon fragmentation.

jaresty commented 3 months ago

I think this is a great idea. One less step to correct dictation!

4b11b4 commented 2 months ago

This is a rough idea but is there someway to leverage the work from https://github.com/OpenInterpreter/open-interpreter @C-Loftus

C-Loftus commented 2 months ago

This is a rough idea but is there someway to leverage the work from https://github.com/OpenInterpreter/open-interpreter @C-Loftus

Just curious do you have specific features in that repo you are looking for? @4b11b4 I am somewhat familiar with that, but not the specifics. This repo should have many of the same features but for voice. Since Talon packages in general are intended not to use external libraries, I've implemented most stuff from scratch.

For context (either you or anyone viewing this, this PR is sort of blocked at the moment since it relies upon Talon's accessibility bindings which aren't really documented and have dependencies on an underlying Rust library that sometimes doesn't behave as intended. Without being able to use these apis to pass additional surrounding context, real-time AI dictation fixes aren't particularly useful and it is just better to use model fix grammar as it is currently implemented

Let me know if you have other ideas or I am overlooking something you think could help this situation

C-Loftus commented 3 days ago

Closing this since it isn't really practical imo. Better to just use copilot or codeium. And axkit handles simpler context aware punctuation well on its own for macos, which would've been a big use case for this.