lepisma / emacs-speech-input

Set of packages for speech and voice inputs in Emacs
GNU General Public License v3.0

Reintroduce explicit command mode? #6

Open edgar-vincent opened 2 months ago

edgar-vincent commented 2 months ago

Hi!

Thanks for this lovely package.

As you mentioned in your blog post, you removed the explicit command mode in https://github.com/lepisma/emacs-speech-input/commit/d80338abef2bd0e2e92f3c73596b2f51d7b2b16f, which allowed one to separate dictation and edits.

Edit requests don't seem to work at all here, no matter how much I refine esi-dictate-llm-prompt and esi-dictate-fix-examples. The requests are simply added to the transcription. Presumably, this is because I cannot use gpt-4o-mini for very long (or at all?), since I only use OpenAI's free tier.

What do you think about the possibility of reintroducing the explicit command mode? It could be optional, and thus help users that need it, without interfering with other users' workflow.

Thanks again,

EV

lepisma commented 1 month ago

Hello,

  1. I think a bug may be preventing the hook from executing, which would explain the edit issue you are facing. I have seen this happen too and will look into it. One hypothesis is that the speech_final flag from the ASR is not arriving (possibly due to noise?).
  2. You can run the command esi-dictate-fix-context manually at any time to make an LLM call on the current context (highlighted by an underline). But I believe you are asking about helping the LLM by providing the command separately, as in the older explicit command mode. I will see if I can bring that back when I get time.

lepisma commented 1 month ago

Regarding the first point in my previous comment, I think I have found the issue. Deepgram has recently been sending the utterance end event without setting the speech_final flag, so the LLM command never gets triggered. I will get to this.
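
A minimal sketch of the kind of fix this points to, assuming the Deepgram streaming messages arrive as parsed JSON dicts whose `type` is `Results` or `UtteranceEnd`. The `UtteranceTracker` class and its `feed` method are hypothetical names used only for illustration; they are not part of emacs-speech-input, and the way the package actually consumes Deepgram output may differ. The idea is to treat either signal as the end of an utterance, while firing only once per utterance.

```python
class UtteranceTracker:
    """Decide when an utterance has ended (hypothetical helper, not the package's code)."""

    def __init__(self):
        # True while finalized text exists that has not yet been handed to the LLM.
        self.pending = False

    def feed(self, message):
        """Return True if this message should trigger the LLM call."""
        msg_type = message.get("type")

        if msg_type == "Results":
            alternatives = message.get("channel", {}).get("alternatives", [])
            transcript = alternatives[0].get("transcript", "") if alternatives else ""
            # Only finalized, non-empty transcripts count as pending text.
            if message.get("is_final") and transcript.strip():
                self.pending = True
            # Normal path: speech_final marks the end of the utterance.
            if message.get("speech_final") and self.pending:
                self.pending = False
                return True

        elif msg_type == "UtteranceEnd" and self.pending:
            # Fallback path: the utterance ended but speech_final was never set.
            self.pending = False
            return True

        return False
```

With this shape, an `UtteranceEnd` that follows an already handled `speech_final` is ignored, so the LLM command would not run twice for the same utterance.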