Closed yizhexu closed 7 months ago
For the first problem, you can just copy whisper--temp-file to wherever you want once the transcription is done. So the problem boils down to how to run some custom elisp logic after some other elisp function has finished.
Typically library authors provide "hooks" where you can register a function implementing your custom logic, which then gets run at predefined points. Here we provide whisper-pre-process-hook and whisper-post-process-hook, which can be used to register various bits of custom logic (see the readme).
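As a minimal sketch of registering a function on such a hook: the helper below, my-whisper-trim-output, is a hypothetical name made up for illustration. It assumes that functions on whisper-post-process-hook are called with no arguments in a buffer holding the transcribed text, and it simply trims surrounding whitespace.

```elisp
;; Hypothetical helper (not part of whisper.el): trim surrounding whitespace.
;; Assumption: the hook calls this with no arguments while the current
;; buffer holds only the transcribed text.
(defun my-whisper-trim-output ()
  "Delete leading and trailing whitespace in the current buffer."
  (save-excursion
    ;; Remove whitespace at the start of the buffer.
    (goto-char (point-min))
    (skip-chars-forward " \t\n")
    (delete-region (point-min) (point))
    ;; Remove whitespace at the end of the buffer.
    (goto-char (point-max))
    (skip-chars-backward " \t\n")
    (delete-region (point) (point-max))))

;; Register the function; `add-hook' creates the hook variable if it is
;; not defined yet, so this is safe to evaluate before whisper is loaded.
(add-hook 'whisper-post-process-hook #'my-whisper-trim-output)
```

Since the function only touches the current buffer, you can try it by hand in any scratch buffer with M-: (my-whisper-trim-output) before relying on it in the hook.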
(As an aside, if library authors don't provide such hooks, you can generally use Emacs' advice system instead; e.g. :after advice runs an advice function after the advised function returns. Unfortunately this doesn't quite work here, because whisper-run is async and returns immediately, so the advice function would run immediately too, before the transcription is done.)
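To illustrate the advice caveat, here is a small self-contained sketch that does not use whisper at all. my-demo-async-job is a hypothetical stand-in for an async command: it schedules its real work on a timer and returns right away. All my-demo-* names are invented for this example.

```elisp
(defvar my-demo-log nil
  "Events in the order they happened, newest first.")

(defun my-demo-async-job ()
  "Stand-in for an async command: schedule the work and return at once."
  (run-with-timer 0.1 nil (lambda () (push 'work-finished my-demo-log)))
  (push 'function-returned my-demo-log))

(defun my-demo-after-advice (&rest _args)
  "Record the moment the :after advice runs."
  (push 'advice-ran my-demo-log))

(advice-add 'my-demo-async-job :after #'my-demo-after-advice)

(my-demo-async-job)
(sleep-for 0.3)          ; wait long enough for the timer to fire
(reverse my-demo-log)    ; events in chronological order
```

The advice is recorded before the timer's work, which is the same reason :after advice on an async function fires before the transcription has finished. Remove the demo advice afterwards with (advice-remove 'my-demo-async-job #'my-demo-after-advice).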
While you can do the saving in whisper-post-process-hook, the second problem is slightly trickier. Every function in that hook is run with the current buffer set to the temporary buffer that contains only the transcribed output (because this hook is meant for post-processing text only). However, you can recover the original point location from the internal variable whisper--marker, which is a marker object.
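Markers remember both a buffer and a position, which is what makes this recovery possible. The sketch below shows the general pattern with plain temporary buffers; the my-demo-* functions are hypothetical, and the local marker only plays the role that whisper--marker plays in the real setup.

```elisp
(defun my-demo-insert-at-marker (marker text)
  "Insert TEXT at MARKER in the marker's own buffer, whatever buffer is current."
  (with-current-buffer (marker-buffer marker)
    (save-excursion
      (goto-char marker)
      (insert text))))

(defun my-demo-run ()
  "Return the contents of an 'original' buffer after inserting from another one."
  (with-temp-buffer                      ; stands in for the buffer you were editing
    (insert "before after")
    (let ((marker (copy-marker 8)))      ; position just before \"after\"
      (with-temp-buffer                  ; stands in for the transcription buffer
        (insert "transcribed ")
        (my-demo-insert-at-marker marker (buffer-string)))
      (set-marker marker nil)            ; detach the marker once done
      (buffer-string))))

(my-demo-run)
```

Here (my-demo-run) returns "before transcribed after": the text is inserted at the marker's position in the first buffer even though a different buffer was current at the time.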
Some code is probably easier understood than words. I think you want something like this in your config:
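(defvar my-save-whisper-audio t
  "When non-nil, keep a copy of each whisper recording.")

(defun my-save-whisper-audio-clip ()
  "Archive the recording and link it from the Org entry at the original point."
  (when my-save-whisper-audio
    (let* ((archive-name (format-time-string "%Y%m%d%H%M%S.wav"))
           (archive-file (file-name-concat org-directory "recording" archive-name)))
      ;; Create the archive directory if needed, then copy the recording.
      (make-directory (file-name-directory archive-file) t)
      (copy-file whisper--temp-file archive-file)
      ;; The hook runs in the temporary transcription buffer, so switch
      ;; back to the buffer where the recording was started.
      (with-current-buffer (marker-buffer whisper--marker)
        (goto-char whisper--marker)
        (when (derived-mode-p 'org-mode)
          (org-set-property "source" archive-file))))))

;; Depth 100 places this after the other functions in the hook.
(add-hook 'whisper-post-process-hook #'my-save-whisper-audio-clip 100)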
This worked great for me! Thank you for the explanations you provided.
Hi, I am interested in keeping the recorded files "/tmp/emacs-whisper.wav" instead of overwriting them. I would like to have the option of using them later for some more model training, e.g. to improve recognition by learning my accent.
In addition, I also want to add an Org property to the transcribed string pointing to the audio file it was generated from.
Can you give me some pointers on how to do this? I started by setting an alternative value to the "whisper--temp-file" like this
But when I try to record it just says "error in process sentinel: FFmpeg failed to record audio".