Open 7596ff opened 1 year ago
Great, can you share the docker build?
Is there an API available in the docker container that allows you to interact with whisper from outside the container?
I just ran the commands in the readme: https://github.com/ahmetoner/whisper-asr-webservice
The API is pretty simple, you upload the file and tell it which format to return.
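For reference, here is a minimal sketch of what that upload could look like from a client, using only the Python standard library. It assumes the webservice is listening on `http://localhost:9000`, exposes a `POST /asr` endpoint, takes the file in a multipart field named `audio_file`, and picks the return format from an `output` query parameter. Those details are my reading of the linked README and may differ between versions, so check the service's own docs before relying on them. The helper names are made up for illustration.

```python
import urllib.parse
import urllib.request
import uuid
from pathlib import Path

# Assumption: the webservice from the linked README is listening here.
BASE_URL = "http://localhost:9000"


def build_asr_url(base_url: str, output: str = "json", task: str = "transcribe") -> str:
    """Build the /asr URL; `output` selects the response format (assumed parameter name)."""
    query = urllib.parse.urlencode({"task": task, "output": output})
    return f"{base_url.rstrip('/')}/asr?{query}"


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str, base_url: str = BASE_URL, output: str = "json") -> str:
    """Upload the audio file and return the raw response body as text."""
    path = Path(audio_path)
    # Assumption: the service expects the upload in a field called "audio_file".
    body, content_type = encode_multipart("audio_file", path.name, path.read_bytes())
    request = urllib.request.Request(
        build_asr_url(base_url, output),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the container running, `transcribe("memo.m4a", output="srt")` would return the subtitle text, assuming the service supports that format.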
Ah I was hoping there was an existing docker image you used to run whisper. In order to run the commands within the plugin, it's necessary to create a REST API within the docker container to interact with Whisper, like running the model and getting the result. A downloadable docker image will need to be created to make it distributable. That's a good chunk of work and it'll be a while until I get to it. If you'd like to contribute that would be great too.
If I'm reading you correctly, I don't think it's a good idea to automatically spin up a Docker image from within Obsidian. I think it would be best to require users to spin up what I linked themselves, which is quite easy with Docker Desktop. https://github.com/djmango/obsidian-transcription does this, but it's flaky in my testing. I don't know anything about the internals of this plugin yet, but it seems like a moderate amount of work to run a command on an audio file that saves the JSON result next to it in-tree.
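To illustrate the "save the JSON result next to it" part, here is a rough sketch in Python. The function names are hypothetical and have nothing to do with the plugin's actual code. It only shows the idea of deriving a sidecar path from the audio file's path and writing the transcript there.

```python
import json
from pathlib import Path


def sidecar_path(audio_path: str) -> Path:
    """Return the JSON path that sits next to the audio file (same name, .json suffix)."""
    return Path(audio_path).with_suffix(".json")


def save_transcript(audio_path: str, transcript: dict) -> Path:
    """Write the transcript as JSON beside the audio file and return the new path."""
    target = sidecar_path(audio_path)
    target.write_text(
        json.dumps(transcript, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return target
```

One caveat with this naming scheme: two recordings that differ only by extension (say `memo.m4a` and `memo.mp3`) would map to the same `memo.json`, so a real implementation might want to keep the original extension in the sidecar name.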
The plugin would never spin up a docker container automatically.
The overlap between users who can spin up and use a docker container and users who do not know how to install Whisper on their own machine is likely small. So the use case for a docker container would be to support people who a) cannot install Python and Whisper, b) can install docker and one image, and c) cannot interact with docker directly. The workflow I would want to implement is therefore accessing docker via an API, which is necessary for the plugin anyway.
Writing an API will probably take 4 hours, and as you have seen from the other image it can be finicky. This is the majority of the work - create a container, preinstall Whisper, create a REST API, and publish the container. Hooking it up in the plugin will be straightforward after that. I likely won't get to this for a while.
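As a very rough sketch of the "create a REST API" step, something like the following could run inside the container. This is not a design decision, only an illustration under assumptions: the `/transcribe` route, port 8000, and all function names are invented here, the request body is taken to be the raw audio bytes, and the `openai-whisper` package (plus ffmpeg) is assumed to be preinstalled in the image.

```python
import functools
import json
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


@functools.lru_cache(maxsize=1)
def load_model(name: str = "base"):
    """Load the Whisper model once and reuse it across requests."""
    import whisper  # assumption: openai-whisper is preinstalled in the image

    return whisper.load_model(name)


def whisper_transcribe(audio_path: str) -> str:
    """Run Whisper on a file and return the transcribed text."""
    return load_model().transcribe(audio_path)["text"]


def handle_upload(audio: bytes, transcribe: Callable[[str], str]) -> bytes:
    """Write the uploaded bytes to a temp file, transcribe it, return a JSON body."""
    with tempfile.NamedTemporaryFile(suffix=".audio") as tmp:
        tmp.write(audio)
        tmp.flush()
        text = transcribe(tmp.name)
    return json.dumps({"text": text}).encode("utf-8")


def make_handler(transcribe: Callable[[str], str]):
    """Build a request handler class bound to the given transcribe function."""

    class TranscribeHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Hypothetical route: POST /transcribe with raw audio bytes as the body.
            if self.path != "/transcribe":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            payload = handle_upload(self.rfile.read(length), transcribe)
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return TranscribeHandler


def serve(port: int = 8000) -> None:
    """Listen on all interfaces so the port can be published from the container."""
    HTTPServer(("0.0.0.0", port), make_handler(whisper_transcribe)).serve_forever()
```

Passing the transcribe function in as a parameter keeps the HTTP part testable without Whisper installed. A real service would also need error handling, upload size limits, and probably a proper framework.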
Thanks for that clarification. I'm one of those people who can't install Python and Whisper, because I cannot figure out Python's dependency management and environments and so on and so forth. I guess I also don't know the difference between a docker container and a docker image.
Thanks for considering this issue.
I run whisper in docker and I would like to automatically generate transcripts through the plugin myself instead of having to do so manually.