Support caching/persistence of whisper models

owencking commented 2 months ago

Because

As of v10, the CLI version of the app re-downloads the Whisper model on every run. For example, when running with the "tiny" model, this is the content of stderr:

stderr:

  0%|                                              | 0.00/72.1M [00:00<?, ?iB/s]
  8%|██▉                                  | 5.77M/72.1M [00:00<00:01, 60.2MiB/s]
 16%|█████▉                               | 11.5M/72.1M [00:00<00:01, 55.9MiB/s]
 23%|████████▋                            | 16.9M/72.1M [00:00<00:01, 55.2MiB/s]
 31%|███████████▎                         | 22.1M/72.1M [00:00<00:00, 55.1MiB/s]
 38%|██████████████                       | 27.4M/72.1M [00:00<00:00, 53.0MiB/s]
 45%|████████████████▋                    | 32.5M/72.1M [00:00<00:00, 48.1MiB/s]
 52%|███████████████████                  | 37.1M/72.1M [00:00<00:00, 46.6MiB/s]
 58%|█████████████████████▎               | 41.6M/72.1M [00:00<00:00, 45.9MiB/s]
 64%|███████████████████████▋             | 46.0M/72.1M [00:01<00:00, 42.0MiB/s]
 70%|█████████████████████████▋           | 50.1M/72.1M [00:01<00:00, 39.1MiB/s]
 75%|███████████████████████████▋         | 53.9M/72.1M [00:01<00:00, 37.7MiB/s]
 80%|█████████████████████████████▌       | 57.5M/72.1M [00:01<00:00, 37.4MiB/s]
 85%|███████████████████████████████▎     | 61.1M/72.1M [00:01<00:00, 37.3MiB/s]
 90%|█████████████████████████████████▏   | 64.7M/72.1M [00:01<00:00, 37.2MiB/s]
 95%|███████████████████████████████████  | 68.3M/72.1M [00:01<00:00, 37.3MiB/s]
100%|█████████████████████████████████████| 72.1M/72.1M [00:01<00:00, 42.8MiB/s]

For the larger models (especially "large" and "medium"), this is substantial network overhead.

We once talked about allowing the Docker container to use a persistent directory where models can be cached. This seems like a good idea.

Another option might be to ship the Docker image with all of the models pre-loaded.

Done when

The app does not need to download a particular model more than once on a particular machine/environment.


keighrim commented 2 months ago

As of SDK 1.2.3 (https://github.com/clamsproject/clams-python/pull/227), all containerized CLAMS apps built on clams-python-xxx:1.2.3 (or newer) base images use /cache as the common caching directory. You can therefore mount a new, empty directory as /cache so that models are downloaded once and then reused across containers.

Also see https://github.com/clamsproject/clams-python/blob/5ed587679159f41b778d2106a7254f6eb764c0cf/documentation/clamsapp.md?plain=1#L90-L105 .
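
To make the "download once, reuse afterwards" behavior concrete, here is a minimal Python sketch of the pattern. It is not the SDK's or the app's actual code. It assumes the cache location is taken from the XDG_CACHE_HOME environment variable (which a base image could point at /cache); `resolve_cache_dir`, `get_model_file`, the `fetch` callback, and the `.pt` file naming are all hypothetical names introduced for illustration.

```python
import os
from pathlib import Path
from typing import Callable


def resolve_cache_dir(app_name: str = "whisper") -> Path:
    """Pick the cache directory (hypothetical helper).

    Assumption: the container sets XDG_CACHE_HOME (e.g. to /cache);
    otherwise fall back to ~/.cache.
    """
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / app_name


def get_model_file(name: str, fetch: Callable[[str], bytes], cache_dir: Path) -> Path:
    """Return the cached file for `name`, calling `fetch` only on a cache miss.

    `fetch` stands in for whatever actually downloads the model bytes.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{name}.pt"
    if not target.exists():
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file that looks like a valid cache entry.
        tmp = target.with_name(target.name + ".part")
        tmp.write_bytes(fetch(name))
        tmp.replace(target)
    return target
```

If the directory returned by `resolve_cache_dir()` lives on a host directory mounted into the container, the file written on the first run is still there on the next run, so `fetch` is skipped and nothing is downloaded again.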

keighrim commented 1 month ago

Closing as a non-issue.

clamsproject / app-whisper-wrapper