@greenw0lf OK, will check this out.
@greenw0lf I refactored the code a bit here: `w_model` and `model_base_dir` are now imported from the cfg and passed into the functions. This way the functions can be tested more easily.

I won't mind merging this sooner, but the error handling would be good to add before that. Unit tests I leave up to you, but since the functions are now quite isolated they should be easy to test.
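To make the testability point concrete, here is a minimal sketch of that kind of decoupling. The function name, cfg attribute names and the pytest test are illustrative only, not the PR's actual code:

```python
# Illustrative sketch: the function takes `w_model` and `model_base_dir` as
# plain arguments instead of reading the cfg module itself, so it can be
# unit-tested without any config setup. Names here are assumptions.
import os


def model_is_available_locally(w_model: str, model_base_dir: str) -> bool:
    """True if the model referenced by w_model is already on disk."""
    return os.path.isdir(os.path.join(model_base_dir, w_model))


# Only the call site touches the cfg (attribute names are assumptions):
# model_is_available_locally(cfg.W_MODEL, cfg.MODEL_BASE_DIR)


def test_model_is_available_locally(tmp_path):
    """With the cfg decoupled, a pytest unit test needs no config at all."""
    (tmp_path / "large-v2").mkdir()
    assert model_is_available_locally("large-v2", str(tmp_path))
    assert not model_is_available_locally("tiny", str(tmp_path))
```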
@greenw0lf oh yeah, could you also extend the main function call with a way to just download the model? This way we can also reuse the same docker image to just download the model into a shared volume (and after that start up one or more whisper services).
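One way this could look, just to make the idea concrete. The `--download-only` flag name and the shape of `main()` are assumptions, not the worker's actual CLI:

```python
# Hypothetical sketch of a download-only mode for the worker's entry point.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(description="whisper ASR worker")
    parser.add_argument(
        "--download-only",
        action="store_true",
        help="Only download the model into the shared /model volume, then exit",
    )
    args = parser.parse_args()

    # Resolve/download the model first in either case, e.g. via the
    # get_model_location function proposed further down in this thread:
    # model_location = get_model_location(os.environ["W_MODEL"], "/model")

    if args.download_only:
        # The model is now in the shared volume; exit so that one or more
        # whisper services can be started against that volume afterwards.
        return

    # ... start the whisper service here ...


if __name__ == "__main__":
    main()
```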
So, what I propose when it comes to determining the model used is the following:

- `check_model_availability` now becomes `get_model_location` (see the sketch below this list)
- It returns a `str` that contains either the path to the model (if `W_MODEL` is an HTTP/S3 URI) or `W_MODEL` itself (if `W_MODEL` is a pretrained model version, such as `large-v2` or `tiny`)
- If `W_MODEL` is a URI, it will attempt to download it (it is expected to be all zipped up in a `.tar.gz` file) and save it in the `/model` folder, under a folder with the same name as the zip/tar file that was downloaded. For example, for `whisper_custom.tar.gz`, the files will be extracted under `/model/whisper_custom.tar/`
- The provenance should record both `W_MODEL` and the model that was actually used, otherwise provenance would report wrong info

Let me know if something is missing or isn't explained properly.
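To make the proposal concrete, here is a rough sketch of what `get_model_location` could look like under the assumptions above (plain `urllib` + `tarfile`, HTTP(S) only; S3 handling, checksums and fuller error handling are left out). This is a sketch, not the actual implementation:

```python
import os
import tarfile
import urllib.request


def get_model_location(w_model: str, model_base_dir: str = "/model") -> str:
    """Return a local model path (for URIs) or the pretrained version name.

    The caller should record both W_MODEL and the returned location so that
    the provenance reports the model that was actually used.
    """
    if not w_model.startswith(("http://", "https://", "s3://")):
        # Pretrained model version, e.g. "large-v2" or "tiny": return as-is.
        return w_model

    if w_model.startswith("s3://"):
        raise NotImplementedError("S3 download would need boto3; omitted in this sketch")

    archive_name = os.path.basename(w_model)  # e.g. whisper_custom.tar.gz
    # Extract under a folder named after the archive, e.g. /model/whisper_custom.tar/
    extract_dir = os.path.join(model_base_dir, archive_name.removesuffix(".gz"))

    if os.path.isdir(extract_dir):
        # Model was already downloaded and extracted earlier.
        return extract_dir

    os.makedirs(model_base_dir, exist_ok=True)
    archive_path = os.path.join(model_base_dir, archive_name)
    try:
        urllib.request.urlretrieve(w_model, archive_path)
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(extract_dir)
    except Exception as err:
        raise RuntimeError(f"Could not fetch model from {w_model}") from err
    finally:
        if os.path.exists(archive_path):
            os.remove(archive_path)

    return extract_dir
```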