We currently host our OpenAI Whisper and BERT models on Hugging Face using Inference Endpoints, which costs [todo: get price] per month. We should switch to a cheaper option.
I haven't run a detailed cost analysis of the other solutions yet, but a general rule of thumb is that the more work a service does for you (i.e. the easier it is to set up), the more expensive it is. So I'm assuming that any of the other options will be cheaper than what we pay now.
However, before building a solution, we should first get a price estimate for each option.
In order of easy/expensive to hard/cheap, the alternatives to evaluate are AWS SageMaker, an AWS EC2 instance, and Vast.ai.
Note: I'm assuming (80% confident) that the alternatives will be cheaper than the Hugging Face Inference Endpoints we currently use. However, I have low confidence (20% confident) that my easy/expensive-to-hard/cheap ordering of the alternatives is right. For example, Vast.ai might actually be easier to set up than an AWS EC2 instance and might also be cheaper.
Action Items
[ ] Estimate the cost of hosting the models (currently on Hugging Face Inference Endpoints) on AWS SageMaker, an AWS EC2 instance, and Vast.ai
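Once we have real hourly rates from each provider, the comparison is simple arithmetic. A minimal sketch of that back-of-the-envelope calculation is below; every rate in it is a placeholder I made up, not actual pricing, and should be replaced with quotes from each provider's pricing page before drawing any conclusions:

```python
# Back-of-the-envelope monthly cost comparison for hosting options.
# All hourly rates below are PLACEHOLDERS, not real pricing.

HOURS_PER_MONTH = 730  # average hours in a month

# Hypothetical $/hour for one instance able to serve Whisper + BERT.
placeholder_rates = {
    "Hugging Face Inference Endpoints": 1.30,  # placeholder
    "AWS SageMaker": 1.00,                     # placeholder
    "AWS EC2": 0.75,                           # placeholder
    "Vast.ai": 0.40,                           # placeholder
}

def monthly_cost(hourly_rate: float, utilization: float = 1.0) -> float:
    """Estimate monthly cost for an always-on endpoint.

    utilization < 1.0 models options that can scale to zero
    or be stopped when idle.
    """
    return hourly_rate * HOURS_PER_MONTH * utilization

# Print options from cheapest to most expensive under these assumptions.
for option, rate in sorted(placeholder_rates.items(),
                           key=lambda kv: monthly_cost(kv[1])):
    print(f"{option}: ~${monthly_cost(rate):,.0f}/month")
```

The `utilization` knob matters because the options differ in how easily they scale to zero: an always-on EC2 instance at a lower hourly rate can still cost more per month than a pricier option that only runs when traffic arrives.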