Closed · betatim closed this 6 years ago

We currently assign 10GB per home directory per user. Do they need this much? More? Less?
Do you have any sense of what fraction of the total operating cost the 10GB home directories account for?
https://zero-to-jupyterhub.readthedocs.io/en/v0.5-doc/cost.html is pretty neat for going through different scenarios.
Some assumptions: mybinder.org has roughly 30-40 active users at its low point during a 24h cycle. In the recent past, typical "spikes" take us to ~100 at most (modulo reddit hugs). Chart here. Our impression is that the cluster grows because we promise a fixed amount of RAM to each active user, not because users are maxing out the CPUs. I think this makes sense, as notebooks typically involve a lot of thinking in between running commands. Ergo: pick instance types that offer more RAM per user rather than lots of CPU per user. A rough sizing sketch follows.
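As a back-of-the-envelope illustration of that reasoning, here is a minimal sketch. Every number in it (active user count, per-user RAM guarantee) is an assumption for illustration, not a measured mybinder.org value; the n1-standard-8 specs (8 vCPUs, 30GB RAM) are from GCP's machine-type docs:

```python
import math

# Illustrative assumptions, not measured mybinder.org values.
active_users = 100        # assumed spike level, per the chart above
ram_per_user_gb = 2       # assumed RAM guarantee per user pod
node_ram_gb = 30          # n1-standard-8: 30GB RAM
node_cpus = 8             # n1-standard-8: 8 vCPUs

# RAM is the binding constraint: each node fits floor(30 / 2) = 15 users.
users_per_node = node_ram_gb // ram_per_user_gb
nodes_needed = math.ceil(active_users / users_per_node)

# CPU is not the constraint if users mostly think between commands:
# 15 users sharing 8 vCPUs is ~0.5 vCPU each on average.
cpu_per_user = node_cpus / users_per_node

print(f"users per node (RAM-bound): {users_per_node}")
print(f"average vCPU per user: {cpu_per_user:.2f}")
print(f"nodes needed for {active_users} active users: {nodes_needed}")
```

At these assumed numbers the node count is set entirely by the RAM guarantee, which is the point above.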
Exploring the pricing notebook a bit, my conclusion is: it doesn't matter what you choose for storage size or SSD vs HDD; it will be nothing compared to the cost of the CPUs. 30 users for 30 days on n1-standard-8 instances with 10GB each comes to about $1.20/month for storage. So even if you bump it up to 100GB per user, it is about $12/month.
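To make that comparison concrete, a minimal sketch of the arithmetic. The effective storage rate is back-solved from the figures quoted above, and the n1-standard-8 hourly rate and node count are assumptions rather than values from the pricing notebook:

```python
# Back-of-the-envelope storage-vs-compute comparison.
# Rates here are assumptions or back-solved from the thread,
# not authoritative GCP prices.
users = 30
hours_per_month = 30 * 24

# Storage: effective rate implied by "$1.20/month for 30 users at 10GB each".
storage_rate = 1.20 / (30 * 10)   # $/GB/month, back-solved
for gb_per_user in (10, 100):
    cost = users * gb_per_user * storage_rate
    print(f"{gb_per_user:>3}GB/user -> ${cost:.2f}/month storage")

# Compute: assume two n1-standard-8 nodes running all month at an
# assumed on-demand rate of ~$0.38/hour each.
node_rate = 0.38                  # $/hour, assumption
compute_cost = 2 * hours_per_month * node_rate
print(f"compute (2 nodes): ${compute_cost:.2f}/month")
```

Even at 100GB per user, storage stays around 2% of the compute bill under these assumed rates.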
Conclusion from Tim: don't worry about storage; it is more important to get the auto-scaler working well so the cluster stays "small".
Agreed, this is trivial; no need to worry about "too much"! 10GB does seem likely to be plenty, and if it isn't, we'll know there's little financial barrier to increasing it. Most of OH's storage costs are in the more permanent file storage and downloads, which have also been trivial so far.
Cool, I'll close this then. We can open a new issue to discuss moving away from 10GB at some point.