Open rabernat opened 2 years ago
@rabernat you seem to use an outdated version of helm, and you won't have a lot of the errors you see if you upgrade to helm version 3.
> So then I went to https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/binder/nodes?project=pangeo-181919 and manually resized all the node pools to zero. That seemed to work.
Nice!
I'm so upset about having to invest time and effort in this... Note that this is a related discussion: https://github.com/jupyterhub/team-compass/issues/478. Also note https://twitter.com/GeoffreyHuntley/status/1468040448316882946 as Geoff links to from that post.
So sorry to hear the platform has been abused! I'm afraid I don't have the time or resources to help get things back up, but if you ever need help in a crunch moment, like when you had to take things down, you can always call on me!
If you have the ability to update the DNS for binder.pangeo.io, I would create a GitHub repo with a simple landing page and use GitHub Pages to host it.
So sorry to hear that :(
There should be a way to manually mark nodes as NotReady, in case you want to keep your deployment alive? Or just uninstall BinderHub following this: https://binderhub.readthedocs.io/en/latest/zero-to-binderhub/turn-off.html
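A minimal sketch of both options, in case it helps. Note the substitution: I don't know of a supported way to set a node to NotReady by hand, so this uses `kubectl cordon` instead, which marks a node unschedulable (new pods won't land on it, but pods already running are not evicted). The second option uses `helm uninstall` from helm 3. The node names, release name, and namespace below are hypothetical placeholders, not values from this deployment.

```python
import subprocess


def cordon_commands(nodes):
    """Build one `kubectl cordon` command per node name."""
    return [["kubectl", "cordon", node] for node in nodes]


def uninstall_command(release, namespace):
    """Build the helm 3 command that removes a release from a namespace."""
    return ["helm", "uninstall", release, "--namespace", namespace]


def run_all(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical node names, release name, and namespace.
    run_all(cordon_commands(["node-a", "node-b"]))
    run_all([uninstall_command("my-binder", "my-binder-namespace")])
```

By default this only prints the commands. Pass `dry_run=False` once the names match your cluster.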
Just noting that the AWS Pangeo binder is still up. It uses GitHub authentication, so we've thus far mostly avoided cryptominers (#188). That said, the Pangeo AWS infrastructure is no longer supported and could disappear at any moment as well :(
I made this table on discourse a while back. Depending on your CPU, RAM, dask, and data location needs one of these alternatives might work...
| BinderHub | vCPU | RAM (GB) | Cloud provider | Max Session (hr) | Dask-gateway |
|---|---|---|---|---|---|
| | 4 | 8 | Google us-central1 | 3 | yes |
| aws-uswest2-binder.pangeo.io | 4 | 8 | AWS us-west-2 | 3 | yes |
| gke.mybinder.org | 1 | 2 | Google us-central1 | 6 | no |
| ovh.mybinder.org | 1 | 2 | OVH ? | ? | no |
| gesis.mybinder.org | 2 | 8 | Custom Server | 6 | no |
Just a quick note that, as @consideRatio mentions, the Binder team is thinking through these issues as well, and hopes to have a meeting with some others in the community to discuss potential ways around this: https://github.com/jupyterhub/team-compass/issues/478
There are plans for 2i2c to take over operations of the Pangeo Binder, so we'll need to figure out our strategy around crypto mining then (otherwise this is going to be a constant source of extra labor). That's a conversation that should include leaders from the Pangeo world, since it might involve trade-offs between user experience and constraints that deter mining.
Thanks Chris! We would be fine with simply requiring sign-in to use our binder.
> We would be fine with simply requiring sign-in to use our binder.
Just a heads up that the word from the folk who run GESIS Notebooks is that they still have issues with crypto-mining even with authentication, and they are actually shutting down the auth'd side of that service at the end of this month. So we will probably need auth and something else (maybe recaptcha?) to really tackle this. Hence the meeting that is taking place in the new year (since ideally we would like to avoid putting auth in front of mybinder.org)
A few weeks ago we started seeing these notices from Google Cloud:
Dear Developer,
Our systems identified that your Google Cloud Platform / API Project ID pangeo (id: pangeo-181919) may have been compromised and used for cryptocurrency mining.
Crypto mining is a common problem on binder deployments, and it has finally hit us.
I ignored it for a while. We currently have no sysadmin for the binder cluster. It is running totally unsupervised. However, I recently checked the logs and noticed a huge spike in usage:
The binder has been in a maxed-out state for quite a while, and is on track to cost thousands of dollars more per month than we are used to.
Resolution
I needed to try to shut down the binder as fast as possible. Unfortunately, my kubernetes / helm skills are very rusty. Here's what I tried:
First I updated my local gcloud and helm to latest versions (haven't touched either in over a year). Then
Then I tried helm
I couldn't get anything useful out of helm, so I gave up on it.
Then I went to kubernetes
This had no apparent effect.
So then I went to https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/binder/nodes?project=pangeo-181919 and manually resized all the node pools to zero. That seemed to work.
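For anyone who needs to do the same thing from a terminal rather than the console, here is a minimal sketch that builds one `gcloud container clusters resize` command per node pool. The cluster name, zone, and project are taken from the console URL above. The pool names are hypothetical placeholders (the real ones can be listed with `gcloud container node-pools list`), and I have not run this against the cluster.

```python
import subprocess


def resize_commands(cluster, zone, project, pools, num_nodes=0):
    """Build one gcloud resize command per node pool."""
    return [
        [
            "gcloud", "container", "clusters", "resize", cluster,
            f"--node-pool={pool}",
            f"--num-nodes={num_nodes}",
            f"--zone={zone}",
            f"--project={project}",
            "--quiet",  # skip the interactive confirmation prompt
        ]
        for pool in pools
    ]


def run_all(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Pool names are hypothetical; substitute the pools in your cluster.
    commands = resize_commands(
        cluster="binder",
        zone="us-central1-b",
        project="pangeo-181919",
        pools=["example-pool-1", "example-pool-2"],
    )
    run_all(commands)  # dry run: prints the commands without executing them
```

The dry-run default is deliberate: scaling every pool to zero takes the whole service down, so it is worth reading the printed commands before passing `dry_run=False`.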
So the binder is currently completely broken and unusable. 😞 It would be nice to at least get a landing page up at binder.pangeo.io that explains the situation. I am worried that this will affect activities planned for AGU, but I'm not sure.
I think our best hope for reviving our binder would be when 2i2c can take on this deployment, but that likely won't happen until spring.