ProjectPythia / cookbook-gallery

Root site for the ProjectPythiaCookbooks GitHub Pages
https://cookbooks.projectpythia.org
MIT License

Add status badge for the Pythia Binder #198

Closed brian-rose closed 1 month ago

brian-rose commented 3 months ago

Based on conversations at today's Community Meeting, here's a simple proposal to add a status badge to the Cookbook Gallery page for the Pythia Binder.

The status is actually reporting from Jetstream2 system diagnostics: https://docs.jetstream-cloud.org/overview/status/#

There may be situations where the Pythia Binder is down but this "Canary" is reporting healthy, but I think the reverse is unlikely (i.e. if the Canary is down, then our Binder should also be down).

github-actions[bot] commented 3 months ago

👋 Thanks for opening this PR! The Cookbook will be automatically built with GitHub Actions. To see the status of your deployment, click below. 🔍 Git commit SHA: 485f6e34b52f1623d216f9956bc4e0a40ec68732 ✅ Deployment Preview URL: https://ProjectPythia.github.io/cookbook-gallery/_preview/198

brian-rose commented 3 months ago

@dcamron @ktyle @jukent have a look at this and let me know if you have suggestions. I just took the Jetstream2 status badge and "rebranded" it as a Pythia Binder status badge. The big advantage is that this is a "one and done" edit since the badge is dynamic and shows the result of a nightly test. But it's not our test and may or may not end up being a useful indicator of the status of our Binder.

jukent commented 3 months ago

Thanks @brian-rose for doing this, I think it makes sense and is very helpful. I think it could be useful to add a blurb explaining that the Binder is sometimes down due to planned maintenance, or something encouraging people to revisit in a couple of days.

ktyle commented 3 months ago

I think we need to have the "passed/failed" status be clearly tied to whether the binder.projectpythia.org server is reachable, rather than have it tied to a general JS2 status. It's quite possible that the server might be down while JS2 as a whole is running as normal. That would be the first step ... even better would be to devise a workflow that would actually execute a trivial notebook and verify that it runs to completion. Recall that a few months ago, although JS2 was in an optimum state and binder.projectpythia.org was reachable, our Binder instance was not properly communicating with its associated JupyterHub.
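For illustration only, the "first step" (is the server reachable?) could look roughly like the sketch below. It uses only the Python standard library. The names `probe` and `badge_status`, and the choice to treat any 2xx/3xx response as "passed", are assumptions made for this example and not an existing Pythia workflow. As noted above, a reachable server does not prove that notebooks can actually launch.

```python
import urllib.error
import urllib.request

# Host named in this thread; everything else here is a hypothetical sketch.
BINDER_URL = "https://binder.projectpythia.org"


def badge_status(http_code):
    """Map an HTTP status code (or None for "no response") to a badge label."""
    if http_code is not None and 200 <= http_code < 400:
        return "passed"
    return "failed"


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if it cannot be reached.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code (e.g. 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None
```

A scheduled job could call `badge_status(probe(BINDER_URL))` and publish the resulting label.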

jukent commented 3 months ago

@ktyle So perhaps 2 status badges? One for JS2 status, and one for our binder?

dcamron commented 3 months ago

I think the most important information is just the Pythia Binder status, but this badge was borrowed from the JS2 folks as it was the easiest thing to get out there quickly.

We should plan to create our own regular action to report Pythia Binder status via binderbot or the binderhub API, though I personally prefer to have this over nothing until that is put together.
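As a rough sketch of what a BinderHub-API-based check might involve: a BinderHub build request (`GET /build/<provider>/<spec>`) responds with an event stream whose `data:` lines carry JSON with a `phase` field, ending in `ready` or `failed`. The helper below only interprets such lines. The function names are hypothetical, and the code that opens the stream against our Binder, as well as the choice of test repository, is left out on purpose.

```python
import json


def final_phase(event_lines):
    """Return the last "phase" seen in an iterable of event-stream lines.

    Stops early once a terminal phase ("ready" or "failed") is reached.
    Returns None if no phase was reported at all.
    """
    phase = None
    for raw in event_lines:
        line = raw.decode() if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and keep-alive comments
        try:
            event = json.loads(line[len("data:"):])
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if isinstance(event, dict):
            phase = event.get("phase", phase)
        if phase in ("ready", "failed"):
            break
    return phase


def binder_is_healthy(event_lines):
    """Treat the Binder as healthy only if a launch reached the "ready" phase."""
    return final_phase(event_lines) == "ready"
```

Because this requires a launch to reach `ready`, it would also catch the case where the server is reachable but cannot talk to its JupyterHub.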

brian-rose commented 3 months ago

Yes I think this is a temporary "Better than nothing" solution. We can leave an issue open to devise a better long term solution, quite possibly with 2i2c help since they presumably have a lot of expertise with monitoring binderhub status.

I'd like to wait and see how long it takes under the current conditions for the badge status to change, now that our hub is back up and running. To me that's a test of whether this borrowed badge is "imperfect but good enough".

brian-rose commented 2 months ago

Update: Our Binderhub has been back up for several days but the Jetstream2 badge is still reporting "failed" because the pipeline hasn't been run again: https://gitlab.com/jetstream-cloud/canary/-/pipelines

My takeaway is that this is not a good strategy for reporting the health of our Binderhub. I think we should close this PR and open an issue about setting up our own reporting pipeline.