Open junosuarez opened 10 years ago
@jden I think this is a fantastic idea. Uptime and availability are the absolute baselines of service. We have a list of all of the sites compiled over at the CitizenOnboard repo, where we hope to do UX critiques of the SNAP enrollment process for all 50 states.
:+1: to this. It’d be fascinating to see how deep into the application process the tests could go, down to the level of seeing if any provider has a way to apply with a fake name for a smoke test.
Completely awesome idea. Feature request: Email mayor/congressmen/activist groups when critical services go down.
Additional benefit of this: if we can show real uptime (downtime) rates for these sites, the old argument of "we need a giant vendor for stable uptime!" loses a bit of its steam.
^ yes to that. For motivation, the main social services website in CA is currently down (3 hours + so far):
I'm down to take on the design and front end work.
status.citizenonboard.com?
On Wed, Nov 19, 2014 at 10:53 AM, Jake Solomon notifications@github.com wrote:
^ yes to that. For motivation, the main social services website in CA is currently down (3 hours + so far): [image: error] https://cloud.githubusercontent.com/assets/2533112/5112128/5734fcd8-6fda-11e4-9779-eb5c12d0739e.png
(http://www.mybenefitscalwin.org/)
Inception sound at CfA office when services go down.
Twitter bot tweet at vendor.
I'm getting started with Uptime now:
http://www.redotheweb.com/uptime/
How was "down" determined in the initial survey of sites? HTTP status code? Was there a "sorry we're down for maintenance" note?
Operationalizing "a site is down" in this way will be necessary, so looking for details on what "down" meant in this case.
@daguar manually. Each was different, and yes typically there was a "maintenance" note.
Also here's our first day of MBCW uptime checking. :(
v0.1 here: http://stats.pingdom.com/29wya4enlbs2
Monitors the uptime for all 50 states' food assistance application web service, or the primary page hosting program information and downloadable forms.
We are checking for the presence of certain strings in order to ensure that we are not just getting an error page. However, we encountered some services, like Tennessee's, that are indeed down, but have semi-permanent pages in place offering guidance that the service is unavailable.
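One way to operationalize the keyword check, including the Tennessee-style case where a page loads fine but only says the service is unavailable, is a three-way classification. This is a minimal sketch and not the configuration actually used in Pingdom or StatusCake: the `classify` function, the `Status` names, and the maintenance phrases are all hypothetical, and real phrases would have to come from looking at each state's pages.

```python
from enum import Enum


class Status(Enum):
    UP = "up"
    MAINTENANCE = "maintenance"
    DOWN = "down"


# Hypothetical phrases, for illustration only.
MAINTENANCE_PHRASES = ("down for maintenance", "temporarily unavailable")


def classify(status_code, body, expected_keyword):
    """Classify one fetched page.

    status_code is None when the request timed out or failed outright.
    """
    if status_code is None or status_code >= 400:
        return Status.DOWN
    text = body.lower()
    # Check for a maintenance notice first: such a page can return 200
    # and may still contain the expected keyword.
    if any(phrase in text for phrase in MAINTENANCE_PHRASES):
        return Status.MAINTENANCE
    if expected_keyword.lower() in text:
        return Status.UP
    # Page loaded but the expected string is missing: treat as an error page.
    return Status.DOWN
```

Keeping "maintenance" separate from "down" would let a dashboard count both as unavailable while still showing which sites at least tell applicants what is going on.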
Update on United Status of America (ehh??) cross-validation: we are currently running identical keyword status checks for 8 states + FB, Goog, and Twitter using 3 services:
Pingdom and StatusCake have 1 min check resolution with 30 sec timeout.
UptimeRobot is 5 min check resolution with 120 sec timeout (neither configurable).
So far it looks like there are some pretty big differences between these services...hmmm, glad we're doing validation!
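To put a number on how far the services disagree, one option is to compute an uptime percentage per service for the same site and look at the spread. This is a rough sketch with hypothetical function names and made-up check results, not data from the actual monitors. Note that a 5 minute check interval can miss short outages that a 1 minute interval catches, so some disagreement is expected.

```python
def uptime_percent(checks):
    """checks: list of booleans, one per check, True when the site looked up."""
    if not checks:
        raise ValueError("no checks recorded")
    return 100.0 * sum(checks) / len(checks)


def disagreement(results_by_service):
    """Spread between the highest and lowest uptime reported for one site."""
    rates = [uptime_percent(checks) for checks in results_by_service.values()]
    return max(rates) - min(rates)


# Made-up results for a single site, for illustration only.
example = {
    "service_a": [True, True, False, True],
    "service_b": [True, True, True, True],
}
print(disagreement(example))
```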
Another update!
Idea: Screenshot + Tweet img whenever a site goes down using Websnapr (or something else http://stackoverflow.com/questions/1981670/programmatically-get-a-screenshot-of-a-page)
Well, Websnapr might be a non-standard unless you can turn off caching:
The added benefit is that they have most popular URLs cached, so you will get very fast response times.
But you could probably write something up with Phantom really easily. Its API lets you grab an image.
*non-starter, not non-standard.
Anyway, simple snapshotting service: https://github.com/Mr0grog/PageSnap
(Also running for now, unprotected, at http://pagesnap.herokuapp.com/[url].png, e.g. http://pagesnap.herokuapp.com/http%3A%2F%2Fheroku.com.png to snapshot heroku.com)
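For anyone calling this from a script, the target URL has to be percent-encoded before it goes into the path, as in the heroku.com example above. A small sketch, assuming the URL pattern stays as described; `snapshot_url` and `PAGESNAP_BASE` are names made up for this example.

```python
from urllib.parse import quote

PAGESNAP_BASE = "http://pagesnap.herokuapp.com/"


def snapshot_url(page_url):
    """Build the snapshot URL for a page, following the pattern in the comment above."""
    # safe="" makes quote() encode ":" and "/" too, so the whole target URL
    # fits into a single path segment.
    return PAGESNAP_BASE + quote(page_url, safe="") + ".png"


print(snapshot_url("http://heroku.com"))
```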
Ahh that's awesome. If you want to keep playing around here I can give you our StatusCake API key (https://www.statuscake.com/api/). It basically gives you access to the full history of data as shown on the alpha dashboard: http://uptime.statuscake.com/?TestID=pQYuAW4tAi
Maybe a Pinterest gallery of all the SNAP applications home screens that are down?
Hmmm… actually thinking about this now, what’s the value in screenshotting here? I haven’t been following this thread deeply; I mostly just reacted when I saw the note about caching and thought “hey, that’d be easy and quick to do better for our use case.” Does anybody really need to know what “down” looks like? (why?)
Happy to keep poking at this a bit (fair warning, I’m bouncing all around various parts of New England right now, so my time with an internet connection can be limited), but it seems like charting uptime or aggregating statistics or visualizing it better than the StatusCake page would be more useful. This comes back around to @alanjosephwilliams volunteering to design: what are we actually trying to do with this?
Anyway, there may be in-person/office/other-channel conversations I haven’t been in that give more context that I’m missing. But I don’t want to dive into more work here without feeling like I know what we’re actually trying to do. Otherwise I feel like I’m wasting time feeling around in the dark. (Which is not to say I can’t come up with my own ideas and thoughts here; I’m just miles and miles away from really knowing the SNAP space at all and other people on this thread are much more clued in to what would be needed/useful here.)
Sorry if that was a big pile of questions… I don’t think all of them necessarily need to be answered right now, I’m just wary of putting in much more work when I’m not at all sure a) what would be most valuable to do right now and b) where we’re going with this for now.
NO don't apologize for actually caring about why we are doing this in the first place! Thoughtful questions deserve thoughtful responses so sorry for the delay...I'll offer some initial thoughts now but I'm in Sacramento all day so will think this through in more detail tomorrow.
The Big Hairy Problem here is that a lot of social service websites are down all the time for a bunch of different reasons. So the goal for this project is to increase uptime for critical digital government services and/or reduce the pain of downtime. Our theory of change here is admittedly a bit handwavy but here are a few pieces:
Phew! Longer than I thought. I know this doesn't get us down to the level of specific features but hopefully this gets us closer. More tomorrow and beyond...
Cool, that helps a lot. I ought to have some time to poke at this if you want to send me the StatusCake API key. Maybe I’ll experiment with a few different approaches, though I think I’ll hold off on screenshot-related stuff for the moment.
@Mr0grog what's your email? (Or follow me and I'll DM you)
rob@robbrackett.com or rob@codeforamerica.org
Updated top to add status + current URL.
Synthetics (New Relic's monitoring service) has some interesting features like SLA reports:
...and the ability to write checks with scripted browser via virtualized Selenium browsers.
Definitely too much complexity to start but could be useful in the future.
Super rad — thanks so much for the offer @statuscake!
Just a heads up that this work is being pursued in the CitizenOnboard repo.
One sentence description
Yesterday, 10% of SNAP websites across the country were down or inaccessible; let's track and show that.
Link (more details/brain dump/alpha)
something like https://status.github.com/ showing a big red / green for whether the individual state sites are accessible (maybe tracking other metrics, like response time or error rate and maybe with a sparkline showing these metrics over time)
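As a rough illustration of the red / green plus sparkline idea, here is a text-only sketch. The function names are hypothetical and a real page would render this in HTML; the point is only how a series of response times maps onto a small inline chart.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a series of numbers (e.g. response times in ms) as a text sparkline."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: draw every point at the lowest bar.
        return BARS[0] * len(values)
    return "".join(
        BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values
    )


def status_light(is_up):
    """Map the latest check result to the big red / green indicator."""
    return "green" if is_up else "red"
```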
Project Needs (dev/design/resources):
Status (in progress, pie-in-the-sky)