open-austin / project-ideas

:bulb: A place to collect ideas for Open Austin projects

Water Quality warnings bot #74

Open mateoclarke opened 8 years ago

mateoclarke commented 8 years ago

The repo for this project lives here.

What problem are we trying to solve?

I saw this headline: City of Austin warns residents against swimming in Bull Creek because of high bacteria levels

That made me think about civic apps I've seen from other cities:

  • Chicago: Is there sewage in the river?
  • Air Quality Bot For Slack With Breezometer

The problem: It is hard to know when swimming spots and other recreational areas are dangerous to health because of bacteria or other safety hazards.

Solution(s):

Research around what data is available, where the data is collected. How to measure if a place is safe or unsafe. What scale do you use to measure that?

What help is needed at this time?

Any and all help welcome. I have too many projects on my plate to hack on this now. Just wanted to toss this idea out.

mateoclarke commented 8 years ago

Recent related project out of Houston (Sketch City) https://twitter.com/kuukihouston

colbywhite commented 8 years ago

@mateoclarke do you know where that Kuuki bot gets its data from? I can't find the code in @sketch-city's codebase. It says TCEQ, but I didn't see a real friendly way to extract data out of the TCEQ website.

In general, do you have an idea of where to get data like this?

I'm not sure if a Twitter bot is the best format for this. I'm not sure tweeting about every little warning is super useful. That Kuuki bot only has 107 followers and has been tweeting about Benzene in Channelview & Galena Park since its inception. (I'm guessing they have yet to fix their Benzene issues?) Also the Austin Health and Human Services Dept already has a twitter handle and tweets about the big stuff. (The Bull Creek incident is in there.)

A map seems heavy handed as well since, in theory, only a couple of places at any given point in time will have a warning that prevents you from swimming.

Maybe a simple mobile-friendly app that has the current warnings in a table? Doesn't sound sexy but that's all that you really need. Add a twitter bot or mailing list to notify when one is added/removed maybe?
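The "notify when one is added/removed" part could be as simple as comparing two snapshots of the warnings table. A minimal sketch, assuming each warning is identified by a plain string; the function name and the sample warnings are hypothetical:

```python
def diff_warnings(previous, current):
    """Return (added, removed) warnings between two snapshots."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    return added, removed


# Made-up snapshots, for illustration only.
yesterday = ["Site A: high bacteria", "Site B: high bacteria"]
today = ["Site B: high bacteria", "Site C: high bacteria"]

added, removed = diff_warnings(yesterday, today)
for warning in added:
    print(f"NEW: {warning}")
for warning in removed:
    print(f"LIFTED: {warning}")
```

Only the two lists of changes would need to go to a bot or mailing list, which avoids re-announcing every warning that is still in effect.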

Quickly Googling this stuff made me wonder if this should just be expanded to health advisories in general. I couldn't really find a good list of the current health advisories. How does that dept. get this info out? Is it really just the call-the-newspaper strategy? That can't be right, so I'm assuming my Google terms are slightly off. I was searching around and couldn't really find out if that Bull Creek advisory was done with. Every article is about how they issued it around Jul 22. If there was an easy way to view what's current and what's done with, that could be helpful. But I don't know where you get the data for that.

woodb commented 8 years ago

If there are auto-samplers by the USGS in areas, those are great sources of data.

I'll do some looking around to see what's out there.

mateoclarke commented 8 years ago

@colbywhite, not sure where the Kuuki data comes from but @fileunderjeff would be a good contact. Maybe there is similar data available in Austin for air quality.

Austin Watershed Department collects data on Water Quality. Search "water" on data.austintexas.gov and there is a bunch of data.

I pulled out one relevant dataset: https://data.austintexas.gov/Environmental/Water-Quality-Sampling-Data/5tye-7ray

If you look at the PARAM_TYPE column there are a bunch of different measurement types. One of interest might be "Bacteria/Pathogens".
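A rough sketch of how that filter might look as a query URL. It assumes the portal serves this dataset through the usual Socrata SODA endpoint pattern (`/resource/<dataset id>.json`) and that the API field name is the lowercase `param_type`; both are assumptions to check against the dataset's API docs. Nothing is fetched here, the code only builds the URL:

```python
from urllib.parse import urlencode

# Assumption: standard Socrata endpoint pattern for dataset 5tye-7ray.
BASE_URL = "https://data.austintexas.gov/resource/5tye-7ray.json"


def build_query_url(param_type, limit=100):
    """Build a query URL that filters rows by measurement type."""
    params = {
        # Assumption: the API field is the lowercase form of PARAM_TYPE.
        "$where": f"param_type = '{param_type}'",
        "$limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url("Bacteria/Pathogens", limit=5)
print(url)
```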

fileunderjeff commented 8 years ago

@colbywhite there are two kuukibot repos. This repo on bitbucket is for the scraper and database. It auto generates a URL, then parses the result and writes to a database. The data is publicly available as the result of this form on TCEQ's website, though Kuukibot takes hourly readings and sums up the last 24 hours trailing. This application is highly extensible across the state for air quality results, so if you want to set up an Austin bot, you could do it in a few minutes.
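The "sums up the last 24 hours trailing" step can be illustrated roughly as below. This is not Kuukibot's code, just a sketch of a trailing-window sum over hourly readings; the function name and sample values are made up:

```python
from collections import deque


def trailing_sums(hourly_readings, window=24):
    """Yield the sum of the most recent `window` readings after each new one."""
    recent = deque(maxlen=window)  # old readings drop off automatically
    for value in hourly_readings:
        recent.append(value)
        yield sum(recent)


# Made-up hourly values: 30 hours of a constant reading of 1.0.
readings = [1.0] * 30
sums = list(trailing_sums(readings))
print(sums[0], sums[23], sums[29])
```

Until 24 readings have arrived the sum covers only the hours seen so far, so a real bot would probably want to wait for a full window before reporting.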

One other thing about Kuukibot: air science is hard. We partnered with Air Alliance Houston and a bunch of scientists to make sure we're using the numbers properly, and that we're using the right vocabulary when talking about "exceedences" and so forth. If you build a water quality monitor, I am happy to put you in touch with hydrologists and hydrogeologists at the US Geological Survey who can advise.

mudspringhiker commented 7 years ago

Hello,

I would like to work on this dataset, if it's still on the table. I found the parameter ("E COLI BACTERIA") in the water quality sampling data from data.austintexas.gov that might be needed for what you want to make. I uploaded some of my work on my Github (https://github.com/mudspringhiker/water_quality_austin/blob/master/waterqualityatx.ipynb) and tried to zero in on Bull Creek as an example. I have no background in creating a tweet bot, however. Also, I don't know what the limit is for a certain body of water to be deemed hazardous. From my Google searches, a limit I found was 200 MPN/100mL (the Austin, TX dataset uses two units, but it looks like they are just using MPN/100mL now; the two units can't be converted to one another, it seems). I have a background in chemistry, so I think this dataset is a good one for me to work on. Thanks.
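A minimal sketch of that check, assuming the 200 MPN/100mL figure from the searches above (not a confirmed official standard). Readings reported in any other unit are left unclassified rather than converted; the function name and labels are hypothetical:

```python
# Limit found via web search in this thread; treat as an assumption.
ECOLI_LIMIT_MPN = 200.0


def classify_reading(value, unit):
    """Classify one E. coli reading; only MPN/100mL readings are compared."""
    if unit != "MPN/100mL":
        # The two units in the dataset can't be converted, so don't guess.
        return "unknown"
    return "over limit" if value > ECOLI_LIMIT_MPN else "within limit"


print(classify_reading(480.0, "MPN/100mL"))
print(classify_reading(35.0, "MPN/100mL"))
print(classify_reading(480.0, "some other unit"))
```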

Alona Varshal

mateoclarke commented 7 years ago

Hi Alona!

Thanks for commenting here with a link to your work! I'm not totally up to speed with Python Notebooks, so maybe you could talk me through what you did at an upcoming Open Austin event?

I'm thinking about how we could break this project into smaller pieces and recruit more help. These smaller pieces could be separate services or part of one application that does the whole thing. Here are some ideas:

  1. An analyzer/scoring program that runs through the public data and scores each site/creek. That might be hard to do since, as you pointed out, the E. coli counts are reported in varying units. It would be nice if the program could automatically:
    • download the input data from the City's data portal
    • evaluate each site and/or creek and give it some score or color assignment based on the safety of the water
    • print out the scores in a new table
  2. A Twitter bot program that looks at the output table we create, runs logic for each creek, and decides what to tweet out.
  3. A web app/visualization that lets you see the results of the output table on a map of the waterways and/or as a list of creeks you can filter/sort.
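The scoring piece could start out as small as the sketch below. It assumes the rows are already downloaded, keeps only the most recent reading per site, and colors it against the 200 MPN/100mL limit mentioned earlier in the thread. The field names (`site`, `date`, `value`), the colors, and the sample rows are all hypothetical:

```python
# Limit quoted earlier in this thread; not a confirmed official standard.
LIMIT_MPN = 200.0


def latest_by_site(rows):
    """Keep the most recent row per site (ISO dates sort as strings)."""
    latest = {}
    for row in rows:
        site = row["site"]
        if site not in latest or row["date"] > latest[site]["date"]:
            latest[site] = row
    return latest


def score_sites(rows, limit=LIMIT_MPN):
    """Return a table of (site, date, value, color), sorted by site name."""
    table = []
    for site, row in sorted(latest_by_site(rows).items()):
        color = "red" if row["value"] > limit else "green"
        table.append((site, row["date"], row["value"], color))
    return table


# Made-up readings, for illustration only.
rows = [
    {"site": "Site A", "date": "2017-01-10", "value": 150.0},
    {"site": "Site A", "date": "2017-02-01", "value": 480.0},
    {"site": "Site B", "date": "2017-01-20", "value": 35.0},
]
scores = score_sites(rows)
for site, day, value, color in scores:
    print(f"{site:<10}{day:<12}{value:>8.1f}  {color}")
```

The bot and the web app could both read the resulting table, so neither needs to know how the scores were computed.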

That might be overthinking it, but let me know what you think @mudspringhiker. I've just created a Slack channel so we can discuss and recruit more folks.

Best, Mateo

mudspringhiker commented 7 years ago

Hi Mateo,

Sure I can talk to you about what I did. I can't come to the meetup tomorrow, Wednesday. I'll check the schedule for the next one. I think your ideas above are all great!

Alona

mateoclarke commented 7 years ago

No worries, I have to miss Wednesday too. But we can start chatting in the #waterquality channel and we can shoot for 2/21.

mateoclarke commented 7 years ago

Repo created: https://github.com/open-austin/water-quality 🎉

werdnanoslen commented 7 years ago

Another policy question for @amaliebarras @hlupico: idea has a project repo, so should this idea issue be closed?

amaliebarras commented 7 years ago

nah don't close it. this is still a good place to discuss

werdnanoslen commented 7 years ago

is it still #active though? or abandoned?

sanjitpal09 commented 7 years ago

Is this project still active? It shows as open from the outside.

werdnanoslen commented 7 years ago

@sanjitpal09 I haven't heard any updates lately, but since the needs leadership tag is on this issue, you are free to take this idea where you're interested in taking it! Any ideas?

mateoclarke commented 7 years ago

Happy to talk about it @sanjitpal09. @werdnanoslen is right, this project needs leadership, but we already started a repo with some issues anyone can jump into. @mudspringhiker played around some with the data.

https://github.com/open-austin/water-quality/issues

sanjitpal09 commented 7 years ago

I can create dashboards using Tableau. Would that help?

mateoclarke commented 7 years ago

Yeah that sounds cool! Let's continue the discussion over in https://github.com/open-austin/water-quality/issues/4

werdnanoslen commented 7 years ago

Houston / Sketch City has a similar project https://github.com/sketch-city/project-ideas/issues/131

mudspringhiker commented 7 years ago

I've created a starter app that scrapes recent E. coli readings from the dataset. To forecast E. coli levels, I'm thinking of plotting past years' results for comparison, but it may not be that easy. Thanks for the link. App is at https://immense-hamlet-63542.herokuapp.com.

tomschenkjr commented 5 years ago

Found this issue through Sketch City discussion... @ChiHackNight and @Chicago teamed-up on a paper to predict fecal-indicator bacteria levels. There is a lot of literature on this and there is some consensus that yesterday's results don't predict today's results very well.

We just published a paper on a new approach with about a 300% increase in sensitivity, without sacrificing other measures of fit. The introduction also paints a picture on some of the problems in existing literature.
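For readers new to the term, sensitivity is the share of actual exceedance days that a model flags. The sketch below is not the paper's method; it only shows how sensitivity would be computed for the naive "yesterday predicts today" baseline, using a made-up series of exceedance flags:

```python
def sensitivity(actual, predicted):
    """True positives divided by all actual positives."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    total = true_pos + false_neg
    return true_pos / total if total else float("nan")


# Made-up daily flags: True means the bacteria limit was exceeded that day.
exceeded = [False, True, False, False, True, True, False]

# Persistence baseline: each day's prediction is the previous day's result.
predicted = exceeded[:-1]
actual = exceeded[1:]

print(f"sensitivity = {sensitivity(actual, predicted):.2f}")
```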

Happy to discuss further on this approach. The code is open source.