Open angelareplica opened 6 years ago
Hi Angela,
thanks for this service to the public :)
A lil update:
Nope
Hmm... Will keep working on the charts above to clean them up & annotate. Not sure what else to do with this data or what to visualize, though! Will definitely consider the feedback above.
I don't understand what each colour means. What is MPN? I don't like the upper-case letters on the axes; I think you should change them to lower case. Also, the titles should be the same size across your graphs! And is 100 enterococci per sample a problem? Is it too much? Some annotation might help.
Great topic! Definitely one that I prefer to not think about, but hey, we need to know, right?
I like how you've grouped the counts together on your bar chart. It might be helpful to add dates there, so the reader can get a sense of how often those beaches are filled with shit. Is the data just for one year?
I'd also be curious as to how often samples are being taken at these beaches. We've all heard that we shouldn't go to the beach after it's rained a lot, so it would be interesting to see if the city avoids taking samples after rainstorms--DarkSky has historical data in their API, right? It would be a bit of work, but it could add an interesting element to the project if you find yourself with a lot of free time.
Visually, I love your second chart and the colors that you chose, but I think the colors can be a little confusing. Maybe it would help to put some space between the Rockaways and Coney Island to show that the colors represent two different locations? Also, here's the pandas documentation to change words to title case--it's super quick.
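For reference, here is a minimal sketch of the title-case change in pandas. The column name `beach_name` and the sample values are made up for illustration and are not taken from the actual data set.

```python
import pandas as pd

# Hypothetical beach names in upper case (illustrative values only)
df = pd.DataFrame({"beach_name": ["CONEY ISLAND", "MANHATTAN BEACH"]})

# Series.str.title() capitalizes the first letter of each word
# and lowercases the remaining letters
df["beach_label"] = df["beach_name"].str.title()

print(df["beach_label"].tolist())
```

One thing to watch: `str.title()` also capitalizes letters that follow digits, so a label like "9TH" comes out as "9Th" and may need a manual fix.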
I'm really interested in how many times the enterococci count is juuuuuust under what it would have to be to close a beach to the public. I feel like if there's a big weekend (Fourth of July) or a lot of pressure from the public (like if a triathlon is going on), officials might be tempted to fudge their numbers a little bit to avoid public outcry.
I made a heatmap showing average fecal bacteria count by month in 2018 to supplement the charts above (which I still need to make prettier). It looks terrible at the moment.
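A rough sketch of how a beach-by-month table of averages could be built for a heatmap like this. The column names (`beach_name`, `sample_date`, `enterococci`) and all values are assumptions for illustration; the real data set's columns may be named differently.

```python
import pandas as pd

# Hypothetical samples (made-up values, assumed column names)
samples = pd.DataFrame({
    "beach_name": ["Coney Island", "Coney Island", "Rockaway Beach", "Rockaway Beach"],
    "sample_date": ["2018-06-04", "2018-06-18", "2018-06-05", "2018-07-10"],
    "enterococci": [10.0, 30.0, 4.0, 120.0],
})

# Parse the dates and pull out the month number for grouping
samples["sample_date"] = pd.to_datetime(samples["sample_date"])
samples["month"] = samples["sample_date"].dt.month

# One row per beach, one column per month, mean count in each cell;
# beach/month pairs with no samples are left as NaN
monthly_avg = samples.pivot_table(
    index="beach_name", columns="month", values="enterococci", aggfunc="mean"
)

print(monthly_avg)
```

A table shaped like `monthly_avg` can be passed to seaborn's `heatmap` function to draw the chart.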
Including July (with the updated dataset I mention below):
Nope.
The Department of Health & Mental Hygiene just updated their data-set so that it includes July data. I have decided to re-do everything -- all of the above charts. I am sad (mostly about the extra Illustrator work). I hope it's worth it!
Hey! I loved your idea and it's definitely something New Yorkers would be super interested in knowing more about. The heatmap looks really good and it's really easy to read. Maybe you can add some annotations to it for the really dark values. Did something unusual happen at that beach in that month to make the numbers go up that much?
I love the color scheme you used on your heatmap; I think it really fits the topic. Maybe you can apply the same colors to your other graphs.
In your second graph, the one about the samples, it's not so clear what each dot means. Are they samples? You might want to add a little legend explaining it. Also, it could be cool to make "The Rockaways" and "Coney Island" have the same text color as the dots representing them. That would make the graph even clearer.
Great job! I'm looking forward to seeing the next version! 😄
um wow. never seen this dataset.
so i'd like a map here even if it's just to show me the beaches. and i'd like to see the heat map divided up by place/beach the way you did at the top. then organized most to least. so coney island, rockaway best to worst.
or you could do some small versions of the heat map and sort it different ways, and give us a line of text over each so we see what it is. sort it by the worst june, july and august.
organize it by beaches, best to worst. lots of ways here to play with this in small multiples or to show us different things.
and the numbers on the bottom, should be months yes?
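The sorting and labeling ideas in this comment could be sketched roughly as follows. The table here is a made-up stand-in for the beach-by-month averages behind the heatmap ("Example Beach" and every number are hypothetical), with month numbers as column labels.

```python
import pandas as pd

# Hypothetical beach-by-month averages; columns are month numbers
monthly_avg = pd.DataFrame(
    {6: [20.0, 4.0, 55.0], 7: [35.0, 120.0, 15.0], 8: [12.0, 8.0, 40.0]},
    index=["Coney Island", "Rockaway Beach", "Example Beach"],
)

# Month numbers mapped to readable axis labels
month_labels = {6: "Jun", 7: "Jul", 8: "Aug"}

# Order beaches from worst to best by their average across all months
order = monthly_avg.mean(axis=1).sort_values(ascending=False).index
by_overall = monthly_avg.loc[order].rename(columns=month_labels)

# Or order beaches by a single month, e.g. the worst July first
by_july = monthly_avg.sort_values(by=7, ascending=False).rename(columns=month_labels)

print(by_overall)
print(by_july)
```

Each sorted table could feed its own small heatmap, with a line of text above saying how it is ordered.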
Douglaston Homeowners Association:
Headline: These NYC Beaches are Literally Full of Human Shit
Alternate headline: When Fecal Bacteria Colonize New York City's Beach Water
Published website version: https://angelareplica.github.io/ds-nyc-beaches/
Code repository: https://github.com/angelareplica/data-studio/tree/master/code/03-nyc-shit-beaches
Final data set(s): https://github.com/angelareplica/data-studio/tree/master/code/03-nyc-shit-beaches
Finding things to focus on, and interpreting the data.
I'm satisfied, but my heatmap needs a lot of work. I'm still trying to wrap my mind around Seaborn, but I might take another stab at it later on. I pitched this somewhere, but before I cleaned up my graphs and did more analysis... so we'll see how that goes.
Pitch
Summary
NYC's Department of Health & Mental Hygiene makes regular updates to its data set on beach water quality. In particular, they test their samples for enterococci, AKA indicators of the presence of human waste/fecal material (and possible disease-causing bacteria as a result). I'd like to know which nasty-ass NYC beaches to avoid this summer (and also the rest of my life).
Details
Possible headline(s): Which Shit-Encrusted NYC Beach Should You Swim at This Summer? / Staten Island's Beaches Are Literally Filled With Human Shit (Sorry)
Data set(s): https://data.cityofnewyork.us/Health/DOHMH-Beach-Water-Quality-Data/2xir-kwzz
Code repository: https://github.com/angelareplica/data-studio/tree/master/code/03-nyc-shit-beaches
Possible problems/fears/questions: Finding the best way to visualize and convey the data.
Work so far
After acquainting myself with the data set, I looked up EPA recommendations for marine water recreation, as well as the NY state and city health/sanitation codes.
Made a chart of all the 2018 samples that have fecal bacteria counts exceeding the New York State Sanitary Code and the NYC Health Code for marine water. (This looks terrible right now, but I'll clean this up and annotate in Illustrator for my next revision.)
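A minimal sketch of the filtering step behind a chart like this. The column names and sample values are made up, and `SINGLE_SAMPLE_LIMIT` is a placeholder assumption that should be checked against the state and city codes rather than taken from here.

```python
import pandas as pd

# Placeholder single-sample limit (an assumption, verify against the codes)
SINGLE_SAMPLE_LIMIT = 104

# Hypothetical samples (made-up values, assumed column names)
samples = pd.DataFrame({
    "beach_name": ["Coney Island", "Coney Island", "Rockaway Beach",
                   "Rockaway Beach", "Coney Island"],
    "sample_date": pd.to_datetime(["2018-06-04", "2018-07-02", "2018-07-10",
                                   "2017-07-11", "2018-08-01"]),
    "enterococci": [10, 150, 120, 300, 200],
})

# Keep only 2018 samples whose count is above the limit
in_2018 = samples["sample_date"].dt.year == 2018
over_limit = samples["enterococci"] > SINGLE_SAMPLE_LIMIT
exceedances = samples[in_2018 & over_limit]

# Count exceedances per beach, most frequent first
counts = exceedances.groupby("beach_name").size().sort_values(ascending=False)

print(counts)
```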
I also wanted to look at some of NYC's most popular beaches, like Coney Island and the Rockaways. I made a rudimentary bar chart showing the average counts for samples collected in 2018. Trying to figure out a better way to visualize this. (Averages don't seem ideal, since samples can differ drastically -- and a regular bar chart looked bad, since a number of samples had counts of 0 -- or below the detection limit.) Will also be cleaning this one up in Illustrator for my next revision.
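One possible way to see how much a few large samples pull the average is to put the mean next to the median and the maximum. This is only a sketch: the column names and values are made up, and coding below-detection-limit results as 0 is an assumption about how the data is recorded.

```python
import pandas as pd

# Hypothetical 2018 samples; 0 stands in for "below detection limit" (assumption)
samples = pd.DataFrame({
    "beach_name": ["Coney Island"] * 4 + ["Rockaway Beach"] * 3,
    "enterococci": [0, 4, 10, 200, 0, 0, 8],
})

# Per-beach sample count, mean, median and maximum in one table
summary = samples.groupby("beach_name")["enterococci"].agg(
    ["count", "mean", "median", "max"]
)

print(summary)
```

In this made-up data, a single sample of 200 lifts the Coney Island mean far above its median, which is the kind of distortion described above.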
Checklist
This checklist must be completed before you submit your draft.