Open tsp2123 opened 6 years ago
Hello! I'm a little robot, let's take a peek.
Please post your first revision! It should be posted by Thursday at midnight. More details available here.
You need some feedback, let me summon @SimoneLuc, @SiruiZhu, @jessimckenzi for you
It looks like we need to fix up your pitch a little bit! Edit it by clicking the pencil in the top right-hand corner. It requires:
Interesting data! I'm unclear what's being counted here, but 13 violations doesn't seem like that many more than 5, without context. As for the styles, the colors are really aggressive, and it appears as though you're trying to draw attention to the red ones, even though it doesn't look like those are special in any way. And what's the 3 in the second graph? These are (apparently) very small numbers you're working with—why are they notable?
I like this idea!! My thoughts would be:
Sorry about the late post. Here are some updates.
I realized my initial analysis wasn't very in-depth, and I'm still in an exploratory phase. My next step is to look deeper into the biggest violators by company.
I am having an initial problem when I want to make a bar graph of violations by year. For whatever reason, the graph decides to sort itself whereas I need the X axis to be chronological
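A minimal sketch of one way to get chronological order, assuming each row of the dataset has a year value (the sample list and the function name below are made up for illustration, not taken from the actual data). The usual cause of this kind of "self-sorting" is that counting orders results by frequency; in pandas, for example, `value_counts()` sorts by count by default, and adding `.sort_index()` puts the years back in order. The same idea in plain Python:

```python
from collections import Counter

# Hypothetical sample: one entry per violation record, holding its year.
violation_years = [2016, 2014, 2016, 2015, 2016, 2014, 2013]


def violations_by_year(years):
    """Count records per year and return (year, count) pairs in chronological order."""
    # int() so that "2016" (text) and 2016 (number) count as the same year.
    counts = Counter(int(y) for y in years)
    # Sorting the items orders them by year (the key), not by count.
    return sorted(counts.items())


by_year = violations_by_year(violation_years)
for year, count in by_year:
    print(year, count)
```

If the years were read in as text, converting them to integers first (as above) also keeps a plotting tool from treating them as unordered categories.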
Here are some revisions to my old graphs.
I've changed some of the aesthetics of this graph. I'm still getting the hang of Illustrator and trying to fit things into an artboard template.
I've been trying to find ways to dig deeper into the companies, but I'm not confident in the dataset's collection. For example, the info links don't link to much more information than what's already present in the dataset, and there are no links to any PACER case files. So while you can tell the number of violations and the addresses at which these violations happened, you can't tell much else about them : (
The other question, which I've reached out about for an explanation and which is related to the lack of data clarity: I'm not sure how Wage and Hour violations are different from Labour Violations, and whether that's even a worthy distinction for my audience. If I consider them together, it would change my dataset greatly, and I'm not sure whether that's worth doing.
if you have a chart that has a scale you don't also need numbers on it. and watch out for repeating something, 'violations' should be a label once, not repeated.
i find this dataset problematic. partly because it's from a watchdog group, and that makes me wonder what is their agenda. but partly because the numbers are so small.
as for the fact that certain companies have more violations than others: some are bigger than others and handle more clients.
these data are not strong enough to make charts out of. to be honest, i would scrap the data entirely. once you harden something into a chart it makes it appear to be fact, and in this case it's misleading. your instinct to not be confident about the numbers, go with it.
I was confused about the graph 'Healthcare violations by year'. The subtitle mentioned that they have been increasing every year; however, we can see some periods where they started to drop, especially after 2015. I'm glad you changed that subtitle, but I encourage you to use annotations to explain what was happening in those periods. Yes, that will take more research, but it should be quite revealing, since your data is not that broad (specify the source in the graph). It is great that you decided to use just one color for the cumulative violations; the first version was quite confusing. However, I do not think that the bar graph is the best one to use here. Maybe pie? I know Soma does not like them, but for this topic, I think they will communicate faster and more precisely. To compare companies, it would be nice to have more information about their available resources and size. This way we can compare more accurately.
I would love to see what a "False Claims Act" actually is.
On one of your final charts the term "violations" is mentioned 12 times. I would use it just in the headline and as an axis-label. Instead of rotating the years (x-axis), maybe it would make sense to have fewer years on there?
Also, since we're talking about 140 violations in 2016, my immediate questions are:
Meaning: I would need the numbers to be explained.
Here are my updated visuals! And here's the link to what I put in HTML. (For whatever reason these aren't uploading, but hopefully the hyperlink works and you can view them there.)
Headline: Don't Put Grandma in a Nursing Home
Published website version:
https://tsp2123.github.io/projects.io/Project/html%20project/index.html
Code repository: https://github.com/tsp2123/data-studio/tree/master/Project_2
Final data set(s):
This dataset was difficult to produce a data story out of. I think it's one of those sets that is useful for exploratory analysis: for example, the fact that there is a shit ton of issues with nursing homes seems to be replicated in this dataset and reflects the more anecdotal news of nursing homes being rife with violations. But the dataset doesn't take into consideration other important factors about the businesses involved, such as the size of the business, its market share, etc. So, as Sarah pointed out, it isn't conducive to the best results and may lead to conclusions that aren't entirely reflective of the situation.
I'm not too keen on this project. Again, it's really just a test project for me querying datasets; other than that, this isn't really for publication. It's simply a test.
Hey! Congratulations on completing your project! I took a look at your website and I would suggest adjusting the scale a little bit. I figured that you are using the size A for three of your graphs, and I feel like this might not be the best size for your charts, since I can't read the labelling very clearly. The font size should also be standardized to create a coherent look.
A minor suggestion on your first graph: maybe you can make your x-axis tick size smaller :)
Pitch
Summary
I found a dataset of industry violations tracked by the watchdog group Good Jobs First. I'm looking at the healthcare industry to ask the following questions:
Which company has the most violations?
In which years did these violations occur?
Do some US states have more violations than others?
What was the primary reason for these violations?
Details
Possible headline(s):
Data set(s): https://violationtracker.goodjobsfirst.org/prog.php?major_industry_sum=healthcare+services
Code repository:
Possible problems/fears/questions:
Work so far
Here we can see that certain companies have far more violations than others:
The top two seem to be Kaiser Permanente and American Medical Response. I'm still trying to work with my data, but the following graph shows my attempt to count how many violations there were per year. It's not working out right now, with the dataset sorting itself for whatever reason.
Checklist
This checklist must be completed before you submit your draft.