California-Data-Collaborative / OWRS-Analysis

Analysis of water rates collected in the OWRS format.

initial plots #11

Closed victorsette closed 6 years ago

patwater commented 6 years ago

@victorsette where can one see the plots?

victorsette commented 6 years ago

Apparently github doesn't render R notebooks... So I added a "plots" folder to the repo and saved the plots there for now: https://github.com/California-Data-Collaborative/OWRS-Analysis/tree/rate_vs_efficiency/plots

christophertull commented 6 years ago

Looking good so far. One tweak that would be good, though: I would probably just pick either a particular month (e.g. July) or average over the months, rather than looking at the whole 12-month time span.

patwater commented 6 years ago

@victorsette what are the colors? Utility names?

On the percent fixed what is that looking at? Is percent fixed from the Survey Monkey? (The x axis title is a bit confusing)

Also are we updating this chart?

image

CC the water rate brain trust @datwater @jcruz: https://github.com/California-Data-Collaborative/OWRS-Analysis/tree/rate_vs_efficiency/plots

christophertull commented 6 years ago

Also, it would be useful to pull the code that generates each plot into its own function, as can be seen in the plots.R file.

christophertull commented 6 years ago

Also probably helpful to overlay a simple regression line on the scatter plots after reducing to monthly average.
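A minimal sketch of that reduction, written in Python rather than the repo's R and using made-up numbers: collapse each utility's monthly values to a single mean point, then fit an ordinary least-squares line through those points. The record layout and the names `monthly_average` and `fit_line` are assumptions for illustration, not code from plots.R.

```python
from collections import defaultdict

def monthly_average(records):
    """Collapse (utility, month, rate, efficiency) rows to one mean point per utility."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for utility, _month, rate, efficiency in records:
        acc = sums[utility]
        acc[0] += rate
        acc[1] += efficiency
        acc[2] += 1
    return {u: (r / n, e / n) for u, (r, e, n) in sums.items()}

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up example values, not data from the repo.
records = [
    ("A", 1, 2.0, 10.0), ("A", 2, 4.0, 14.0),
    ("B", 1, 5.0, 16.0), ("B", 2, 7.0, 20.0),
    ("C", 1, 9.0, 24.0), ("C", 2, 11.0, 28.0),
]
averages = monthly_average(records)
slope, intercept = fit_line(list(averages.values()))
```

The fitted `slope` and `intercept` describe the line to overlay on the scatter plot of per-utility averages.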

victorsette commented 6 years ago

No problem @christophertull, I will average it over the 12 months for now then. At first, I was just thinking of having a plot where we could easily visualize both time series (of efficiency and rates) for any given agency (will still do it if I can). And yes, I will also add them to the plots.R file once they're more definitive.

Yes @patwater, the colors are supposed to be different utility names. But I think we might be running out of colors and having to repeat them for different utilities. For the percent fixed, what it represents is the percentage of a bill that is due to a <service(meter)_charge> and not from the [...] or other fees. And that was calculated for a [...] of 15 CCF for all utilities and all months.
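As a rough illustration of the percent-fixed idea, here is a small Python sketch (the repo itself uses R). It treats the bill as the sum of a fixed service charge, a usage-based charge, and other fees; the function name, the parameter names, and the dollar amounts are all hypothetical and do not come from the OWRS files.

```python
def percent_fixed(service_charge, usage_charge, other_fees=0.0):
    """Share of the total bill (in percent) that comes from the fixed service charge."""
    total = service_charge + usage_charge + other_fees
    if total == 0:
        raise ValueError("total bill is zero, percent fixed is undefined")
    return 100.0 * service_charge / total

# Hypothetical bill components at 15 CCF, for illustration only.
example = percent_fixed(service_charge=30.0, usage_charge=60.0, other_fees=10.0)
```

With these made-up inputs the fixed service charge is 30 of a 100 dollar bill, so `example` is 30 percent.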

For the plot you pasted above, @christophertull had already done a similar one... I can make some changes to it and format it like that to have an updated version, if you guys think that's a good one to have. I'll also add the image to the plots folder then.

Should I add all the previous plots to the plots folder @christophertull ?

patwater commented 6 years ago

@victorsette this image still looks the same?

image

victorsette commented 6 years ago

Yes @patwater , it's still about the same... image

Are you wondering which are the outliers?

I didn't look through them yet to see if there are really no mistakes, but these are the utilities with total bill above $200 for 15 CCF for now:

| utility_name | effective_date | bill |
| --- | --- | --- |
| Glenbrook Water Cooperative | 2016-1-1 | 1400.00 |
| City of Tehama | 2017-07-01 | 525.00 |
| Valley Estates Properties Owners Association | 2017-01-01 | 440.00 |
| Westhaven Community Services District | 2017-07-01 | 260.45 |
| Lamont Public Utility District | 2017-01-01 | 257.86 |
| California Water Service Company Kern River Valley | 2017-01-01 | 257.40 |
| Burlingame City Of | 2017-01-01 | 200.68 |

patwater commented 6 years ago

Yeah, I don't see why that plot is a time series. What's "report referenced date"? Is that the effective date of the rate structure? Also, generally the graph is a bit hard to parse with a ton of colored lines.

Wow also what's up with Glenbrook? Hmmm note their charge is billed annually: https://github.com/California-Data-Collaborative/Open-Water-Rate-Specification/blob/master/full_utility_rates/California/Glenbrook%20Water%20Cooperative%20-%203299/1-1-2016.owrs

Seems that's not getting captured here @christophertull
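One possible way to handle this, sketched in Python rather than the repo's R, is to convert every bill to a monthly equivalent before comparing utilities. The frequency labels, the `PERIODS_PER_YEAR` table, and the function name are assumptions for illustration, and the sketch further assumes the whole $1400 figure is an annual amount. It is not the fix tracked in the follow-up issue.

```python
# Hypothetical mapping from a billing-frequency label to bills per year.
PERIODS_PER_YEAR = {"monthly": 12, "bimonthly": 6, "quarterly": 4, "annually": 1}

def monthly_equivalent(bill, bill_frequency):
    """Scale a per-period bill to the amount it represents per month."""
    return bill * PERIODS_PER_YEAR[bill_frequency] / 12

# Glenbrook's 1400.00 from the table above, treated as an annual bill.
glenbrook_monthly = monthly_equivalent(1400.00, "annually")
```

Under that assumption the Glenbrook figure drops to roughly a twelfth of 1400, which would take it out of the list of bills above $200.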

christophertull commented 6 years ago

Good catch! I opened an issue to fix this (#12)