California-Data-Collaborative / CA-Stormwater-Data-Challenge

Visualize potential dry-weather runoff contributing areas to identify prioritization for areas to target

Conveying Water Use Data #5

Open amandaaprahamian opened 7 years ago

amandaaprahamian commented 7 years ago

Great job so far on the Urban Drool Tool!

Attached is a rough mock-up of what the layout of the Urban Drool Tool could look like. However the layout turns out, the basic data to convey is (1) spatially: where the census block boundaries, outfalls, and outfall drainage areas are and (2) with text: what is going on (water use, competitions, savings, and outfall info) in the census block the user is in.

[Attached image: urbandrooltool]

amandaaprahamian commented 7 years ago

How do we want to use the water use data? It might be difficult for someone to find meaning in a large water use number if their census block is large. Maybe we could do a per-capita water use number? This number would be added to the sidebar that populates with text on the Urban Drool Tool.
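As a rough illustration of the per-capita idea, here is a minimal sketch that divides a block's total use by its population. The block names, numbers, and units are hypothetical and not taken from the MNWD data.

```python
def per_capita_use(total_use, population):
    """Return water use per resident, or None when the block has no residents."""
    if population <= 0:
        return None
    return total_use / population

# Hypothetical census blocks: (total monthly use, residents). Values are made up.
blocks = {
    "block_A": (12000.0, 1500),
    "block_B": (3000.0, 200),
}

per_capita = {name: per_capita_use(use, pop) for name, (use, pop) in blocks.items()}
print(per_capita)
```

In this made-up case block_A uses four times as much water in total but less per resident, which is the kind of difference a raw total would hide.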

patwater commented 7 years ago

Per capita is a great idea!

leighphan commented 7 years ago

@amandaaprahamian 👍 to a per-capita water use number. Would anyone at CADC or MNWD be able to provide the data so it's readily available for input for CodeLab?

monobina commented 7 years ago

@leighphan @amandaaprahamian we were thinking it might be good to associate inefficient usage with more runoff. Inefficient usage would be Water Usage > Water Budget. I agree that it would make more sense to have per capita usage and kind of compare it with the per capita water budget maybe?

monobina commented 7 years ago

I think the water usage in the data is currently total for the census block. Not sure. @patwater will you be able to confirm what's the current unit of the usage data - total for census block or per capita?

christophertull commented 7 years ago

Pretty sure the usage is total use. But if the numbers you have are in the single or double digits then you probably have per capita use.

Also the question of what to display is a good one. Will probably need to explore a bit to see if anything correlates with runoff.

datwater commented 7 years ago

Total water use might be misleading as runoff is caused by over-irrigation. For example, a household might have a large lot and irrigate a lot more than someone with a small lot, but the small lot might be applying more than what is needed (hence the small lot is causing runoff). Per capita will also bias towards less dense areas. The key for identifying waste/dry year runoff is over-irrigation, which requires both usage and comparing that to the water budget. See here for an explanation of how it's calculated: https://www.mnwd.com/understandingwaterbudget/

The two metrics I think make sense are: 1) total usage over budget summed up across all customers in a census block, and 2) weighted average percentage of budget, i.e. (usage_1^2/waterbudget_1 + usage_2^2/waterbudget_2) / (usage_1 + usage_2)

The first shows the quantity that is potentially running off. The 2nd shows the average efficiency. For the 2nd, it would probably also help to understand the variance/distribution and add that if possible.
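The two metrics above could be sketched roughly as follows. This is an illustration under assumptions, not the MNWD implementation: accounts are hypothetical (usage, budget) pairs, budgets are assumed to be positive, and accounts under budget are assumed to contribute zero to the first metric.

```python
def total_over_budget(accounts):
    """Metric 1: usage above budget, summed over accounts.

    Assumption: an account under its budget contributes 0, not a negative amount.
    """
    return sum(max(usage - budget, 0.0) for usage, budget in accounts)

def weighted_avg_efficiency(accounts):
    """Metric 2: usage-weighted average of usage/budget (budgets assumed > 0)."""
    total_usage = sum(usage for usage, _ in accounts)
    if total_usage == 0:
        return None
    return sum(usage * usage / budget for usage, budget in accounts) / total_usage

# Hypothetical block: a large lot under budget and a small lot over budget.
accounts = [(40.0, 50.0), (15.0, 10.0)]

print(total_over_budget(accounts))
print(weighted_avg_efficiency(accounts))
```

Here the large lot uses more water overall, but only the small lot adds to the over-budget total, which matches the large-lot versus small-lot example.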

monobina commented 7 years ago

I agree with @datwater on the metrics for measuring inefficiency. Per capita might be biased and total usage might be due to indoor use also. MNWD can help with setting up these metrics in the data, or maybe CADC @patwater @christophertull can build these metrics with their cleaned up data?

patwater commented 7 years ago

We're highly booked at the moment so it would be best if it came from MNWD :)

@datwater what's the logic with raising usage to the second power in your weighted average, out of curiosity?

monobina commented 7 years ago

Sure, we can help with the data. Might have some questions for you in the process.

datwater commented 7 years ago

It's the formula to weight by usage: usage_1/waterbudget_1 is the efficiency of customer 1. If I want to take the weighted average efficiency, I multiply each customer's efficiency by their usage and divide by the total usage.
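Written out, the squared term falls out of weighting each efficiency by usage. The symbols below are shorthand introduced here for illustration: u_i is usage_i, b_i is waterbudget_i, and e_i is the efficiency of customer i.

```latex
e_i = \frac{u_i}{b_i},
\qquad
\bar{e}
= \frac{\sum_i u_i \, e_i}{\sum_i u_i}
= \frac{\sum_i u_i \cdot \dfrac{u_i}{b_i}}{\sum_i u_i}
= \frac{\sum_i \dfrac{u_i^2}{b_i}}{\sum_i u_i}
```

For two customers this reduces to (usage_1^2/waterbudget_1 + usage_2^2/waterbudget_2) / (usage_1 + usage_2).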

datwater commented 7 years ago

What we need from CaDC is the cleaned data linked to census block groups. Is there a way you can give Monobina access to the MNWD data that you have cleaned? She can develop the formula to apply; the key thing we need is the census block for each account.

patwater commented 7 years ago

@christophertull thoughts? Correct me if I'm wrong, though that should be a simple export.

christophertull commented 7 years ago

Yeah should be simple. Do you want customer data with census block? Or data pre-aggregated to census blocks?

monobina commented 7 years ago

I think customer data with census blocks should work fine. I can then aggregate it how Drew suggested above.
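A minimal sketch of that aggregation step, assuming account-level rows of (census_block, usage, budget). The block IDs and values are hypothetical, budgets are assumed positive, and the actual export columns may differ.

```python
from collections import defaultdict

# Hypothetical account-level rows: (census_block, usage, budget).
rows = [
    ("0601", 40.0, 50.0),
    ("0601", 15.0, 10.0),
    ("0602", 20.0, 20.0),
]

def aggregate_by_block(rows):
    """Group accounts by census block and compute both metrics per block."""
    sums = defaultdict(lambda: {"over": 0.0, "weighted": 0.0, "usage": 0.0})
    for block, usage, budget in rows:
        s = sums[block]
        s["over"] += max(usage - budget, 0.0)    # usage above budget only
        s["weighted"] += usage * usage / budget  # usage * (usage / budget)
        s["usage"] += usage
    return {
        block: {
            "total_over_budget": s["over"],
            "weighted_avg_efficiency": s["weighted"] / s["usage"] if s["usage"] else None,
        }
        for block, s in sums.items()
    }

result = aggregate_by_block(rows)
print(result)
```

Keeping running sums per block means the account-level rows only need one pass; adding year and month to the grouping key would give a monthly breakdown.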

monobina commented 7 years ago

@leighphan @amandaaprahamian we have the weighted average efficiency and total usage over budget by census block in the data now. Where do I share the data with you? Should I upload it in this repo?

leighphan commented 7 years ago

@monobina @amandaaprahamian Sure! Just curious, what is the time-frame for this data?

Could you upload the data to this original repo? Thanks!

monobina commented 7 years ago

@leighphan the data starts from Jan 2011 and runs up to Aug 2016. Looks like I do not have push access to this repo. @christophertull @patwater can you please set me up with push access to this repo so that I can upload the data? Thanks!

christophertull commented 7 years ago

@monobina done!

monobina commented 7 years ago

Thanks @christophertull !

monobina commented 7 years ago

@leighphan @amandaaprahamian I have uploaded the data to the Data folder in this repo. There are two CSV files: TotalInefficiency and WeightedAverageEfficiency. TotalInefficiency provides the usage over budget (i.e. inefficient usage) by census block, year, and month. WeightedAverageEfficiency provides the average efficiency by census block. It is calculated using the formula below, which gives the usage-weighted average efficiency in a census block: (usage_1^2/waterbudget_1 + usage_2^2/waterbudget_2) / (usage_1 + usage_2)

monobina commented 7 years ago

It would be great to have all your thoughts on Issue #6 "Using Inefficiency Data".

patwater commented 7 years ago

Added some thoughts in that Issue :)