Same data, very different Gross Savings

mcwp commented 8 years ago

Using upload_dataset.py on Feb 21, I loaded fake data provided by Phil and ran the 0.2.13 version of the meter, resulting in this Gross Savings graph. I saved a copy of the database.

[image: feb22]

If I restore that copy of the database and run the 0.3.11 meter, the client Gross Savings graph looks very wrong to me. Is this expected?

[image: apr5]

If needed, I could probably create a smaller fake dataset and recreate this scenario, but I don't want to do that work unless it is needed. Note that while the graphs show only electricity, the consumption data does include natural gas values for the same years as the electricity values.

philngo commented 8 years ago

There's one bit in this graph that indicates that at least something is staying the same: it looks like the pre-2013 savings values are roughly the same, but squashed down to the axis in the first graph by the presence of higher values. That makes me wonder if maybe the issue is on oeem-client, and not the meter. Clearly the post-2012 part of the graph is different. If you still have both versions available, would you mind directly comparing the raw values stored in the MeterRun model on the datastore and posting those values here? If they are the same, we should move this issue over to oeem-client.

mcwp commented 8 years ago

Yes, I can easily reproduce (I have the saved db, so I just restore it, look at the values, run the meter again, and look at the values again) and may have time to do so today, else Friday.

mcwp commented 8 years ago

For 10 projects, I printed sx.updated.date(), sx.annual_usage_baseline, sx.annual_usage_reporting, sx.annual_savings for both runs, and the values are nearly identical (they agree to within about 0.003%). For example:

Project a949bf61-0edb-47a7-8fa4-7d71b1f3b0e2
2016-04-08 8218.41438064 4931.25468545 3287.15969519
2016-04-08 254.887659021 180.070900524 74.8167584967
2016-02-22 8218.3901794 4931.24693277 3287.14324663
2016-02-22 254.885805665 180.070833852 74.814971813

This may confirm your theory, but are there other fields you'd like to see?
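A comparison like this can also be done with a tolerance instead of by eye. The following is a minimal sketch, not part of eemeter or the datastore: the numbers are copied from the project above, the pairing of rows across the two dates is assumed from their order, and the helper names and the 1e-4 tolerance are hypothetical choices.

```python
import math

# Rows copied from the comment above:
# (updated date, annual_usage_baseline, annual_usage_reporting, annual_savings).
# Assumption: the first row of each date pairs with the first row of the other.
RUN_APR = [
    ("2016-04-08", 8218.41438064, 4931.25468545, 3287.15969519),
    ("2016-04-08", 254.887659021, 180.070900524, 74.8167584967),
]
RUN_FEB = [
    ("2016-02-22", 8218.3901794, 4931.24693277, 3287.14324663),
    ("2016-02-22", 254.885805665, 180.070833852, 74.814971813),
]


def value_pairs(new_rows, old_rows):
    """Yield matching numeric values from the two runs, skipping the date."""
    for new, old in zip(new_rows, old_rows):
        for a, b in zip(new[1:], old[1:]):
            yield a, b


def max_relative_difference(new_rows, old_rows):
    """Largest relative difference between any pair of matching values."""
    return max(
        abs(a - b) / max(abs(a), abs(b))
        for a, b in value_pairs(new_rows, old_rows)
    )


def runs_match(new_rows, old_rows, rel_tol=1e-4):
    """True if every pair of values agrees within the relative tolerance."""
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in value_pairs(new_rows, old_rows)
    )


if __name__ == "__main__":
    print("max relative difference:", max_relative_difference(RUN_APR, RUN_FEB))
    print("match at 1e-4:", runs_match(RUN_APR, RUN_FEB))
```

A relative tolerance is used because the two rows differ in magnitude by more than an order of magnitude, so a single absolute threshold would be too loose for one row or too strict for the other.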

philngo commented 8 years ago

That definitely seems to suggest the problem is downstream somewhere. Given that those fields are identical (or very nearly so), it's very likely that the origin of this issue is on the datastore or the client rather than in the eemeter package - I'll move it over there when we've figured out the origin of the error. Here are a few places we can look for the origin of the inconsistency:

Potential problems in the datastore: maybe the datastore

- isn't returning all relevant meter run monthly summaries properly on a call like /api/v1/projects/?with_monthly_summaries=True
- is calculating the monthly summaries incorrectly

Potential problems in the client: maybe the client

- isn't asking for the right monthly summaries
- isn't correctly aggregating monthly summaries

@mcwp A good place to start tracking down the error might be an API call (with the header Authorization: Bearer YOURTOKEN) to the endpoint /api/v1/projects/?with_monthly_summaries=True on the datastore. That will give us a hint about whether the data is corrupted in the datastore or not.
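For reference, here is a rough sketch of that call using only the Python standard library. The endpoint path and the Authorization header come from the comment above. The base URL, the function names, and the assumption that the response body is JSON are hypothetical, so adjust them to your datastore deployment.

```python
import json
import urllib.request

# Endpoint path as given in the comment above.
SUMMARIES_PATH = "/api/v1/projects/?with_monthly_summaries=True"


def build_request(base_url, token):
    """Build a GET request carrying the bearer token (nothing is sent yet)."""
    url = base_url.rstrip("/") + SUMMARIES_PATH
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + token}
    )


def fetch_projects(base_url, token):
    """Send the request and parse the body, assuming the datastore returns JSON."""
    with urllib.request.urlopen(build_request(base_url, token)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host and placeholder token; replace both before running.
    data = fetch_projects("https://datastore.example.com", "YOURTOKEN")
    print(json.dumps(data, indent=2)[:2000])
```

Dumping the raw response for the restored database before and after rerunning the meter would show whether the monthly summaries already differ at the datastore, before the client aggregates anything.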

mcwp commented 8 years ago

I suggest this bug be closed as not a bug because it is not characterized adequately, and I cannot reproduce it. If you look at the scale of the two graphs, they are clearly not the same data. One tops out at less than 150k and the other at 800k. I believe the 800k graph has retail store data in it, and when I exclude that data, then both graphs (with only the fake-200 projects) look the same, with the red.

I do have an older snapshot of a correct-looking graph with the axis topping out at 150k, but I do not have that database anymore, and now that I've looked at the raw data more closely, I'm reluctant to spend the time reproducing it. Here is the fake consumption data, but note that there are many fewer than 200 projects in the early years.

[attachment: fake-consumption]

I think I need to start over with a working example that is closer to what I need to do with the three retail stores' actual meter data.

impactlab / eemeter

Same data, very different Gross Savings #122