catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Allow dollar values to be adjusted for inflation on export #139

Open zaneselvans opened 6 years ago

zaneselvans commented 6 years ago

We have decades' worth of data, all reported in nominal dollars, which isn't very useful for inter-year cost comparisons. The Fuels and Utilities consumer price index is available from the St. Louis Fed (FRED), including via their API, on a monthly basis as far back as 1952.

See here: https://fred.stlouisfed.org/series/CUUR0000SAH2

We should at least have a generalized function that allows you to specify what month or year's dollars you want the costs expressed in, and transform the cost columns in a dataframe to be in terms of that year. We may want to consider transforming all historical data to present day dollars on load. Might want to integrate pulling the FRED data into our datastore, if we will be dependent on it for future analyses.
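As a rough illustration of the generalized function described above, here is a minimal sketch. It assumes the price index has already been reduced to one level per year and that the dataframe has a `report_year` column. The function name, the column names, and the index values are all hypothetical and not part of PUDL.

```python
import pandas as pd


def adjust_for_inflation(df, cost_cols, price_index, base_year, year_col="report_year"):
    """Return a copy of df with cost_cols restated in base_year dollars.

    price_index is a Series of index levels keyed by year (a hypothetical
    input; it could be built from annual averages of a FRED series).
    """
    out = df.copy()
    # Ratio of the base-year index level to each row's own-year index level.
    factor = price_index.loc[base_year] / out[year_col].map(price_index)
    for col in cost_cols:
        out[col] = out[col] * factor
    return out


# Made-up index levels and costs, for illustration only.
index = pd.Series({2004: 100.0, 2005: 110.0, 2006: 125.0})
costs = pd.DataFrame(
    {"report_year": [2004, 2005, 2006], "fuel_cost": [50.0, 55.0, 50.0]}
)
real = adjust_for_inflation(costs, ["fuel_cost"], index, base_year=2006)
```

With these made-up numbers the 2004 and 2005 costs both come out to 62.5 in 2006 dollars, while the 2006 cost is unchanged. Monthly adjustment would work the same way with a month-keyed index.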

karldw commented 6 years ago

@zaneselvans, if you want unsolicited advice, it might be better to use the St. Louis Fed's general inflation rate. The utilities and fuel index you mentioned is what consumers pay, which will be some mix of overall inflation, oil market shifts, and changes in the prices utilities are allowed to charge their customers.

zaneselvans commented 6 years ago

Oooh, I think this might be the first unsolicited advice we've gotten on the project, so it's exciting! I wonder if maybe the Electric Power Generation Producer Price Index might be more appropriate?

https://fred.stlouisfed.org/series/PCU2211102211104

Though it only seems to go back to Dec 2003 -- but as of now we don't have any data series integrated further back than Jan 2004, so that would actually work fine. And maybe we splice it with the broad CPI for earlier years?
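One way the splice could work is simple chain-linking: rescale the broad index so it agrees with the sector index in the first period the sector index covers, then use the sector index from that point on. The sketch below assumes both series share a common period at the link point. The function name and all numbers are hypothetical.

```python
import pandas as pd


def splice_indices(early, late):
    """Chain-link two price indices at the first period covered by `late`.

    Values of `early` before the link period are rescaled so that the two
    series agree at the link period; `late` is used unchanged from there on.
    """
    link = late.index.min()
    if link not in early.index:
        raise ValueError("the two series must overlap at the link period")
    scale = late.loc[link] / early.loc[link]
    head = early[early.index < link] * scale
    return pd.concat([head, late]).sort_index()


# Made-up index levels, for illustration only.
broad = pd.Series({2001: 90.0, 2002: 95.0, 2003: 100.0, 2004: 102.0})
sector = pd.Series({2003: 200.0, 2004: 220.0})
spliced = splice_indices(broad, sector)
```

The spliced series keeps the broad index's year-over-year growth before the link period and the sector index's levels afterwards, so ratios between any two periods remain meaningful.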

karldw commented 6 years ago

Interesting! I need to think a little more about this.

karldw commented 6 years ago

Talking this over with people, we landed on "it depends".

Version A:

These [price indices] answer different questions: "what change in expenses is because of a change in prices?" vs. "what could the expenses have otherwise bought?"

Version B:

"What is the size of the industry (or firm or whatever) over time if input prices are held fixed?" vs. "what is the size of the industry over time if the size of the total economy is held fixed?" And it seems like it would be reasonable to use other indices, like the CPI, to answer other questions.

zaneselvans commented 6 years ago

Oh these are interesting questions, and I could see wanting to be able to use either one of them -- this makes me think that we should allow the inflation index to be provided on the fly, in the layer between the database and output, so the user or application developer can choose how to adjust things.

zaneselvans commented 5 years ago

We've generally decided to keep the PUDL data minimally altered and, as explored above, there's more than one way to adjust for inflation. So it will probably be better to (someday) let users select an inflation index of their choosing and apply it to the original data, which makes more sense to do at export (i.e. output) rather than at import.
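A sketch of what index selection at the output layer might look like, assuming the stored data stays nominal and the caller optionally passes in whichever index suits their question. `export_costs`, the column names, and the two example indices are hypothetical and not an existing PUDL interface.

```python
import pandas as pd


def export_costs(stored, cost_cols, price_index=None, base_year=None):
    """Hypothetical output-layer helper: nominal by default, real on request."""
    out = stored.copy()  # the stored (nominal) data is never modified
    if price_index is None:
        return out
    factor = price_index.loc[base_year] / out["report_year"].map(price_index)
    out[cost_cols] = out[cost_cols].mul(factor, axis=0)
    return out


# Made-up data: two candidate indices that answer different questions.
stored = pd.DataFrame({"report_year": [2004, 2006], "opex": [100.0, 120.0]})
indices = {
    "broad": pd.Series({2004: 100.0, 2006: 106.0}),
    "sector": pd.Series({2004: 100.0, 2006: 120.0}),
}
nominal = export_costs(stored, ["opex"])
by_index = {
    name: export_costs(stored, ["opex"], idx, base_year=2006)
    for name, idx in indices.items()
}
```

With these made-up numbers, the same nominal increase reads as real growth under the broad index but as no change under the sector index, which is the reason for leaving the choice to the user.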