bpbond closed this 2 years ago
In #454 @pkyle said:
Yeah I think there was a slight misunderstanding--in the old data system, I used the objects "model_base_years" and "model_future_years" to mean the years for GCAM, whereas the "historical_years" and "future_years" were just for processing data. Those were separated for a number of reasons (e.g., hindcasting runs, only running a small subset of the total available historical years, probably others too). The only reason we'd ever change the historical_years is if we got more data going further forward or back in history. I certainly never intended for people to come in and e.g. set historical_years to 1971:2006 and have that produce a similar or even functional dataset; I just never would have built it for the situation where we get less data over time, or where the max(historical_years) is anything less than 2010. Going forward, it's fine to maintain that capacity, but we will probably have to re-set a number of objects to a hard-wired 2010 year where the code currently tries to pull the latest historical year.
Page is exactly right. I think the sooner we fix this, the less painful it's going to be.
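To make the "hard-wired 2010" point concrete, here is a minimal sketch of the two patterns, not the data system's actual code. `FINAL_CALIBRATION_YEAR` and the `proxy_year_*` names are hypothetical, introduced only for illustration:

```r
# What a user might set if they truncate the processing years.
historical_years <- 1971:2006

# Pattern described in the quote: the proxy year follows whatever
# historical_years happens to end on, so it silently moves to 2006 here.
proxy_year_dynamic <- max(historical_years)

# Hard-wired alternative: the year stays fixed no matter how
# historical_years is set. FINAL_CALIBRATION_YEAR is a hypothetical name.
FINAL_CALIBRATION_YEAR <- 2010
proxy_year_fixed <- FINAL_CALIBRATION_YEAR

print(c(dynamic = proxy_year_dynamic, fixed = proxy_year_fixed))
```

With the dynamic form, any code downstream that expects 2010 data ends up looking at 2006 instead, without an error at the point where the year is chosen.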
Hmm. Confusing. Here is _common/assumptions/A_common_data.R:
#Historical years for data write-out
historical_years <- 1971:2010
Then aglu-data/assumptions/A_aglu_data.R:
AGLU_historical_years <- 1971:2010
And then in A_modeltime_data.R:
model_base_years <- c( 1975, 1990, 2005, 2010 )
model_future_years <- seq( 2015, 2100, 5 )
historical_years? I guess the "...for data write-out" comment made me assume that it was tied to model_base_years, but not at all?
I'm not sure why there is a separate AGLU_historical_years. Perhaps someone writing that section wanted to leave open the possibility that the AGLU raw data might have different historical years than the rest of the data? It deals with different data sets, so it's theoretically possible.
The "historical years for data write out" comment is clearly muddled, as we now know.
¯\_(ツ)_/¯
I think I'm also confused as to why switching historical years to 1971:2006 shouldn't work if the data is supposed to exist for 1971:2010. Maybe it isn't important.
The reason for AGLU_historical_years being different is that in an earlier version of the data system, a number of the AGLU databases only went up to 2009. The "commodity balances" databases (which used to be called Supply Utilization Accounts, which I've abbreviated SUA) tend to lag behind the PRODSTAT databases and others by 2-3 years. Right now, the AGLU historical years are the same as the general historical years, but in the future when we update our data, most datasets will tend to lag 2 years behind the present, whereas the SUA data lag by 4.
model_base_years and model_future_years are just the years that will be run in GCAM. Totally different from the historical years for which data are processed, but the model_base_years should/must consist of historical years. The reasons why setting the historical years to 1971:2006 might trip things up are that (1) the model_base_years have a year, 2010, that is a calibration year but for which there will be no historical data, which will blow things up, and (2) a number of places in the code assume that the most recent historical year(s) can be used as a proxy for calculating something, and re-setting the historical years to terminate earlier might point to a year where the necessary data aren't available.
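Failure mode (1) could be caught up front with a simple check. This is a sketch only, using the variable names from the assumption files quoted earlier in the thread. The helper `missing_base_years` is hypothetical and not part of the data system:

```r
# Hypothetical helper: returns the model base years that have no
# corresponding historical year, i.e. calibration years with no data.
missing_base_years <- function(model_base_years, historical_years) {
  setdiff(model_base_years, historical_years)
}

model_base_years <- c(1975, 1990, 2005, 2010)

# Default configuration: every base year is covered by historical data.
ok <- missing_base_years(model_base_years, 1971:2010)

# Truncated history: 2010 is still a base year but has no data behind it.
bad <- missing_base_years(model_base_years, 1971:2006)

if (length(bad) > 0) {
  message("Model base years without historical data: ",
          paste(bad, collapse = ", "))
}
```

A check like this only covers case (1). Case (2), where code uses the latest historical year as a proxy, would still need to be tracked down wherever it occurs.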
See e.g. #454