Closed bpbond closed 6 years ago
Hi @pralitp @d3y419 -
This and modeltime
are the two areas already done, and because they're small, they're handy test cases. I'm assigning (if I may) this to you guys as the DSR XML lead and socioeconomics lead, respectively.
We'd like to declare them officially done for Version 1: producing exactly the same XML files as the old data system. (Not exactly: we have made some changes, there's numerical precision issues, etc.) I'm hoping that Pralit has time to check the XML files, and work with Haewon to resolve any problems, for example if Russell or someone needs to go back and fix something. OK?
Mostly I've been working on a script to try to automate checks so at this point I may have bugs in there but thus far it has only flagged: bld_agg_GCAM3.xml not found in DSR bld_agg_gSSP3.xml not found in DSR bld_agg_gSSP5.xml not found in DSR socioeconomics_GCAM3.xml not found in DSR trn_agg_GCAM3.xml not found in DSR trn_agg_gSSP3.xml not found in DSR trn_agg_gSSP5.xml not found in DSR
@d3y419, you know of any reason those should be missing?
@pralitp, people wanted GCAM3 deprecated : https://github.com/JGCRI/gcamdata/issues/337
I've updated the list of files to reflect the final outputs of the old data system.
Unfortunately, we can't assume that file contents match on the basis of the old-new test because the old-new test will let fairly large differences through. For example:
At: [('scenario', ''), ('world', ''), ('region', 'EU-15'), ('GDP', '')]
Left:
laborproductivity
['year']
['2010']
0.0035
Right:
laborproductivity
['year']
['2010']
0.0034
That's a difference of almost 3%, which is too high, IMO. We may need to revisit our comparison criteria. We also need to have someone take a look at where these discrepancies are coming from. I can prepare a list of the discrepancies, but we'll need a volunteer to track down the discrepancies.
I will volunteer. I've been allocated a few hours for DSR and the only other task I have outstanding is getting the XML conversion working on Windows which at this point I know what to do (just means degraded performance).
Thanks.
Sorry for the delay; I had a telecon.
Here's the current list of comparison failures from this group: (ugh, that was longer than I expected. Let me attach it instead...) xmlcmp2.log
I'm also going to do some upgrades to the comparison tool to make these tests a little easier to work with.
With the upgrades to the comparison tool, some, but not all of these discrepancies go away.
Many of the remaining ones seem to be due to differences in output conventions. E.g., this one from trn_agg_gSSP1.xml
:
Old:
<income-elasticity year="2085">0.006</income-elasticity>
New:
<income-elasticity year="2085">0.00572147499629786</income-elasticity>
The old code clearly just rounded to three decimal places, while the new code outputs the raw value. Since "three decimal places" is one significant figure in this case, I'd say the old behavior is arguably wrong, and we should maybe prefer the new.
Whatever we decide, I can't think of a good way to detect situations like this automatically, so we will probably just have to have somebody track down each of these cases and verify that it's ok.
Yup, that is exactly what is going on.
In terms of it being wrong.. Maybe, or it is just arbitrary. This value was just some read in assumption value which was getting linearly interpolated over non-specified years. I don't think in this particular case the new (reporting full digits) is adding are real benefit as it is purely an assumption at the end of the day.
If we do ultimately decide we like the New better then we should wrap it in OLD_DATA_SYSTEM_BEHAVIOR
.
I don't have an opinion one way or the other. I think this is @d3y419's call.
Have we patched the code so that we now match the old output? If so, then we should be able to check off everything except the temp-data-inject item and the GCAM equivalence item.
And with that, I believe my work here is done. @d3y419, the rest of these validation steps are for you to take care of, or delegate as you see fit.
Yup, and in that last merge I also fixed the last of the temp-data-inject
so all that is left is verifying these XML run and reproduce in GCAM.
New data system socioeconomics xmls are identical to the old datasystem xmls.
run-xml-tests.py ../old-xml/socioeconomics-xml ../new-xml
34 files tested.
0 missing output files.
0 file discrepancies.
🎉
34 files are a lot more than the 3 in modeltime
so let's run GCAM with these please, just verifying it executes. And then DONE! @d3y419 you score--looks like nothing to do in socioeconomics
socioeconomics old-new xml test passes!
python3 run-xml-tests.py ../old-xml/socioeconomics-xml ../new-xml
34 files tested.
0 missing output files.
0 file discrepancies.
"Folks, welcome to the game here on WPXY. If you're just joining us, Kalyn Dorheim has gone 2 for 2 so far at the plate...she's on fire today, swatting 'em out of the park like it's child's play. Can't wait to see her next move."
🎉
GCAM 5 reference runs without error with the socioeconomic xmls from the new data system!!
@bpbond this issue is ready to be closed. :+1:
Spectacular. Thanks.
Verify that
[x] new data system produces exactly same
socioeconomics
XML files as old data system[x] structure of files matches
[x] no
temp-data-inject/
files still being used[x] contents of files matches ~(we assume true for data, based on
oldnew
test)~ verified with run-xml-tests.py[x] GCAM runs with these files substituted in