ecwood / GCAM-CDR-modeling

Documenting Progress for Understanding the GCAM Model So Far #4

Open ecwood opened 2 years ago

ecwood commented 2 years ago

From the documentation, what I can see is that you can edit the CSVs and update the R files so that the built XML files contain the new data. (I learned this from navigating from here to this video.) For example, I saw that A62.calibration.csv is an input to zchunk_LA162.dac.R. Next, by searching for specific values from that CSV, I found that zchunk_LA162.dac.R contributes to dac_ssp1.xml, dac_ssp2.xml, dac_ssp3.xml, dac_ssp4.xml, and dac_ssp5.xml.

A few outstanding questions I have from this are:

(1) How do multiple R files create the same XML file?
(2) Where is this systematically documented (i.e., which scripts create which files)?
(3) How do I compile this from the command line (due to AWS reasons) when multiple scripts are involved, preferably without having to rebuild the entire system?
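For anyone repeating the "search for specific values" step, here is a rough Python sketch of how it could be scripted. The function name `xml_files_containing`, the demo file contents, and the value `384.5` are all made up for illustration; nothing here comes from gcamdata itself.

```python
import tempfile
from pathlib import Path
from typing import List


def xml_files_containing(value: str, xml_dir: Path) -> List[str]:
    """Return the names of .xml files under xml_dir whose text contains value."""
    hits = []
    for path in sorted(xml_dir.rglob("*.xml")):
        # Plain substring search; no XML parsing is needed for this check.
        if value in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(path.name)
    return hits


# Demo on throwaway files (the contents are invented for illustration).
with tempfile.TemporaryDirectory() as tmp:
    xml_dir = Path(tmp)
    (xml_dir / "dac_ssp1.xml").write_text("<a><cost>384.5</cost></a>", encoding="utf-8")
    (xml_dir / "other.xml").write_text("<a><cost>1.0</cost></a>", encoding="utf-8")
    found = xml_files_containing("384.5", xml_dir)

print(found)
```

This only matches exact text, so a value that the R code rounds, rescales, or reformats on the way to XML will be missed. It is a quick way to find candidates, not a replacement for real documentation of which chunk writes which file.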

My next step is to watch this video: https://www.youtube.com/watch?v=S7vAShH-dbs, because it looks promising.

ecwood commented 2 years ago

From Fuhrman et al., 2021, I referenced the code here. It was not very helpful because while it included the XML files they used, it lacked the R scripts they used to create them (which is what I'd like to see for reference).

From Ou et al., 2021, I referenced the code here. This was challenging because I had to download it. I also struggled to tell whether the differences were due to a change in version, since my repository had more files than theirs, despite mine being unaltered.

I've also referenced the 2018 GCAM Tutorial and the 2019 GCAM Tutorial.

ecwood commented 2 years ago

One silver lining is that the Python package xmltodict can take a GCAM XML file as input and dump it back out as a GCAM-readable XML file. (The data goes through its own conversion within Python, so that round trip was what I was checking.) The tool is also very easy to use.
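A minimal sketch of that kind of round-trip check, for reference. The XML fragment is invented for illustration and is not copied from a real GCAM input file; `round_trip` is a hypothetical helper name. The only xmltodict calls used are `parse` and `unparse`.

```python
import xmltodict  # third-party: pip install xmltodict

# Made-up fragment for illustration; not a real GCAM input file.
SAMPLE_XML = """<scenario>
  <world>
    <region name="USA">
      <supplysector name="example-sector">
        <share-weight year="2020">1</share-weight>
      </supplysector>
    </region>
  </world>
</scenario>"""


def round_trip(xml_text: str) -> str:
    """Parse XML into nested dicts, then serialize it back to XML text."""
    data = xmltodict.parse(xml_text)
    return xmltodict.unparse(data, pretty=True)


rewritten = round_trip(SAMPLE_XML)

# The check: parsing the rewritten text should give the same structure.
same_structure = xmltodict.parse(rewritten) == xmltodict.parse(SAMPLE_XML)
print(same_structure)
```

One thing to watch for: xmltodict groups repeated sibling elements with the same tag into a list, so if differently named siblings are interleaved in the input, their relative order can change on the way back out. Comments are also dropped by default. Whether either matters for a given GCAM file is something I would check case by case.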

ecwood commented 2 years ago

[Side note from the video: the tax needs to be added on separately, because the model separates price and tax. Also, according to the video at about 1 hour and 52 minutes, the tax method and the constraint method are functionally equivalent, as also noted on slide 40 of the tutorial linked below.]

Following my investigation into the build cycle, I decided to rerun make xml based on slide 45 of the 2015 GCAM Tutorial. I got this error:

ubuntu@ip-172-31-57-164:~/gcam-core$ make xml 
cd input/gcamdata && Rscript -e "devtools::load_all('.')" -e "driver(write_output=FALSE, write_xml=TRUE)"
Loading gcamdata
GCAM Data System v5.1
Found 420 chunks
Found 4265 chunk data requirements
Found 2416 chunk data products
1452 chunk data input(s) not accounted for
Chunks left: 1872
Error in parse_csv_header(., fqfn, header) : 
  'File:' given in header (A62.calibration.csv) doesn't match filename in inst/extdata/energy/A62.calibration_test.csv
Calls: driver ... %>% -> add_flags -> add_comments -> parse_csv_header
Execution halted
make: *** [Makefile:5: xml] Error 1

ecwood commented 2 years ago

As a follow-up to my previous comment: once I fixed that error by editing the first line of A62.calibration_test.csv and reran make xml, dac_ssp[1-5].xml were recompiled with that change. Unfortunately, so were hundreds of other files, which is very time-consuming. My next step is to explore how to compile these without having to redo the full make xml (and without using a GUI-based tool, which isn't feasible on an AWS instance).
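Since a full make xml run is slow, it might be worth checking for this kind of header mismatch before kicking one off. Below is a rough Python sketch. It assumes the header line has the form `# File: <name>` at the top of the CSV (inferred from the error message above, not from the gcamdata source), and `header_file_name` and `header_matches` are hypothetical helper names.

```python
import tempfile
from pathlib import Path
from typing import Optional


def header_file_name(csv_path: Path) -> Optional[str]:
    """Return the name declared on a leading '# File:' comment line, if any."""
    with csv_path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # the leading comment header is over
            body = line.lstrip("#").strip()
            if body.startswith("File:"):
                # Drop trailing commas that spreadsheet exports may add.
                return body[len("File:"):].strip().rstrip(",").strip()
    return None


def header_matches(csv_path: Path) -> bool:
    """True when the declared 'File:' name equals the actual file name."""
    return header_file_name(csv_path) == csv_path.name


# Demo reproducing the situation from the error message on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "A62.calibration_test.csv"
    csv_path.write_text("# File: A62.calibration.csv\ncol_a,col_b\n1,2\n", encoding="utf-8")
    before_fix = header_matches(csv_path)

    # The fix described above: edit the first line to match the file name.
    csv_path.write_text("# File: A62.calibration_test.csv\ncol_a,col_b\n1,2\n", encoding="utf-8")
    after_fix = header_matches(csv_path)

print(before_fix, after_fix)
```

Running `header_matches` over every CSV under inst/extdata before a build would be one way to catch copied-and-renamed files early, assuming the header convention holds for all of them.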

ecwood commented 2 years ago

The documentation in this folder has been helpful; I found it by re-watching the first video. To speed up make xml, I am going to use their driver_drake() functionality (as documented here). My next step is to write a model modification script (based on one of their vignettes). I also plan to deploy changes to the setup script based on these findings.