iiasa / message-ix-models

Tools for the MESSAGEix-GLOBIOM family of models
https://docs.messageix.org/models
Apache License 2.0

Integrate R scripts #40

Open khaeru opened 2 years ago

khaeru commented 2 years ago

message_data contains some modules that are partly or entirely in R. For instance:

This issue is to discuss approaches so that this code (a) can be integrated into complete workflows that run unsupervised, and (b) can be more reusable, in a standard way. The implementation should be in message-ix-models.

Some ideas:

  1. Provide a CLI command like mix-models r-script foo/bar/baz that will simply invoke a file at (e.g.) message_data/foo/bar/baz.R, while providing some standard environment variables that the script can use to understand paths to data, etc.
  2. Use rpy2 (with documentation & demo code) in Python code to directly call functions from R code in particular files, and retrieve their output for further processing.
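
To make the two ideas concrete, here is a minimal sketch, not a proposed implementation. For idea 1 it only builds the `Rscript` command line and environment; the environment variable names `MIX_MODELS_PACKAGE_ROOT` and `MIX_MODELS_DATA_PATH` and all function names are hypothetical, invented for illustration. For idea 2 it assumes rpy2 and R are installed, and that the R file defines a function with the given name; rpy2 is imported lazily so the rest runs without it.

```python
import os
import subprocess
from pathlib import Path


def r_script_command(name, package_root, data_path):
    """Return (argv, env) to run the R script `name` found under `package_root`.

    "foo/bar/baz" is resolved to <package_root>/foo/bar/baz.R.
    """
    script = Path(package_root, name + ".R")
    env = dict(os.environ)
    # Hypothetical variable names that the R script could read with Sys.getenv()
    env["MIX_MODELS_PACKAGE_ROOT"] = str(Path(package_root))
    env["MIX_MODELS_DATA_PATH"] = str(Path(data_path))
    return ["Rscript", "--vanilla", str(script)], env


def run_r_script(name, package_root, data_path):
    """Idea 1: invoke the script as a subprocess."""
    argv, env = r_script_command(name, package_root, data_path)
    # check=True raises CalledProcessError on a non-zero exit status, so an
    # unsupervised workflow stops instead of continuing with missing data
    return subprocess.run(argv, env=env, check=True)


def call_r_function(path, function_name, *args, **kwargs):
    """Idea 2: source an R file with rpy2 and call one function from it."""
    import rpy2.robjects as ro  # lazy import: needs rpy2 and an R installation

    ro.r["source"](str(path))  # evaluate the file in R's global environment
    func = ro.globalenv[function_name]  # look up the function it defined
    return func(*args, **kwargs)  # result is an rpy2 object, to be converted


argv, env = r_script_command("foo/bar/baz", "message_data", "/tmp/data")
print(argv)
```

The subprocess route keeps R fully separate from Python and exchanges data through files; the rpy2 route returns R objects into the Python process, at the cost of an extra dependency.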

khaeru commented 2 years ago

@jkikstra @Jihoon @adrivinca @awais307 would appreciate some comment here on things like:

jkikstra commented 2 years ago

@khaeru, thanks for creating this issue to start discussions. From a first look, I think at least option 2 could be really useful; option 1 maybe too, but I don't know enough to give a good judgement there.

Detailed input from my side will have to wait a bit until after I return from holidays (probably will only get to it around 14 January).

In general, I was thinking that, until the data becomes public, I would work in message_data: try to select the most useful R scripts from the DLE packages and integrate them in a DLE workflow that uses rpy2. For instance, I'm imagining that a "build" command could first run basic MESSAGE, then use R scripts to create a DLE scenario based on that, and then do a MESSAGE-DLE run.
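
As a rough illustration of that three-step "build" idea, here is a sketch in which each step receives the result of the previous one. Everything in it is a placeholder: the step functions only tag a string, where real code would solve a scenario or call R scripts (e.g. through rpy2), and none of the names exist in message_data or message-ix-models.

```python
from typing import Callable, List, Tuple

# A step is a label plus a function that maps one scenario name to the next
Step = Tuple[str, Callable[[str], str]]


def build(base_scenario: str, steps: List[Step]) -> str:
    """Run the steps in order, feeding each one the previous result."""
    scenario = base_scenario
    for label, step in steps:
        print(f"Running step: {label}")
        scenario = step(scenario)
    return scenario


# Placeholder steps; each only appends a tag to the scenario name
def run_base_message(scenario: str) -> str:
    return scenario + "|solved"


def create_dle_scenario(scenario: str) -> str:
    # This is where the R scripts would be invoked in a real workflow
    return scenario + "|DLE"


def run_message_dle(scenario: str) -> str:
    return scenario + "|DLE-solved"


DLE_STEPS: List[Step] = [
    ("basic MESSAGE run", run_base_message),
    ("create DLE scenario with R scripts", create_dle_scenario),
    ("MESSAGE-DLE run", run_message_dle),
]

result = build("baseline", DLE_STEPS)
```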

More to follow in the new year.

adrivinca commented 2 years ago

In the nexus work, we have some R scripts that need to be run to process raw data into data that is then used by other Python scripts. Since these R scripts need to be run only once (and sometimes link to large spatial data on the P drive), we do not include or call them from any Python script. A user could run them to generate new scenario configurations (SSP, SDGs), but otherwise all the output data of those scripts are already included in the message_data/data folder.
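
If such run-once scripts were ever hooked into a workflow, one possible pattern is to use the output file when it already exists and only regenerate it when it is missing or on explicit request. The sketch below assumes this pattern; `ensure_output` and `fake_r_script` are hypothetical names, and the stand-in generator just writes a small file where a real one would launch the R script.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_output(
    output: Path, generate: Callable[[Path], None], force: bool = False
) -> bool:
    """Call `generate` only if `output` is missing or `force` is set.

    Returns True if `generate` was called, False if the existing file was kept.
    """
    if output.exists() and not force:
        return False
    output.parent.mkdir(parents=True, exist_ok=True)
    generate(output)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "processed.csv"
    calls = []

    def fake_r_script(path: Path) -> None:
        # Stand-in for running the R script that writes `path`
        calls.append(path)
        path.write_text("region,value\n")

    first = ensure_output(target, fake_r_script)  # file missing: generated
    second = ensure_output(target, fake_r_script)  # file present: skipped
    forced = ensure_output(target, fake_r_script, force=True)  # regenerated
```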