iiasa / message_ix

The integrated assessment and energy systems model MESSAGEix
https://docs.messageix.org
Apache License 2.0

Zero values in GDX #516

Open behnam-zakeri opened 3 years ago

behnam-zakeri commented 3 years ago

For efficiency in storing sparse data, GAMS ignores zero values in GDX files by default when loading model data. To work around this, message_ix appears to use mapping sets, such as is_bound_emission or is_relation_upper, to pass the information that cannot be loaded from GDX files, i.e., zero values in parameters, when compiling a model. This workaround has a few drawbacks:

  1. These mapping sets increase the number of model items and take up some storage space, even if only a little.
  2. Users need instructions on how and where to add these mapping sets when they want to add new parameters to the model, as requested in #514.
  3. Not loading zero values for existing parameters that lack such mapping sets can be problematic, as reported in #515. Defining such mapping sets for many parameters may not be desirable (see No. 1).
  4. The workaround does not resolve the issue of zero values not being displayed in the output GDX file. This creates confusion in postprocessing calculations where zero values differ from missing values (e.g., prices). See #306 for an example.
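To make the workaround concrete, here is a minimal Python sketch of the mapping-set idea. It is not message_ix or ixmp code: the function names, the `is_bound` set (named in the spirit of is_bound_emission), and the sample data are all hypothetical, and plain dictionaries stand in for GDX storage.

```python
def write_sparse(param):
    """Mimic default sparse storage: entries equal to zero are dropped."""
    return {key: value for key, value in param.items() if value != 0}


def make_mask(param):
    """Build a mapping set: every key for which the user supplied a value."""
    return set(param)


def read_with_mask(stored, mask):
    """Restore zeros: a key in the mask but absent from storage was zero."""
    return {key: stored.get(key, 0.0) for key in mask}


# Hypothetical emission bounds; a bound of 0 is meaningful, not "no bound".
bound_emission = {("World", 2030): 100.0, ("World", 2050): 0.0}

stored = write_sparse(bound_emission)   # the 2050 entry is lost here
is_bound = make_mask(bound_emission)    # the mapping set remembers it
restored = read_with_mask(stored, is_bound)

print(stored)
print(restored == bound_emission)
```

The sketch also shows drawbacks 1 and 2: every parameter that may hold meaningful zeros needs its own extra set, and whoever adds a new parameter has to remember to create and fill it.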

GAMS appears to have an option to convert zero values to Eps when loading from GDX and compiling a model (see $OnEps and $OffEps). This could potentially resolve issues 1-3 listed above, but not issue 4. Are there any other plans to tackle this in ixmp or GAMS?

khaeru commented 3 years ago

This is a good summary of the challenges here. The GAMS documentation describes Eps as "a stored zero value". Where a literal 0 is given rather than Eps, GAMS interprets that as "missing/no data".
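The three states involved (nonzero value, stored zero, missing) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Eps` class and the helper functions are hypothetical stand-ins, not GAMS, GDX, or ixmp APIs, and the price data are invented.

```python
class Eps:
    """Stand-in for a stored zero: numerically 0.0, but kept in storage."""

    def __float__(self):
        return 0.0

    def __repr__(self):
        return "EPS"


EPS = Eps()


def zeros_to_eps(param):
    """Mark explicit zeros so that a sparse writer does not drop them."""
    return {key: (EPS if value == 0 else value) for key, value in param.items()}


def write_sparse(param):
    """Drop literal zeros, as sparse storage does; EPS entries survive."""
    return {key: value for key, value in param.items() if value is EPS or value != 0}


def lookup(stored, key):
    """Return a float for stored entries (EPS becomes 0.0), None if missing."""
    value = stored.get(key)
    return None if value is None else float(value)


# Hypothetical prices: a price of zero differs from "no price reported".
price = {("coal", 2030): 3.5, ("coal", 2040): 0.0}
stored = write_sparse(zeros_to_eps(price))

print(lookup(stored, ("coal", 2040)))  # stored zero
print(lookup(stored, ("coal", 2050)))  # missing entry
```

With the conversion step, the zero survives storage without a separate mask; without it, `write_sparse(price)` keeps only the 2030 entry and the zero becomes indistinguishable from missing data.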

I agree that the handling of this distinction could be simplified, and that might reduce the need for the "mapping sets" ("mask" is another term used for this kind of structure/data). However, since the GDX I/O is currently buried in the JDBCBackend/Java code, we are unlikely to modify/improve it per se. A pure-Python backend is in development (@danielhuppmann @meksor) so (although I think the scope there is already very large) this could be considered as an additional requirement, i.e. that actual zero values and missing values are distinguished.

Before doing that it would be good to, as you suggest, experiment with adding $OnEps to the MESSAGE GAMS code. If this can reduce the things that the Python code must do, that would be a big help.

behnam-zakeri commented 3 years ago

Thanks @khaeru for following up on this. I discussed it with a colleague, who believed that treating zero values as Eps may cause issues with scaling the optimization problem. However, neither of us has tested this in GAMS, so it is indeed worth testing to make sure it works. Otherwise, finding a solution on the Python side seems more flexible.