rangl-labs / netzerotc

MIT License
11 stars 2 forks source link

Minimal example of Excel serialisation and de-serialisation #35

Closed masonlr closed 2 years ago

masonlr commented 3 years ago

@jia-chenhua needs a minimal example of:

  1. Loading an Excel file to a python object via pycel
  2. Saving the Excel file to a file (serialisation)
  3. Loading the file to a python object (de-serialisation)

This minimal example will help to address the feasibility of obfuscating the original Excel sheet, which is discussed here https://github.com/rangl-labs/netzerotc/issues/31

masonlr commented 3 years ago

The working example is here: https://github.com/rangl-labs/netzerotc/tree/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack

jia-chenhua commented 3 years ago

Hi @masonlr, please could you have a try of the following:

After running https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L34, simply run production_loaded.evaluate("GALE!P35"), you see the error message:

Traceback (most recent call last):

  File "<ipython-input-19-59c7d545f192>", line 1, in <module>
    production_loaded.evaluate("GALE!P35")

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelcompiler.py", line 857, in _evaluate_non_iterative
    self._gen_graph(address)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelcompiler.py", line 922, in _gen_graph
    self._make_cells(seed)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelcompiler.py", line 732, in _make_cells
    excel_data = self.excel.get_range(address)

AttributeError: 'NoneType' object has no attribute 'get_range'

The reason behind as I guess is that, after compiling using https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L22, if you skip the evaluation of https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L26 and directly serialize this object using https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L27, the serialized file won't contain anything meaningful (just look at the size of the .pkl file, which is too small to be capable of compressing the entire workbook to contain all its information of cells and formulae), and the .yml file is almost empty.

On the other hand, if you indeed run https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L26 and then https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L27, and open the saved production.yml, it should contain:

cycles: false excel_hash: 70106bfd25219e8ab3785869b5b24f40 cell_map: Vision 2035 Production!C4: 181.49034 filename: ../../compiled_workbook_objects/Pathways to Net Zero - Simplified.xlsx

In other words, when you run https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L26 to evaluate the cell https://github.com/rangl-labs/netzerotc/blob/175e4566cdbf352decb50b16fbc17cf11e218774/environment/reference_environment_direct_deployment/hack/main.py#L18, it will "initialize" the cell_map of Vision 2035 Production!C4, and this will be stored in the serialized files (.pkl and .yml as above).

Therefore, to get both the serialization and de-serialization work, after taking 20~30 seconds to compile the workbook and before serializing it to hard drive, we need to "initialize" all the cells involved in the env.py using the production_excel.evaluate(cell), where the cell has to be looped through all the cells involved in env.py.

However, several weeks ago, what I tried (but failed) is that, first I run the test_reference_environment_direct_deployment.py up to this line: https://github.com/rangl-labs/netzerotc/blob/a5a9bb8c0ad204598659e663c7e6b1564678f0d9/environment/test_reference_environment_direct_deployment.py#L48. Now the env.param.Pathways2Net0 is the compiled object and also all cells involved in env.py are "initialized" by either .set_value() or .evaluate() (also, they are after applying constant actions in the test_reference_environment_direct_deployment.py, so all empty cells from 2031 to 2050 in "GALE" and "Outputs" are filled). So theoretically, we should be able to serialize it by env.param.Pathways2Net0.to_file("production"), but it didn't work with error message:

Traceback (most recent call last):

  File "<ipython-input-13-56584f177794>", line 1, in <module>
    env.param.Pathways2Net0.to_file("production")

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelcompiler.py", line 331, in to_file
    text_changed = self._to_text(text_name, is_json=is_json)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelcompiler.py", line 216, in _to_text
    ymlo.dump(extra_data, f)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\main.py", line 451, in dump
    return self.dump_all([data], stream, _kw, transform=transform)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\main.py", line 465, in dump_all
    self._context_manager.dump(data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\main.py", line 817, in dump
    self._yaml.representer.represent(data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 85, in represent
    node = self.represent_data(data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 112, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 360, in represent_dict
    return self.represent_mapping(u'tag:yaml.org,2002:map', data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 1018, in represent_mapping
    node_value = self.represent_data(item_value)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 112, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 360, in represent_dict
    return self.represent_mapping(u'tag:yaml.org,2002:map', data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 1018, in represent_mapping
    node_value = self.represent_data(item_value)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 122, in represent_data
    node = self.yaml_representers[None](self, data)

  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\ruamel\yaml\representer.py", line 393, in represent_undefined
    raise RepresenterError('cannot represent an object: %s' % (data,))

RepresenterError: cannot represent an object: 72.02042556386013

Note: the above approach is after applying the constant actions in the test_reference_environment_direct_deployment.py, so all empty cells from 2031 to 2050 in "GALE" and "Outputs" are filled; On the other hand, it might work (or not) if we do not apply any action to fill the empty cells from 2031 to 2050 in the compiled "Pathways to Net Zero - Simplified.xlsx" object, and just evaluate all the cells involved in env.py, and then serialize it to hard drive. It might work or not (as the error message above is related to the ruamel package), but writing another script looping through all involved cells and testing it may be slower than using some other serialization solution (such as https://docs.python.org/3/library/pickle.html or https://joblib.readthedocs.io/en/latest/generated/joblib.dump.html, as you mentioned on Slack) to directly serialize the compiled object without any "initialization" by .evaluate().

And @moriartyjm , please let me know which way I should proceed with (probably after implementing the forecast/augmented observation space in https://github.com/rangl-labs/netzerotc/issues/5).

moriartyjm commented 3 years ago

Thanks guys. As I understand it, the problem we're addressing is that:

Would the following simple workaround (which does not introduce the complexity associated with serialization) be faster?

  1. Create a new worksheet in the Excel spreadsheet called Original_Parameters, containing all original costs
  2. Upon env.reset, reinstate all original values by simply copying them across from the Original_Parameters sheet into the sheets where they occur

@jia-chenhua , if this seems fast enough, please go ahead and implement it. It will be useful as a short-term workaround, at least. Longer term I guess it would be nice to also have the option of serialisation.

jia-chenhua commented 3 years ago

@moriartyjm and @masonlr , great news: I've finally resolved it, by looping through all cells involved in env.py to evaluate them all, and then serializing it to hard drive (using https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/compiled_workbook_objects/evaluate_all_and_serialize.py), without the error message I mentioned above (when serializing after running test_reference_environment_direct_deployment.py), except for the following division-by-0 warnings due to empty cells in 'BREEZE' and 'GALE' from 2031 to 2050 (which we intentionally removed from the simplified workbook, and it should be fine):

Pathways2Net0.to_file("PathwaysToNetZero_Simplified_Compiled")
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P36") * (_C_("BREEZE!T36") / _C_("BREEZE!R36"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P37") * (_C_("BREEZE!T37") / _C_("BREEZE!R37"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P38") * (_C_("BREEZE!T38") / _C_("BREEZE!R38"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P39") * (_C_("BREEZE!T39") / _C_("BREEZE!R39"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P40") * (_C_("BREEZE!T40") / _C_("BREEZE!R40"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P41") * (_C_("BREEZE!T41") / _C_("BREEZE!R41"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P42") * (_C_("BREEZE!T42") / _C_("BREEZE!R42"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P43") * (_C_("BREEZE!T43") / _C_("BREEZE!R43"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P44") * (_C_("BREEZE!T44") / _C_("BREEZE!R44"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P45") * (_C_("BREEZE!T45") / _C_("BREEZE!R45"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P46") * (_C_("BREEZE!T46") / _C_("BREEZE!R46"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P47") * (_C_("BREEZE!T47") / _C_("BREEZE!R47"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P48") * (_C_("BREEZE!T48") / _C_("BREEZE!R48"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P49") * (_C_("BREEZE!T49") / _C_("BREEZE!R49"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P50") * (_C_("BREEZE!T50") / _C_("BREEZE!R50"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P51") * (_C_("BREEZE!T51") / _C_("BREEZE!R51"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P52") * (_C_("BREEZE!T52") / _C_("BREEZE!R52"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P53") * (_C_("BREEZE!T53") / _C_("BREEZE!R53"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P54") * (_C_("BREEZE!T54") / _C_("BREEZE!R54"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("BREEZE!P55") * (_C_("BREEZE!T55") / _C_("BREEZE!R55"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P36") * (_C_("GALE!T36") / _C_("GALE!R36"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P37") * (_C_("GALE!T37") / _C_("GALE!R37"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P38") * (_C_("GALE!T38") / _C_("GALE!R38"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P39") * (_C_("GALE!T39") / _C_("GALE!R39"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P40") * (_C_("GALE!T40") / _C_("GALE!R40"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P41") * (_C_("GALE!T41") / _C_("GALE!R41"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P42") * (_C_("GALE!T42") / _C_("GALE!R42"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P43") * (_C_("GALE!T43") / _C_("GALE!R43"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P44") * (_C_("GALE!T44") / _C_("GALE!R44"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P45") * (_C_("GALE!T45") / _C_("GALE!R45"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P46") * (_C_("GALE!T46") / _C_("GALE!R46"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P47") * (_C_("GALE!T47") / _C_("GALE!R47"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P48") * (_C_("GALE!T48") / _C_("GALE!R48"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P49") * (_C_("GALE!T49") / _C_("GALE!R49"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P50") * (_C_("GALE!T50") / _C_("GALE!R50"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P51") * (_C_("GALE!T51") / _C_("GALE!R51"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P52") * (_C_("GALE!T52") / _C_("GALE!R52"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P53") * (_C_("GALE!T53") / _C_("GALE!R53"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: division by zero
Eval: _C_("GALE!P54") * (_C_("GALE!T54") / _C_("GALE!R54"))
Values: 0 Div 0
Traceback (most recent call last):
  File "\WinPython64-3.7.7.1\python-3.7.7.amd64\lib\site-packages\pycel\excelutil.py", line 1269, in fixup
    return PYTHON_AST_OPERATORS[op](left_op, right_op)
ZeroDivisionError: float division by zero
Eval: _C_("GALE!P55") * (_C_("GALE!T55") / _C_("GALE!R55"))
Values: -380.262 Div 0

Despite the above warnings, the .pkl and .yml files are successfully serialized to hard drive, where the .yml file contains the values and formulae involved during looping through all cells involved in env.py. So these warnings above should be fine, since all formulae are stored in the serialized files, and once we set values/apply actions to fill the empty cells, it should get the correct values after .evaluate(). And this also passed the fixed policy test by running test_reference_environment_direct_deployment.py up to https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/test_reference_environment_direct_deployment.py#L46, which now reads the extracted total capex, opex, revenue, carbon tax from the fixed-policy-filled original workbook (original xlsx, not the simplified xlsx, so the capex could depend on future, and hence the total capex doesn't match), and the differences between Excel workbook extracted values and env.py's output - for total opex, revenue, carbon tax - are in the order of 10^(-3) ~ 10^(-4), which I hope is acceptable (Note: please comment out the lines in https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/reference_environment_direct_deployment/env.py#L250 and uncomment the https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/reference_environment_direct_deployment/env.py#L251 to avoid randomized costs/prices, otherwise the fixed policy test using test_reference_environment_direct_deployment.py will fail).

Now, regarding the serialized .yml and .pkl: the .pkl is about ~2.4MB, containing non-readable text, which I think should be safe to push to GitHub; the .yml is about 236KB, containing information like the following:

cycles: false
excel_hash: 187d6d716602d4b366cc5b7d510ff9c1
cell_map:
  BREEZE!B25: 2020
  BREEZE!B26: 2021
...
  BREEZE!C24: Nuclear
  BREEZE!C25: =_C_("UK 2019!D7")
  BREEZE!C26: =_C_("BREEZE!C25") + ((_C_("BREEZE!C55") - _C_("BREEZE!C25")) / count(_R_("BREEZE!B25:B54")))
...
  GALE!P24: Capacity GW
  GALE!P25: 11.15
  GALE!P26: 12.53
...
  GALE!P35: 35.96
  GALE!P36:
  GALE!P37:
...
  GALE!AB24: GW green hydrogen
  GALE!AB25: =_C_("GALE!P25") * (_C_("GALE!T25") / _C_("GALE!R25"))
  GALE!AB26: =_C_("GALE!P26") * (_C_("GALE!T26") / _C_("GALE!R26"))
...
  GALE!AB36: =_C_("GALE!P36") * (_C_("GALE!T36") / _C_("GALE!R36"))
...
  Outputs!P24: =sum_(_R_("Outputs!P185:P186"), _R_("Outputs!P190:P191"))
  Outputs!P25: =_C_("Outputs!P187") + _C_("Outputs!P192")
  Outputs!P26: =_C_("Outputs!P236")
...
  Outputs!AI237: =_C_("Outputs!AI229") * _C_("Outputs!AI93")
...
  UK 2019!G57: 1.75641112725255
  UK 2019!G58: =sum_(_R_("UK 2019!G47:G57"))
filename: Pathways to Net Zero - Simplified.xlsx

So essentially, this .yml contains all cells involved in env.py with the column & row indices and either raw values or formulae. As you can see, some cells are empty because we removed them from the simplified xlsx file, and some formulae contain division and so it will raise a division-by-0 warning if an empty cell becomes the denominator (see, e.g., GALE!AB36: =_C_("GALE!P36") * (_C_("GALE!T36") / _C_("GALE!R36")) above and also the cell 'AB36' in the 'GALE' spreadsheet of "Pathways to Net Zero - Simplified.xlsx" on MS Teams or in your local folder). Since this .yml contains the full human-readable information of cells involved in env.py and their relations in terms of formulae, @moriartyjm , please let me know if I should push this .yml together with the .pkl to GitHub. Thanks!

jia-chenhua commented 3 years ago

Note: speed information when loading/de-serializing the .yml and .pkl files using https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/compiled_workbook_objects/evaluate_all_and_serialize.py#L31:

start = time.time()
Pathways2Net0_Loaded = ExcelCompiler.from_file("PathwaysToNetZero_Simplified_Compiled")
end = time.time()
print(f"INFO: took {end - start} seconds to load from serialised file")
INFO: took 0.21702003479003906 seconds to load from serialised file

On average it could take 0.2 ~ 0.25 second to load. @moriartyjm , please let me know if this is still acceptable for env.reset purpose, which now invokes the Parameters() to initialize the env.param.

masonlr commented 3 years ago

Nice work!

On average it could take 0.2 ~ 0.25 second to load. @moriartyjm , please let me know if this is still acceptable for env.reset purpose, which now invokes the Parameters() to initialize the env.param.

There might be workarounds for this. When we first create the env object we can load and pay the 0.2 second penalty. The env.reset() can mutate the pre-loaded pycel state – which I think will be an order of magnitude faster, but we need to test.

masonlr commented 3 years ago

One tip: when linking to files like https://github.com/rangl-labs/netzerotc/blob/direct_action_on_deployment/environment/compiled_workbook_objects/evaluate_all_and_serialize.py it's handy to make a more permanent link.

You can do this by pushing "y" in GitHub when you visit the link.

For example, the link above turns into https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/compiled_workbook_objects/evaluate_all_and_serialize.py

The general problem this addresses is that the first type of link points to the latest commit on a branch. If we update the branch then the link might end up pointing to updated content – this makes it confusing to revisit comment threads in the future when troubleshooting.

jia-chenhua commented 3 years ago

One tip: when linking to files like https://github.com/rangl-labs/netzerotc/blob/direct_action_on_deployment/environment/compiled_workbook_objects/evaluate_all_and_serialize.py it's handy to make a more permanent link.

You can do this by pushing "y" in GitHub when you visit the link.

For example, the link above turns into https://github.com/rangl-labs/netzerotc/blob/17aa28bd11c0c5525826c4426dca83e50025f572/environment/compiled_workbook_objects/evaluate_all_and_serialize.py

The general problem this addresses is that the first type of link points to the latest commit on a branch. If we update the branch then the link might end up pointing to updated content – this makes it confusing to revisit comment threads in the future when troubleshooting.

Thanks. I knew the difference, but just forgot to use the link specific to that commit last Friday evening. Now it's fixed.

jia-chenhua commented 3 years ago

Nice work!

On average it could take 0.2 ~ 0.25 second to load. @moriartyjm , please let me know if this is still acceptable for env.reset purpose, which now invokes the Parameters() to initialize the env.param.

There might be workarounds for this. When we first create the env object we can load and pay the 0.2 second penalty. The env.reset() can mutate the pre-loaded pycel state – which I think will be an order of magnitude faster, but we need to test.

This has been implemented in https://github.com/rangl-labs/netzerotc/blob/3cd0edbfdc58f0522fa8b755ed71557535411fee/environment/reference_environment_direct_deployment/env.py#L340, as an alternative to https://github.com/rangl-labs/netzerotc/blob/3cd0edbfdc58f0522fa8b755ed71557535411fee/environment/reference_environment_direct_deployment/env.py#L424, in case that taking 0.2 ~ 0.25 seconds to load the serialized .pkl is too slow/not acceptable when creating a new Parameters().

Note: this current implementation assumes that the simplified Pathways to Net Zero xlsx file contains spreadsheets 'GALE_Backup', 'CCUS_Backup', 'Outputs_Backup' which are duplicated from spreadsheets 'GALE', 'CCUS', 'Outputs' before they are filled with actions in deployments numbers or randomized in the costs/prices, such that 'GALE_Backup', 'CCUS_Backup', 'Outputs_Backup' contain the original blank/empty or pre-randomized values.