OSeMOSYS / otoole

OSeMOSYS Tools for Energy
https://otoole.readthedocs.io
MIT License
25 stars 19 forks source link

Cannot convert model to Excel #114

Closed willu47 closed 1 year ago

willu47 commented 3 years ago

When converting to Excel using otoole v0.11.0 I get the following error:

Traceback (most recent call last):
  File "/Users/wusher/miniconda3/envs/otoole38/bin/otoole", line 33, in <module>
    sys.exit(load_entry_point('otoole', 'console_scripts', 'otoole')())
  File "/Users/wusher/repository/otoole/src/otoole/cli.py", line 299, in main
    args.func(args)
  File "/Users/wusher/repository/otoole/src/otoole/cli.py", line 173, in conversion_matrix
    context.convert(args.from_path, args.to_path)
  File "/Users/wusher/repository/otoole/src/otoole/input.py", line 123, in convert
    self._write(inputs, output_filepath, default_values)
  File "/Users/wusher/repository/otoole/src/otoole/input.py", line 112, in _write
    self._write_strategy.write(inputs, filepath, default_values)
  File "/Users/wusher/repository/otoole/src/otoole/input.py", line 266, in write
    self._write_parameter(df, name, handle, default=default_value)
  File "/Users/wusher/repository/otoole/src/otoole/write_strategies.py", line 80, in _write_parameter
    df.to_excel(handle, sheet_name=name, merge_cells=False, index=True)
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/core/generic.py", line 2189, in to_excel
    formatter.write(
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 821, in write
    writer.write_cells(
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/io/excel/_openpyxl.py", line 410, in write_cells
    for cell in cells:
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 757, in get_formatted_cells
    for cell in itertools.chain(self._format_header(), self._format_body()):
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 576, in _format_header_regular
    coloffset = len(self.df.index[0])
  File "/Users/wusher/miniconda3/envs/otoole38/lib/python3.8/site-packages/pandas/core/indexes/multi.py", line 2028, in __getitem__
    if level_codes[key] == -1:
IndexError: index 0 is out of bounds for axis 0 with size 0
HauHe commented 3 years ago

Have you made any progress on this @willu47 ? Or an idea what might be the issue?

jabranesl commented 2 years ago

the error is still present if someone has been able to solve it.

trevorb1 commented 1 year ago

Hi! I was looking at this issue today and have a quick question! I see that the error is getting caused by pandas trying to write out empty multiindex dataframes. I also see that @willu47 pushed a fix for this to the develop branch (see code snippet below). In the fix, we simply pass over empty dataframes and log this.

https://github.com/OSeMOSYS/otoole/blob/c4ea4ddcf5262ba94ee08a6680391cf30aee5546/src/otoole/write_strategies.py#L84-L88

I guess my question is, is this the desired functionality? Or are we actually wanting to write out a blank template excel sheet that the user can then go enter values in? My initial thought was we want a blank template, but thinking about it a little, that may be easier said then done haha.

The code to write out an empty sheet using the input df is quite straightforward, as pandas just doesn't like the empty df with multiindex. This can be bypased with:

df = self._form_parameter(df, parameter_name, default)
if not df.empty:
    df.to_excel(handle, sheet_name=name, merge_cells=False, index=True)
else:
    df_empty = df.reset_index(drop=False)
    df_empty.to_excel(handle, sheet_name=name, index=False)
    logger.info(f"Skipped writing {parameter_name} as it is empty")

However, this will just print out the set headers, with an additional VALUE column (long formatted data I think). For example, the excel sheet may look like:

REGION TECHNOLOGY YEAR VALUE

Whereas, for excel we want the data pivoted on the YEAR column (wide format?). My first thought was we could just replace the YEAR and VALUE columns with all the years, but not populate any default values to not needlessly increase the file size (example below). However, we don't have access to the set data when writing out the parameters, so getting the years won't be a very clean solution?

REGION TECHNOLOGY 2020 2021 2022 ...

https://github.com/OSeMOSYS/otoole/blob/c4ea4ddcf5262ba94ee08a6680391cf30aee5546/src/otoole/write_strategies.py#L69-L75

Do you think trying to template out these excel sheets is worth it, or just leave it for the time being? Or maybe we just create the sheets but leave them blank, so at least the users know what parameters are available to them. Then with issue #107 and #113 it will be up to the user to correctly populate the sheet? OR, if you think we should just leave it for the time being, I think this issue is good to be closed?

Thanks!

willu47 commented 1 year ago

Hi @trevorb1 - thank you for yet another very clearly written issue! In short, we'll want to write out as helpful a template as possible for the user. If we can include the year index for the headers, that would be great. I guess there's a more nefarious potential bug lurking there if a user has only a subset of the years in the parameter file being written out. Ideally, we would write out columns populated with the default value for those missing values...

trevorb1 commented 1 year ago

Closed with PR #145. When pivoting the excel files all years are grabbed from the YEAR set. If no years are provided, there are no pivoted columns, but the other set definitions are still written out.