OSeMOSYS / otoole

OSeMOSYS Tools for Energy
https://otoole.readthedocs.io
MIT License
25 stars 19 forks source link

Long parameter names cause an error in writing to an Excel spreadsheet #131

Closed trevorb1 closed 1 year ago

trevorb1 commented 1 year ago

OVERVIEW This PR adds a short_name value to the config.yaml which will overwrite the full parameter name when writing user defined parameters to an excel file.

LOGIC Adds a new utility function called create_name_mappings( ... ) which will create a mapping dictionary similar to the previous EXCEL_TO_CSV hard coded dictionary. This function is called in the ReadExcel class only. When writing to excel, the logic just checks if there is a short_name value associated with the parameter or not, and writes that out.

QUESTIONS

  1. The downside of only calling create_name_mappings( ... ) when reading is that we do not log to the user if their parameter name is longer then 31 characters. I guess my justification for doing this is I tried to keep the logging within the create_name_mappings( ... ) function. Do you think I should break the logging out from this function, or change the logic so create_name_mappings( ... ) is called when writing to Excel as well?

  2. The current logic does not check if all excel tabs are in the configuration file. This could be an issue if the user modifies an excel tab name, but does not update the short_name in the configuration file. Do you think this is worth checking for? Or just leave it for the time being?

  3. The return type of the create_name_mappings( ... ) function should be Dict[str, str], but it really wasn't happy when I used this type. It sees the input config file as a Dict[str, Dict[str, Union[str, List]]]. Then at the end I use a comprehension to determine what string should be the key and value. But it just keeps throwing this error, with the types changing location due to the comprehension swapping their places.

src/otoole/utils.py:145: error: Incompatible return value type (got "Dict[str, Union[str, List[Any]]]", expected "Dict[str, str]")
src/otoole/utils.py:148: error: Incompatible return value type (got "Dict[Union[str, List[Any]], str]", expected "Dict[str, str]")

For the time being I left the return type as just a Dict. My question is, is there a way around this, or should I not be changing the order of return types? But then I'm a little confused on why it is seeing Dict[Union[str, List[Any]] as one of the types, since checking the logic it comes out to be a string? I may take another look at this tomorrow haha.

Thanks !

Closes #128

willu47 commented 1 year ago

The downside of only calling create_name_mappings( ... ) when reading is that we do not log to the user if their parameter name is longer than 31 characters. I guess my justification for doing this is I tried to keep the logging within the create_name_mappings( ... ) function. Do you think I should break the logging out from this function, or change the logic so create_name_mappings( ... ) is called when writing to Excel as well?

I see. Yes, the constraint on short names is linked to reading/writing from/to Excel, so it makes sense that the validation of the configuration file is linked to the classes responsible for reading or writing Excel.

However, I think we will need a config validation module eventually - and perhaps this is a good time to start one? This could be a separate pull-request/issue. Prior to allowing user-config files, validation was less of an issue, but it could be quite frustrating for users now if they have mistakes in the config file, and resulting error messages from otoole are cryptic...

The current logic does not check if all excel tabs are in the configuration file. This could be an issue if the user modifies an excel tab name, but does not update the short_name in the configuration file. Do you think this is worth checking for? Or just leave it for the time being?

Yes, this is worth checking for. A read error would be caused if:

OR

The return type of the create_name_mappings( ... ) function should be Dict[str, str], but it really wasn't happy when I used this type. It sees the input config file as a Dict[str, Dict[str, Union[str, List]]]. Then at the end I use a comprehension to determine what string should be the key and value. But it just keeps throwing this error, with the types changing location due to the comprehension swapping their places.

This is a bit strange. It could be because a string is also seen as a list of characters? Or, it could be because the config file also contains non string types such as integers? Again, validation of the config file might help reduce these issues.

trevorb1 commented 1 year ago

I really like the idea of creating a separate config validation module! I have started an issue on this here #133

I have updated the code in this PR to address your suggestions on my second question. Specifically, I added two new exceptions and a method to the ReadExcel class;

I was going to check when reading excel files that the tab name is less then 31 characters, but excel won't let you create these. So I don't think there is point of us checking for that. Also, I have never written custom exceptions before, so if I'm not following best practices with them, please let me know and I can push an update :)

EXAMPLE

If the name TotalAnnualMaxCapacityInvestment does not have a short_name attached to it in the config file, OR the short_name is still longer then 31 characters

(otoole) trevorb1@DESKTOP-M23U3I0:~/repositories/otoole/trevor$ otoole convert datapackage excel datapackage.json test.xlsx config.yaml 
OtooleExcelNameLengthError: TotalAnnualMaxCapacityInvestment -> Parameter name must be less then 31 characters when writing to Excel

If the excel file has a tab called AcumulatedAnnualDemand (typo - missing second c in Accumulated)

(otoole) trevorb1@DESKTOP-M23U3I0:~/repositories/otoole/trevor$ otoole convert excel datafile test.xlsx data.txt config.yaml 
OtooleExcelNameMismatchError: AcumulatedAnnualDemand -> Excel tab name not found in config file

OR the config file has the typo, and the excel tab is spelled correctly

(otoole) trevorb1@DESKTOP-M23U3I0:~/repositories/otoole/trevor$ otoole convert excel datafile test.xlsx data.txt config.yaml 
OtooleExcelNameMismatchError: AccumulatedAnnualDemand -> Excel tab name not found in config file
trevorb1 commented 1 year ago

Oops. Thanks for pointing out I forgot to test the exceptions. I added those to the tests_utils.py as you suggested following the documentation you linked. If your happy with that, we should be good to merge :)