Closed trevorb1 closed 1 year ago
Hi @willu47! Good call, I think that makes way more sense! I have updated this PR to generalize the _get_missing_params( ... )
function (issue #99), and provide excel templates for missing parameters (issue #114). These two issues are grouped into one PR since they cover similar issues; however, if quite a few more changes needs to happen from this point I am happy to break them up.
In the commit "added function to get missing input values", I move the _get_missing_params( ... )
function to the ReadStrategy
class, rename it to _get_missing_input_dataframes(...)
, and generalize it to work with any of param
, sets
, and result
types. A test is added as well to the TestReadStrategy
class.
_get_missing_input_dataframes(...)
will create empty, but correctly formatted, dataframes for any missing param
, set
or result
that is listed in the config file but is not found in the internal datastore after reading in data. To keep this general, I have the user pass in an argument for what parameter to search for (param
, set
or result
). My question is, I often have to loop over two of these items (see comment below) which seems a little clunky. Do you think there is a way to simplify this or change the logic at all? My first thought was to just accept a list of inputs to loop over, instead of a single string. However I just move the loop to inside the function which I'm not sure is any better haha 😅
In the commit "added excel template function", I add a new function to the WriteExcel
class called _form_parameter_template(...)
which will create wide formatted template dataframes when writing out params that do not have data. A test for this is also added.
To create the wide formatted data, I needed to pass in the input_data
to correctly index the columns as years. This involved changing the WriteStrategy.write(...)
method to accept keyword arguments to allow for the passing of input_data
. While I don't think this was a bad solution, Im not sure how clean it is either. I couldn't think of any other way to get the years for long formatted data to the WriteExcel
class though?
I slightly changed the logic of the the _form_parameter(...)
method. I moved the check for empty dataframes out of this function and to where the function is called in the WriteExcel.write_parameter(...)
method. While this means _form_parameter(...)
will no longer catch if the input df is empty, since it is a private method and only used in this one place, I thought it improved readability. My question is, does this tradeoff make sense? Or is it better to keep those high level quick checks in the function?
Hello!
Overview
In this PR I implement logic in the
ReadExcel
class to create empty dataframes for parameters that do not have an associated excel sheet. This PR relates to issue #99, and this comment in issue #114.Problem Addressed
I was reviewing issue #99 and encountered a very similar issue to the one described. When we convert from anything to an Excel file, only parameters that have non-default values are written. For example, in Simplicity the DiscountRate is given by a default value, therefore, no
DiscountRate
sheet will appear after running the commandotoole convert <> excel <> params.xlsx config.yaml
.In a different comment (see issue #114) I discuss if this is actually the functionality we want. Or if we should be writing out empty sheets when the parameter does not have values to help guide users on how to input data.
Nevertheless, with the current functionality, when reading in the Excel data it is not capturing parameters without dedicated sheets. For example, if no
DiscountRate
sheet exists in the excel file, then noDiscountRate
key will be created in the otoole internal datastore. While this doesn't cause any otoole logic issues, no reference to the missing parameters will be written. This is problematic, as, for example, you will have an OSeMOSYS datafile without aDiscountRate
parameter.Logic
To fix this, I add a new function to the
ReadExcel
class called_get_missing_params( ... )
. This function takes in the parameters read in from excel and compares them against all defined parameters in theconfig.yaml
file. If a parameter is missing in the excel data, an empty dataframe is created and added to the internal datastore. Two tests are added to thetest_read_strategies.py
module to accompany this logic.Other notes
Currently, this only checks parameters. A minor modification is needed if we want it to check for missing sets as well. However, I decided to hold off on adding that until we discussed if this is the correct path to go down to fix this issue :)
Thanks!