OSeMOSYS / otoole

OSeMOSYS Tools for Energy
https://otoole.readthedocs.io
MIT License

Validation of config.yaml against datafile #182

Open willu47 opened 1 year ago

willu47 commented 1 year ago

Duplicate of #160. See also #151

Issue #179 highlighted that if the list of parameters and sets in the config.yaml file does not match what is in the data file, otoole returns a cryptic error message from Amply.

It is currently not possible to validate parameters and sets against what is in a datafile as Amply requires these to parse the datafile.

There are some hacky ways to extract parameters and sets from datafiles using string matching or regex, but they are fragile.
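To make the regex idea concrete, here is a minimal sketch, not code from otoole. It assumes every declaration starts on its own line and uses one of two layouts, `param NAME default X :=` or `param default X : NAME :=`. The `extract_names` helper and the sample datafile are hypothetical.

```python
import re

# Matches the start of a `set NAME` or `param NAME` statement.
# Assumption: each declaration begins its own line; an optional
# `default <value> :` prefix before the parameter name is skipped.
DECLARATION = re.compile(
    r"^\s*(set|param)\s+(?:default\s+\S+\s*:\s*)?(\w+)", re.MULTILINE
)


def extract_names(datafile_text):
    """Return (sets, params) named in the datafile text, in order of appearance."""
    sets, params = [], []
    for kind, name in DECLARATION.findall(datafile_text):
        (sets if kind == "set" else params).append(name)
    return sets, params


# Hypothetical datafile fragment used only for illustration.
example = """
set REGION := SIMPLICITY;
set YEAR := 2014 2015;
param AccumulatedAnnualDemand default 0 :=
;
param default 0.05 : DiscountRate :=
;
end;
"""

sets, params = extract_names(example)
```

This also shows the fragility: two declarations on one line, or any layout the pattern does not anticipate, are silently missed.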

Catching the Amply error and returning a more useful error in otoole, e.g. "DatafileParseError: Please check that the config file provided matches the parameters and sets in your datafile", may be a quicker route to something usable...
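A rough sketch of the catch-and-re-raise approach, under stated assumptions: `parse_datafile` is a hypothetical stand-in for the Amply-backed parser (it is not Amply's API), and `read_datafile` is a hypothetical wrapper name. Only the error name and message come from the suggestion above.

```python
class DatafileParseError(Exception):
    """Raised when a datafile cannot be parsed with the supplied config."""


def parse_datafile(text, config):
    """Hypothetical stand-in for the real parser.

    It fails with an unhelpful KeyError when the datafile declares a
    set or parameter that the config does not list.
    """
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] in ("set", "param"):
            if tokens[1] not in config:
                raise KeyError(tokens[1])
    return {}


def read_datafile(text, config):
    """Parse the datafile, translating low-level failures into a clear error."""
    try:
        return parse_datafile(text, config)
    except Exception as ex:  # assumption: narrow this to the parser's own exception type
        raise DatafileParseError(
            "Please check that the config file provided matches "
            "the parameters and sets in your datafile"
        ) from ex
```

Chaining with `from ex` keeps the original traceback available for debugging while the user sees the friendlier message first.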

trevorb1 commented 1 year ago

Hi @willu47! I have a question on this issue. I think I have implemented the solution for catching the Amply error. However, I am wondering what general logic we want to implement for when a config file does not match the input data.

If there are inconsistencies when reading in data (for example, if the config file has a parameter/set defined which is not in the input data, or vice versa), do we still want to read in the data and just print a warning to the user? Or should we halt the conversion altogether and raise an error? The current logic is shown in PR #157.

I heard from a few people that when the OtooleNameMismatchError is raised:

  1. It's a confusing name. I have addressed this in the upcoming PR for this issue.
  2. If they have extra data in a folder of CSVs, otoole will no longer work with their workflow. I guess my thought when I implemented this error message was that we should be checking for perfectly matching data and config files. But maybe this is too strict a requirement and it should be relaxed to only print warnings if the config file and input data do not match? Or maybe we allow users to bypass this error through the use of a --skip_input_check flag (or something similar to this)?
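To compare the two options side by side, here is a minimal sketch of a name check that either raises or only warns. Everything in it is hypothetical: the `InputMismatchError` name, the `check_names` helper, and the `skip_input_check` argument (mirroring the flag floated above) are illustrative, not existing otoole behaviour.

```python
import warnings


class InputMismatchError(Exception):
    """Hypothetical error for config names that do not match the input data."""


def check_names(config_names, data_names, skip_input_check=False):
    """Compare names in the config with names found in the input data.

    Raises InputMismatchError on any difference, unless skip_input_check
    is True, in which case a warning is issued and processing continues.
    """
    config_names, data_names = set(config_names), set(data_names)
    missing = sorted(config_names - data_names)  # in config, not in data
    extra = sorted(data_names - config_names)    # in data, not in config
    if not missing and not extra:
        return
    message = (
        f"In config but not in data: {missing}. "
        f"In data but not in config: {extra}."
    )
    if skip_input_check:
        warnings.warn(message)
    else:
        raise InputMismatchError(message)
```

With this shape, strict matching stays the default, and a workflow with extra CSVs in the folder can opt out explicitly and still see what was ignored.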

Do you have any thoughts on these questions? Right now I think I am going in circles with what we want to implement haha.