polca / premise

Coupling Integrated Assessment Models output with Life Cycle Assessment.
BSD 3-Clause "New" or "Revised" License

improve logic in external scenario validation when comparing scenarios from datapackage and from scenario_data.csv #95

Closed by CHarpprecht 1 year ago

CHarpprecht commented 1 year ago

Hi @romainsacchi,

If I understand correctly, in the file external_data_validation.py (lines 486-492), the scenario names defined in the datapackage.json are compared to the scenario names provided in the scenario_data.csv:

available_scenarios = df["scenario"].unique()
if not all(
    s in scenarios for s in available_scenarios
):  # check that all scenarios are available in the scenario file
    raise ValueError(
        f"One or several scenarios are not available in the scenario file no. {i + 1}."
    )

I think 'scenarios' and 'available_scenarios' should be swapped, so that the check becomes:

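if not all(
    s in available_scenarios for s in scenarios
):  # check that all requested scenarios are available in the scenario file
    raise ValueError(
        f"One or several scenarios are not available in the scenario file no. {i + 1}."
    )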

That would make it possible to list several scenarios in the file scenario_data.csv (as available_scenarios) but to use, for example, only one of them when creating a new DB. With the current code, this raises a ValueError, because every scenario in the file must also appear in the datapackage. As a result, a separate scenario_data.csv file is needed for each DB creation, which is a bit impractical.
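
To illustrate the intended behaviour, here is a minimal standalone sketch of the swapped check. It is not the code in external_data_validation.py: the helper name check_requested_scenarios, its parameters, and the scenario names "A" to "D" are hypothetical, and plain lists stand in for df["scenario"].unique(). It uses a set difference rather than all(), so that the error message can name the missing scenarios.

```python
def check_requested_scenarios(requested, available, file_index):
    """Raise ValueError if a requested scenario is absent from the scenario file.

    Hypothetical helper: `requested` stands in for the scenarios named in
    datapackage.json, `available` for the names found in scenario_data.csv.
    """
    # Scenarios that were requested but do not appear in the file.
    missing = sorted(set(requested) - set(available))
    if missing:
        raise ValueError(
            f"Scenario(s) {missing} are not available in the "
            f"scenario file no. {file_index + 1}."
        )


# The file lists three scenarios, but only one is requested: this passes.
available_scenarios = ["A", "B", "C"]
check_requested_scenarios(["B"], available_scenarios, file_index=0)

# A requested scenario that is absent from the file is still rejected.
try:
    check_requested_scenarios(["D"], available_scenarios, file_index=0)
except ValueError as err:
    print(err)
```

With this direction of comparison, extra scenarios in scenario_data.csv are ignored, while a scenario requested but missing from the file still fails early.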

romainsacchi commented 1 year ago

Sure, that sounds like a good idea. You can send a PR that I will accept :-)