Features/comparison script

jnnr commented 4 years ago

This PR introduces a script that compares our results with those of other models and reports the absolute and relative differences.

unndreay commented 4 years ago

Thank you for this tool!

Unfortunately, it does not work on my machine. Comparing against "DIETER" for use case 2d gives me the CSV below. The DIETER column is empty although DIETER has data for this use case.

UseCase	Year	Parameter	value_oemof
FlexMex1_2d	2050	EnergyConversion_Capacity_Electricity_Nuclear_ST	28826
FlexMex1_2d	2050	EnergyConversion_Curtailment_Electricity_RE	29782
FlexMex1_2d	2050	EnergyConversion_FuelCosts_Electricity_Nuclear_ST	796813372
FlexMex1_2d	2050	EnergyConversion_Invest_Electricity_Nuclear_ST	13729290462
FlexMex1_2d	2050	EnergyConversion_SecondaryEnergy_Electricity_Nuclear_ST	84657
FlexMex1_2d	2050	EnergyConversion_SecondaryEnergy_Electricity_RE	189794
FlexMex1_2d	2050	EnergyConversion_SecondaryEnergy_Electricity_Slack	198
FlexMex1_2d	2050	EnergyConversion_StartUpCosts_ALL
FlexMex1_2d	2050	EnergyConversion_StartUpFuelCosts_ALL
FlexMex1_2d	2050	EnergyConversion_VarOM_Electricity_Nuclear_ST	770376692
FlexMex1_2d	2050	EnergyConversion_WaT_Electricity_Nuclear_ST
FlexMex1_2d	2050	Energy_Cost_System	42971116528
FlexMex1_2d	2050	Transmission_Capacity_Electricity_Grid
FlexMex1_2d	2050	Transmission_FixOM_Electricity_Grid
FlexMex1_2d	2050	Transmission_Flows_Electricity_Grid	8145
FlexMex1_2d	2050	Transmission_Invest_Grid
FlexMex1_2d	2050	Transmission_Losses_Electricity_Grid	47

unndreay commented 4 years ago

There is a flaw in DIETER's Scalars.csv:

UseCase	Region	Year	Modell	Parameter	Unit	Value
Flexmex1_2d	ALL	2050	DIETER	Energy_Cost_System	Eur	380323143085
Flexmex1_2d	AT	2050	DIETER	EnergyConversion_Capacity_Electricity_Nuclear_ST	MW	8962
Flexmex1_2d	BE	2050	DIETER	EnergyConversion_Capacity_Electricity_Nuclear_ST	MW	22355
Flexmex1_2d	CH	2050	DIETER	EnergyConversion_Capacity_Electricity_Nuclear_ST	MW	9709

All rows for use case 2d (and only those for use case 2d) start with Flexmex instead of FlexMex.
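One way to make the comparison robust against this kind of label mismatch is to match the use case case-insensitively. This is only a minimal sketch: `select_usecase` is a hypothetical helper, not a function from the script, and the two rows are copied from the excerpt above.

```python
import pandas as pd

# Two rows copied from the DIETER Scalars.csv excerpt above.
scalars = pd.DataFrame(
    {
        "UseCase": ["Flexmex1_2d", "Flexmex1_2d"],
        "Region": ["ALL", "AT"],
        "Parameter": [
            "Energy_Cost_System",
            "EnergyConversion_Capacity_Electricity_Nuclear_ST",
        ],
        "Value": [380323143085, 8962],
    }
)


def select_usecase(df, usecase):
    """Hypothetical helper: select rows by use case, ignoring letter case."""
    mask = df["UseCase"].str.lower() == usecase.lower()
    return df.loc[mask]


# An exact match misses the misspelled rows, the case-insensitive one finds them.
exact = scalars.loc[scalars["UseCase"] == "FlexMex1_2d"]
selected = select_usecase(scalars, "FlexMex1_2d")
```

Fixing the labels in the CSV itself remains the cleaner solution; the tolerant match only keeps the script from silently returning an empty column.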

unndreay commented 4 years ago

Something strange is going on in the calculation. If I recalculate it with LibreOffice, I get different values (see last column). I noticed it because I couldn't see where the -inf values come from.

Parameter	value_oemof	value_DIETER	abs_diff	rel_diff	rel_diff_my_calc
EnergyConversion_Capacity_Electricity_Nuclear_ST	28826	36554	-7728	-33	-26,81
EnergyConversion_Curtailment_Electricity_RE	29782	25749	4033	-inf	13,54
EnergyConversion_FixOM_Electricity_Nuclear_ST		0
EnergyConversion_FuelCosts_Electricity_Nuclear_ST	796813372
EnergyConversion_Invest_Electricity_Nuclear_ST	13729290462	17409881758	-3680591295	-33	-26,81
EnergyConversion_SecondaryEnergy_Electricity_Nuclear_ST	84657	82354	2303	2	2,72
EnergyConversion_SecondaryEnergy_Electricity_RE	189794	193828	-4034	8	-2,13
EnergyConversion_SecondaryEnergy_Electricity_Slack	198	14	184	-inf	92,93
EnergyConversion_StartUpCosts_ALL
EnergyConversion_StartUpFuelCosts_ALL
EnergyConversion_VarOM_Electricity_Nuclear_ST	770376692	749420759	20955933	2	2,72
EnergyConversion_WaT_Electricity_Nuclear_ST		35176026
Energy_Cost_System	42971116528	380323143085	-337352026557	-785	-785,07
Transmission_Capacity_Electricity_Grid		5563
Transmission_FixOM_Electricity_Grid		11196038
Transmission_Flows_Electricity_Grid	8145	5532	285	-inf	32,08
Transmission_Invest_Electricity_Grid		44200272
Transmission_Invest_Grid
Transmission_Losses_Electricity_Grid	47	459	-432	-inf	-876,60

unndreay commented 4 years ago

Suggestions:

1. I would prefer to take the other model as the benchmark. ~~So, an abs. diff. of -7728 would tell something about our model (too low). With ours as the benchmark, you always have to switch in your mind (-7728 means ours is too high!).~~ EDIT: Sorry, this is already the case, but it is inconsistent with the formula (a-b)/a. It should be (a-b)/b then, with the columns swapped.
2. When reading "relative difference" in a column title, I would expect a decimal value rather than a percentage (especially for a relative deviation, which is what we have here).

What do you think?

jnnr commented 4 years ago

Suggestions:

1. I would prefer to take the other model as the benchmark. So, an abs. diff. of -7728 would tell something about _our_ model (too low). With ours as the benchmark, you always have to switch in your mind (-7728 means ours is too high!).

I agree. In the example you posted above, this already seems to be the case. Did you adapt it already? If not, feel free to make a commit.

2. When reading about a relative difference in the column title I would expect to find a _decimal_ value instead of a percentage (esp. in the case of a relative deviation which we have here).

As there is no indication of units, I agree. Feel free to change it to decimals.
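For reference, the convention discussed here (other model as benchmark, relative deviation as a decimal) could look roughly like the sketch below. `relative_deviation` is a hypothetical name, not the function in the script; the two values are the nuclear capacity figures from the table above.

```python
import pandas as pd


def relative_deviation(ours, benchmark):
    """Hypothetical sketch: (a - b) / b with the other model b as benchmark.

    Returns the absolute difference and the relative deviation as a decimal
    (not a percentage).
    """
    abs_diff = ours - benchmark
    rel_diff = abs_diff / benchmark
    return abs_diff, rel_diff


parameter = "EnergyConversion_Capacity_Electricity_Nuclear_ST"
value_oemof = pd.Series({parameter: 28826.0})
value_dieter = pd.Series({parameter: 36554.0})

abs_diff, rel_diff = relative_deviation(value_oemof, value_dieter)
# abs_diff is -7728 (ours is lower), rel_diff is roughly -0.21
```

With this convention a negative sign always means that our value is below the benchmark.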

jnnr commented 4 years ago

Something strange is going on in the calculation. If I recalculate it with LibreOffice, I get different values (see last column). I noticed it because I couldn't see where the -inf values come from.

Could you find a reason why you get different values? Edit: It occurs because the relative difference was taken before averaging. I swapped the order in https://github.com/modex-flexmex/oemo-flex/pull/70/commits/dc41e1b5d4a8ca4c0fa078e060829fc724781277

unndreay commented 4 years ago

Another error:

Comparing usecase FlexMex1_2b.
Traceback (most recent call last):
  File "/home/unndreay/Workspaces/oemo-flex/experiment_1/scripts/compare_scalars.py", line 101, in <module>
    diff = calculate_diff_and_relative_deviation(sc_oemof, sc_compare)
  File "/home/unndreay/Workspaces/oemo-flex/experiment_1/scripts/compare_scalars.py", line 76, in calculate_diff_and_relative_deviation
    diff = pd.concat([a, b, abs_diff, rel_diff], 1)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 284, in concat
    return op.get_result()
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 475, in get_result
    df = cons(data, index=index)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/frame.py", line 435, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 254, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 69, in arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 311, in _homogenize
    val = val.reindex(index, copy=False)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/series.py", line 4030, in reindex
    return super().reindex(index=index, **kwargs)
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/generic.py", line 4544, in reindex
    axes, level, limit, tolerance, method, fill_value, copy
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/generic.py", line 4559, in _reindex_axes
    labels, level=level, limit=limit, tolerance=tolerance, method=method
  File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2416, in reindex
    raise ValueError("cannot handle a non-unique multi-index!")
ValueError: cannot handle a non-unique multi-index!

jnnr commented 4 years ago

Another error:

~~Cannot reproduce this - for me it works. Are your results data ok?~~ Update: This is caused by duplicate entries for Energy_Cost_System in use case 2b. It can be solved by deleting the duplicate row from the results template Scalars.csv.
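A check like the following could surface such duplicates before `pd.concat` fails with the non-unique multi-index error. It is a sketch only: the rows are made up, and the choice of key columns is an assumption about what identifies a scalar.

```python
import pandas as pd

# Made-up scalars with a duplicated Energy_Cost_System row (illustration only).
scalars = pd.DataFrame(
    {
        "UseCase": ["FlexMex1_2b", "FlexMex1_2b", "FlexMex1_2b"],
        "Region": ["ALL", "ALL", "AT"],
        "Parameter": [
            "Energy_Cost_System",
            "Energy_Cost_System",
            "EnergyConversion_Capacity_Electricity_Nuclear_ST",
        ],
        "Value": [1.0, 1.0, 2.0],
    }
)

# Assumed key: these columns together should identify exactly one row.
key = ["UseCase", "Region", "Parameter"]

# Show every row that shares its key with another row.
duplicates = scalars[scalars.duplicated(subset=key, keep=False)]

# Keep the first occurrence so that the index becomes unique again.
deduplicated = scalars.drop_duplicates(subset=key, keep="first")
indexed = deduplicated.set_index(key)
```

Printing `duplicates` points directly at the offending rows, which is probably more useful than dropping them silently.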

modex-flexmex / oemof-flexmex

Features/comparison script #70