wknoben / hydrobm

Python package for calculating simple benchmarks from hydroclimatic timeseries.
GNU General Public License v3.0

evaluate_bm does not transfer variable names #14

Closed SpieDi closed 4 weeks ago

SpieDi commented 4 weeks ago

https://github.com/wknoben/hydrobm/blob/cc899118dc6d895ee7cb058320815afbad3b4ade/hydrobm/calculate.py#L123

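`calc_bm` calls `evaluate_bm` here without passing on the user-specified variable names, so `evaluate_bm` falls back to its default `streamflow` column name and raises a `KeyError` when the data uses a different name: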

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File c:\Users\Diana\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3805, in Index.get_loc(self, key)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()

File pandas\\_libs\\hashtable_class_helper.pxi:7081, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\\_libs\\hashtable_class_helper.pxi:7089, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'streamflow'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[25], line 34
     31 metrics = ['nse', 'kge', 'mse', 'rmse']
     33 # Calculate the benchmarks and scores
---> 34 benchmark_flows,scores = calc_bm(
     35                             data_xr,
     36
     37                             # Time period selection
     38                             cal_mask,
     39                             val_mask=val_mask,
     40
     41                             # Variable names in 'data'
     42                             precipitation="P [mm/d]",
     43                             streamflow="Q [m3/s]",
     44
     45                             # Benchmark choices
     46                             benchmarks=benchmarks,
     47                             metrics=metrics,
     48                             optimization_method="brute_force",
     49
     50                             # Snow model inputs
     51                             calc_snowmelt=True,
     52                             temperature="T_mean [°C]",
     53                             snowmelt_threshold=0.0,
     54                             snowmelt_rate=3.0,
     55                         )

File c:\Users\Diana\anaconda3\Lib\site-packages\hydrobm\calculate.py:123, in calc_bm(data, cal_mask, val_mask, precipitation, streamflow, benchmarks, metrics, optimization_method, calc_snowmelt, temperature, snowmelt_threshold, snowmelt_rate)
    121 val_scores = []
    122 for benchmark_flow in benchmark_flow_list:
--> 123     [cal_score, val_score] = evaluate_bm(data, benchmark_flow, metric, cal_mask, val_mask)
    124     cal_scores.append(cal_score)
    125     val_scores.append(val_score)

File c:\Users\Diana\anaconda3\Lib\site-packages\hydrobm\benchmarks.py:1005, in evaluate_bm(data, benchmark_flow, metric, cal_mask, val_mask, streamflow, ignore_nan)
    974 """Helper function to calculate calculation and evaluation metric scores for a given
    975 set of observations and benchmark flows.
    976 
   (...)
    999     Metric score for the evaluation period. NaN if no val_mask specified.
   1000 """
   1002 # Catch
   1003 
   1004 # Compute the metric for the calculation period
-> 1005 cal_obs = data[streamflow].loc[cal_mask]
   1006 cal_sim = benchmark_flow.loc[cal_mask]  # should have only one column
   1007 assert (
   1008     cal_obs.index == cal_sim.index
   1009 ).all(), "Time index mismatch in metric calculation for calculation period"

File c:\Users\Diana\anaconda3\Lib\site-packages\pandas\core\frame.py:4102, in DataFrame.__getitem__(self, key)
   4100 if self.columns.nlevels > 1:
   4101     return self._getitem_multilevel(key)
-> 4102 indexer = self.columns.get_loc(key)
   4103 if is_integer(indexer):
   4104     indexer = [indexer]

File c:\Users\Diana\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3812, in Index.get_loc(self, key)
   3807     if isinstance(casted_key, slice) or (
   3808         isinstance(casted_key, abc.Iterable)
   3809         and any(isinstance(x, slice) for x in casted_key)
   3810     ):
   3811         raise InvalidIndexError(key)
-> 3812     raise KeyError(key) from err
   3813 except TypeError:
   3814     # If we have a listlike key, _check_indexing_error will raise
   3815     #  InvalidIndexError. Otherwise we fall through and re-raise
   3816     #  the TypeError.
   3817     self._check_indexing_error(key)

KeyError: 'streamflow'
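
The failure pattern in the traceback can be reproduced without hydrobm. The sketch below is only an illustration: `evaluate_sketch` and `calc_sketch` are hypothetical stand-ins (not the package's functions), and a plain dict stands in for the DataFrame so the example runs without pandas.

```python
# Hypothetical stand-ins that mimic the call pattern in the traceback.
# They are NOT the hydrobm implementations.

def evaluate_sketch(data, streamflow="streamflow"):
    # Looks up the observed-flow column by name, like data[streamflow] above.
    return data[streamflow]


def calc_sketch(data, streamflow="streamflow"):
    # Bug pattern: the caller receives the user's column name but does not
    # forward it, so evaluate_sketch silently uses its own default.
    return evaluate_sketch(data)


# A dict stands in for the DataFrame; its only column uses a custom name.
data = {"Q [m3/s]": [1.0, 2.0, 3.0]}

try:
    calc_sketch(data, streamflow="Q [m3/s]")
    missing_key = None
except KeyError as err:
    # The key that could not be found is the default name, not the user's.
    missing_key = err.args[0]

print(missing_key)
```

The lookup fails on the default name even though the caller supplied a valid one, which matches the `KeyError: 'streamflow'` reported here.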

wknoben commented 4 weeks ago

PR #15 should fix this
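
The contents of PR #15 are not shown in this thread. As a rough sketch of how this kind of bug is usually resolved, the caller can forward the user's column name as a keyword argument. Everything below is hypothetical: `evaluate_sketch`, `calc_sketch`, and the mean-absolute-error placeholder metric are illustrative names, and a plain dict stands in for the DataFrame.

```python
# Hypothetical sketch of forwarding a user-supplied column name.
# These functions are illustrative and are NOT the hydrobm source or PR #15.

def evaluate_sketch(data, benchmark_flow, streamflow="streamflow"):
    # Select the observations using the name the caller passed in.
    obs = data[streamflow]
    # Placeholder metric: mean absolute error between observations and benchmark.
    return sum(abs(o - s) for o, s in zip(obs, benchmark_flow)) / len(obs)


def calc_sketch(data, benchmark_flows, streamflow="streamflow"):
    # Forward the column name explicitly so the default is never used by accident.
    return [
        evaluate_sketch(data, flow, streamflow=streamflow)
        for flow in benchmark_flows
    ]


# A dict stands in for the DataFrame; the column has a custom name.
data = {"Q [m3/s]": [1.0, 2.0, 3.0]}
benchmark_flows = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]

scores = calc_sketch(data, benchmark_flows, streamflow="Q [m3/s]")
print(scores)
```

Passing the name by keyword rather than by position keeps the call robust if the helper's parameter order changes later.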