nismod / smif

Simulation Modelling Integration Framework
http://www.itrc.org.uk
MIT License

Make data-reading error comprehensible #301

Closed tomalrussell closed 5 years ago

tomalrussell commented 5 years ago

Error triggered by reading in narrative data with accidentally duplicated rows:

enduses_service_switch,sector,tech,end_yr,switches_service
rs_lighting,,LED,2030,1.0
rs_lighting,,LED,2030,1.0
ss_lighting,,LED,2050,1.0
ss_lighting,,LED,2050,1.0

Was:

Traceback (most recent call last):
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\controller\scheduler.py", line 178, in add
    self._run(job_graph, job_graph_id)
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\controller\scheduler.py", line 237, in _run
    decision_iteration=job['decision_iteration']
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\data_handle.py", line 62, in __init__
    self._load_parameters(sos_model, modelrun['narratives'])
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\data_handle.py", line 129, in _load_parameters
    narrative_name, variant_name, parameter
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\store.py", line 551, in read_narrative_variant_data
    return self.data_store.read_narrative_variant_data(key, spec, timestep)
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\file\file_data_store.py", line 95, in read_narrative_variant_data
    return self._read_data_array(path, spec, timestep)
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\file\file_data_store.py", line 376, in _read_data_array
    data_array = DataArray.from_df(spec, dataframe)
  File "c:\miniconda2\envs\ed\lib\site-packages\smif\data_layer\data_array.py", line 153, in from_df
    xr_dataset = dataframe.to_xarray()  # convert to dataset
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\generic.py", line 1690, in to_xarray
    return xarray.Dataset.from_dataframe(self)
  File "c:\miniconda2\envs\ed\lib\site-packages\xarray\core\dataset.py", line 3092, in from_dataframe
    dataframe = dataframe.reindex(full_idx)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\util\_decorators.py", line 127, in wrapper
    return func(*args, **kwargs)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\frame.py", line 2935, in reindex
    return super(DataFrame, self).reindex(**kwargs)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\generic.py", line 3023, in reindex
    fill_value, copy).__finalize__(self)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\frame.py", line 2870, in _reindex_axes
    fill_value, limit, tolerance)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\frame.py", line 2878, in _reindex_index
    tolerance=tolerance)
  File "c:\miniconda2\envs\ed\lib\site-packages\pandas-0.22.0-py3.6-win-amd64.egg\pandas\core\indexes\multi.py", line 1903, in reindex
    raise Exception("cannot handle a non-unique multi-index!")
Exception: cannot handle a non-unique multi-index!
Exception: cannot handle a non-unique multi-index!

Should raise SmifDataMismatch or similar, with a message that names the file (and the narrative, variant and parameter it belongs to) and says which row was duplicated, including the row's details.
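A minimal sketch of such a check, run on the dataframe before `to_xarray()` is called. Everything here is an assumption for illustration, not smif's actual code: the `check_unique_index` helper is hypothetical, the `SmifDataMismatch` class is a local stand-in for the exception named above, the file name `example.csv` is made up, and the sketch assumes every column except `switches_service` is a dimension.

```python
import io

import pandas as pd


class SmifDataMismatch(Exception):
    """Local stand-in for the exception named in this issue (definition assumed)."""


def check_unique_index(dataframe, dims, path):
    """Raise a descriptive error if any combination of `dims` occurs more than once.

    Hypothetical helper: `dims` are the columns that would become the index,
    `path` is only used to build the message.
    """
    # keep=False marks every row of a repeated group, not just the later copies
    repeated = dataframe.duplicated(subset=dims, keep=False)
    if repeated.any():
        rows = dataframe[repeated].to_string(index=False)
        raise SmifDataMismatch(
            "Data in '{}' has more than one row for the same {}:\n{}".format(
                path, dims, rows))


csv_text = """enduses_service_switch,sector,tech,end_yr,switches_service
rs_lighting,,LED,2030,1.0
rs_lighting,,LED,2030,1.0
ss_lighting,,LED,2050,1.0
"""
df = pd.read_csv(io.StringIO(csv_text))
# Assumption: all columns except the value column are dimensions
dims = ["enduses_service_switch", "sector", "tech", "end_yr"]

try:
    check_unique_index(df, dims, "example.csv")
    message = None
except SmifDataMismatch as ex:
    message = str(ex)

print(message)
```

The message lists only the repeated rows, so the user can find them in the file without reading a pandas traceback.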

Also check the case where the duplicated rows carry different values:

enduses_service_switch,sector,tech,end_yr,switches_service
rs_lighting,,LED,2030,1.0
rs_lighting,,LED,2030,1.1

is self-contradictory.
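The two cases (exact repeats versus rows that disagree on the value) could be told apart while the CSV is parsed, so that each gets its own message. This is a sketch under assumptions, using only the standard library: the `find_repeated_keys` helper is hypothetical, the value column is assumed to be `switches_service`, and values are compared as raw strings (so `1.0` and `1.00` would count as different).

```python
import csv
import io
from collections import defaultdict


def find_repeated_keys(csv_text, value_column):
    """Group rows by every column except `value_column` and report repeated keys.

    Returns (exact, conflicting): two lists of (key_dict, line_numbers), where
    `exact` holds keys whose rows all carry the same value and `conflicting`
    holds keys whose rows disagree.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = defaultdict(list)
    for row in reader:
        key = tuple(
            (name, row[name]) for name in reader.fieldnames if name != value_column)
        # reader.line_num is the source line just read (the header is line 1)
        seen[key].append((reader.line_num, row[value_column]))

    exact, conflicting = [], []
    for key, entries in seen.items():
        if len(entries) < 2:
            continue
        values = {value for _, value in entries}  # compared as strings
        lines = [line for line, _ in entries]
        target = exact if len(values) == 1 else conflicting
        target.append((dict(key), lines))
    return exact, conflicting


csv_text = """enduses_service_switch,sector,tech,end_yr,switches_service
rs_lighting,,LED,2030,1.0
rs_lighting,,LED,2030,1.1
ss_lighting,,LED,2050,1.0
ss_lighting,,LED,2050,1.0
"""
exact, conflicting = find_repeated_keys(csv_text, "switches_service")

for key, lines in exact:
    print("duplicate rows at lines {}: {}".format(lines, key))
for key, lines in conflicting:
    print("contradictory rows at lines {}: {}".format(lines, key))
```

Whether an exact repeat should be an error or only a warning is a design choice left open here; the contradictory case has no sensible resolution and probably should always fail.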