PowerGridModel / power-grid-model

Python/C++ library for distribution power system analysis
Mozilla Public License 2.0

[BUG] Data validation when update_data components are not present in input_data #715

Open nitbharambe opened 2 months ago

nitbharambe commented 2 months ago

Describe the bug

Calling assert_valid_batch_data with an update_data that contains a component not present in input_data raises a KeyError.

To Reproduce

from power_grid_model import initialize_array
from power_grid_model.validation import assert_valid_batch_data

input_data = {"node": initialize_array("input", "node", 1)}
update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
assert_valid_batch_data(input_data, update_data)

Expected behavior

A ValidationError with a clear error message should be raised instead.
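As a rough illustration of the expected behavior (not the library's actual implementation), the component names of both datasets could be compared before any per-component lookup, so that mismatches surface in one readable message. The names `MissingComponentError`, `find_missing_components` and `check_update_components` are hypothetical and not part of power_grid_model; plain lists stand in for the structured arrays.

```python
class MissingComponentError(Exception):
    """Hypothetical stand-in for a validation error; not a power_grid_model class."""


def find_missing_components(input_data, update_data):
    """Return sorted names of components in update_data that input_data lacks."""
    return sorted(set(update_data) - set(input_data))


def check_update_components(input_data, update_data):
    """Raise a readable error instead of letting a later lookup hit a KeyError."""
    missing = find_missing_components(input_data, update_data)
    if missing:
        raise MissingComponentError(
            "update_data contains components that are not in input_data: "
            + ", ".join(missing)
        )


# Placeholder values; only the dictionary keys matter for this check.
input_data = {"node": [1]}
update_data = {"sym_load": [[2]]}

try:
    check_update_components(input_data, update_data)
except MissingComponentError as error:
    message = str(error)
```

Here `message` names the offending component (`sym_load`) rather than exposing a bare dictionary lookup failure.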

Screenshots

Error:

Cell In[3], line 3
      1 input_data = {"node": initialize_array("input", "node", 1)}
      2 update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
----> 3 assert_valid_batch_data(input_data, update_data)

File z:\zzz\zzz\.venv\Lib\site-packages\power_grid_model\validation\assertions.py:90, in assert_valid_batch_data(input_data, update_data, calculation_type, symmetric)
     60 def assert_valid_batch_data(
     61     input_data: SingleDataset,
     62     update_data: BatchDataset,
     63     calculation_type: Optional[CalculationType] = None,
     64     symmetric: bool = True,
     65 ):
     66     """
     67     The input dataset is validated:
     68 
   (...)
     88         ValidationException: if the contents are invalid.
     89     """
---> 90     validation_errors = validate_batch_data(
     91         input_data=input_data, update_data=update_data, calculation_type=calculation_type, symmetric=symmetric
     92     )
     93     if validation_errors:
     94         raise ValidationException(validation_errors, "update_data")
...
--> 690     invalid = np.isin(data[component]["id"], ref_data[component]["id"], invert=True)
    691     if invalid.any():
    692         ids = data[component]["id"][invalid].flatten().tolist()

KeyError: 'sym_load'

petersalemink95 commented 2 months ago

I agree that a ValidationError should be raised here.

TonyXiang8787 commented 2 months ago

@nitbharambe maybe we need to think about this. Is raising an error always the logical choice?

Users may, for whatever reason, treat a non-existent component as a zero-length array. Maybe the logic should be: if a component exists in the batch dataset but not in the input, we only raise an error if the width of this batch component array is not zero.
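The rule proposed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: `batch_width` and `components_to_reject` are hypothetical helper names, the batch arrays are assumed to be dense with shape (n_scenarios, n_elements), and nested lists stand in for NumPy arrays so the example runs without extra dependencies.

```python
def batch_width(batch_array):
    """Elements per scenario in a dense (n_scenarios, n_elements) batch array.

    Uses .shape when available (e.g. NumPy arrays); falls back to nested lists.
    """
    shape = getattr(batch_array, "shape", None)
    if shape is not None:
        return shape[-1]
    return len(batch_array[0]) if len(batch_array) > 0 else 0


def components_to_reject(input_data, update_data):
    """Names of update components missing from input_data that hold elements."""
    return sorted(
        name
        for name, batch_array in update_data.items()
        if name not in input_data and batch_width(batch_array) > 0
    )


input_data = {"node": [1]}
update_data = {
    "sym_load": [[7]],  # one scenario, one element: should be reported
    "line": [[], []],   # two scenarios, zero elements: tolerated
}
rejected = components_to_reject(input_data, update_data)
```

With this rule only `sym_load` ends up in `rejected`; the zero-width `line` entry is treated like an absent component.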

nitbharambe commented 2 months ago

> @nitbharambe maybe we need to think about this. Is raising an error always the logical choice?
>
> Users may, for whatever reason, treat a non-existent component as a zero-length array. Maybe the logic should be: if a component exists in the batch dataset but not in the input, we only raise an error if the width of this batch component array is not zero.

Yes, good point! It's better to cover that situation too.