arup-group / genet

Manipulate MATSim networks via a Python API.
MIT License

Formatting Schedule to JSON format fails in case of empty attributes #238

Open KasiaKoz opened 6 months ago

KasiaKoz commented 6 months ago

Saving a genet.Schedule to JSON fails when some stops have empty/NaN values for attributes that should be strings. The data type is misinterpreted (presumably because of the NaNs), and pandas throws an error at this line: https://github.com/arup-group/genet/blob/a846ec27662fa8bf19c457acaf36d3f0a85efb0e/src/genet/utils/graph_operations.py#L289

2024-04-12 14:50:49,552 - Saving Schedule to JSON in /path/to/output_json
Traceback (most recent call last):
  File "/path/generate_schedule_vis.py", line 42, in <module>
    s.write_to_json(os.path.join(output_dir, 'output_json'))
  File "/opt/conda/lib/python3.11/site-packages/genet/schedule_elements.py", line 3697, in write_to_json
    json.dump(self.to_json(), outfile)
              ^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genet/schedule_elements.py", line 3673, in to_json
    stops = self.stop_attribute_data(keys=stop_keys)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genet/schedule_elements.py", line 2510, in stop_attribute_data
    return graph_operations.build_attribute_dataframe(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genet/utils/graph_operations.py", line 289, in build_attribute_dataframe
    col_series = pd.Series(attribute_data, dtype=pd_helpers.get_pandas_dtype(attribute_data))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/series.py", line 475, in __init__
    data, index = self._init_dict(data, index, dtype)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/series.py", line 568, in _init_dict
    s = Series(values, index=keys, dtype=dtype)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/series.py", line 512, in __init__
    data = sanitize_array(data, index, dtype, copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/construction.py", line 650, in sanitize_array
    subarr = _try_cast(data, dtype, copy)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pandas/core/construction.py", line 816, in _try_cast
    subarr = np.array(arr, dtype=dtype, copy=copy)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: 'Some/Timezone'
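
The failure can probably be reproduced outside genet. The sketch below assumes that the dtype passed to pd.Series ends up as float because of the NaN entries; the traceback does not confirm how pd_helpers.get_pandas_dtype arrives at that choice. The stop IDs and the helper infer_dtype_ignoring_missing are hypothetical and not part of genet. They only illustrate one possible workaround: choose the dtype from the non-missing values and fall back to object for anything that is not purely numeric.

```python
import math

import pandas as pd


def is_missing(value):
    """Return True for None and float NaN, the 'empty' values in question."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def infer_dtype_ignoring_missing(data):
    """Hypothetical helper (not genet's API): infer a dtype from the
    non-missing values only, so NaNs cannot push a string column to float."""
    present = [v for v in data.values() if not is_missing(v)]
    all_numeric = present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    return float if all_numeric else object


# Made-up stop attribute data: one stop has no timezone, one has a string.
attribute_data = {"stop_1": float("nan"), "stop_2": "Some/Timezone"}

# Forcing float on mixed NaN/string data raises the ValueError seen above.
try:
    pd.Series(attribute_data, dtype=float)
    error_message = None
except ValueError as err:
    error_message = str(err)

# With the dtype taken from the non-missing values, the Series builds.
timezone_series = pd.Series(
    attribute_data, dtype=infer_dtype_ignoring_missing(attribute_data)
)
```

Falling back to object keeps the strings intact and leaves the NaN in place, so any later JSON serialisation would still need to decide how to represent the missing value.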