spine-tools / Spine-Database-API

Database interface to Spine generic data model
https://www.tools-for-energy-system-modelling.org/
GNU Lesser General Public License v3.0
6 stars 5 forks source link

Make database mapping work when fetching external changes #333

Closed soininen closed 5 months ago

soininen commented 5 months ago

This PR fixes numerous issues related to fetching from a database that has been modified by external source. Most issues were related to situations where the external source modified an existing item that had already been fetched or reused a database id of a fetched but later removed (and removal-committed) item.

Perhaps the most impactful change is that all in-memory items are now given their own temporary id. This circumvents the issues of external changes reusing database ids of in-memory items. Because all items now have temporary ids, the previous TempId class has been replaced by plain negative numbers. Mapped tables hold mappings from item ids to database ids and vice-versa.

When fetching data, we do not "trust" the database ids anymore: we always check the actual values of unique keys when we need to determine if a fetched item is the same one as an in-memory item. Sometimes this means that we need to make explicit database queries to, for example, find the real entity class name of a fetched parameter value alternative. These should be rare corner cases, however.

This PR also implements a very rudimentary merge conflict resolution. By default, in-memory data gets precedence over fetched data as was the case previously. Perhaps one day this functionality can be expanded to more comprehensive merge tool.

See the complementary PR spine-tools/Spine-Toolbox#2477

Fixes spine-tools/Spine-Toolbox#2431

@manuelma I would highly appreciate your feedback on this.

Checklist before merging

codecov-commenter commented 5 months ago

Codecov Report

Attention: 80 lines in your changes are missing coverage. Please review.

Comparison is base (43c1bb6) 77.14% compared to head (89e44fd) 77.31%.

Files Patch % Lines
spinedb_api/mapped_items.py 65.89% 32 Missing and 12 partials :warning:
spinedb_api/conflict_resolution.py 74.24% 16 Missing and 1 partial :warning:
spinedb_api/db_mapping_base.py 92.46% 8 Missing and 7 partials :warning:
spinedb_api/item_id.py 91.66% 2 Missing and 1 partial :warning:
spinedb_api/server_client_helpers.py 0.00% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## 0.8-dev #333 +/- ## =========================================== + Coverage 77.14% 77.31% +0.17% =========================================== Files 75 77 +2 Lines 9239 9602 +363 Branches 1949 2035 +86 =========================================== + Hits 7127 7424 +297 - Misses 1780 1828 +48 - Partials 332 350 +18 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

manuelma commented 5 months ago

I am running into a lot of stuff like the below after this:

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/manuel/miniconda3/envs/dev-spinetoolbox/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/manuel/miniconda3/envs/dev-spinetoolbox/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/spine_db_server.py", line 370, in _do_work
    result = handler(*args, **kwargs)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/spine_db_server.py", line 409, in _do_export_data
    return dict(result=export_data(self._db_map, parse_value=_parse_value, **kwargs))
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/export_functions.py", line 67, in export_data
    "parameter_values": export_parameter_values(db_map, parameter_value_ids, parse_value=parse_value),
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/export_functions.py", line 168, in export_parameter_values
    return sorted(
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/export_functions.py", line 169, in <genexpr>
    (
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/export_functions.py", line 79, in _get_items
    for item in _get_items_from_db_map(db_map, tablename, ids):
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/export_functions.py", line 85, in _get_items_from_db_map
    db_map.fetch_all(tablename)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping.py", line 656, in fetch_all
    self.do_fetch_all(item_type)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 326, in do_fetch_all
    self.do_fetch_more(item_type, offset=0, limit=None, resolve_conflicts=resolve_conflicts, **kwargs)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 306, in do_fetch_more
    item, add_status = mapped_table.add_item_from_db(x)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 595, in add_item_from_db
    item = self._make_and_add_item(item)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 550, in _make_and_add_item
    item.convert_db_ids_to_item_ids()
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 841, in convert_db_ids_to_item_ids
    self[id_field] = self._find_or_fetch_item_id(item_type, self._item_type, field, self._db_map)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 859, in _find_or_fetch_item_id
    return db_map.find_item_id(item_type, db_id)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/db_mapping_base.py", line 237, in find_item_id
    return self.mapped_table(item_type).id_map.item_id(db_id)
  File "/home/manuel/Codes/spine/dev/Spine-Database-API/spinedb_api/item_id.py", line 50, in item_id
    return self._item_id_by_db_id[db_id]
KeyError: 71

Looking at the code (which I probably should have done before when asked), it seems quite complex. Will try to find things out... Any help is appreciated @soininen

soininen commented 5 months ago

@manuelma Do you have a database available that has this behavior? I could also have a look.

soininen commented 5 months ago

Looks like we are fetching parameter values and there is a record that refers to another item (entity class, entity, parameter definition, alternative) that is expected to have been fetched already but actually is not.

soininen commented 5 months ago

I'm not sure why this happens, as we should fetch all items of the given type on lines 858 and 860 in db_mapping_base.py.

manuelma commented 5 months ago

No worries @soininen - will try to reproduce... unfortunately I can't share the data as-is this time.

soininen commented 5 months ago

@manuelma Thanks a lot for the review! Since this PR has been merged already, I will open a new one to address the issues raised.