spine-tools / Spine-Database-API

Database interface to Spine generic data model
https://www.tools-for-energy-system-modelling.org/
GNU Lesser General Public License v3.0
6 stars 5 forks source link

purge_url does not purge #261

Closed soininen closed 11 months ago

soininen commented 11 months ago

Purging removes data from the cache, not from the database. purge_url() creates a new DatabaseMapping but doesn't load the cache so nothing gets purged.

manuelma commented 11 months ago

Purging removes data from the cache, not from the database

Right, and then when the user commits their changes, the items that were removed from cache by the purge should also be removed from the DB. Is that not the case?

I think there is a more general issue with cascade removal though, doesn't seem to be working. The way I wanted it to work was by fetching descendants in DBCacheBase.dirty_items(), but maybe I missed something.

soininen commented 11 months ago

Right, and then when the user commits their changes, the items that were removed from cache by the purge should also be removed from the DB. Is that not the case?

purge_url() never loads anything into the cache so nothing gets removed. This breaks purging e.g. when done by Connetion during DAG execution.

Is purging in general broken if it removes data from the cache only? Are we sure all items of purged type are in the cache?

manuelma commented 11 months ago

You're dead right, DatabaseMapping.remove_items() assumes that the removed items are in the cache. This is the case when removing from DB editor, because the user cannot remove something they don't see. But it is not the general case at all!

So I think remove_items should fetch too, until it finds the items the user is asking to remove.

But also, maybe it is a mistake to implement purge_items based on remove_items, maybe purge should be its own operation? Like mark a table as purged in the cache, and then delete the entire contents of that table in the DB when committing. This would save enormous time eventually if the user purges a huge table. It is actually almost ridiculous to fetch all the rows just to say, okay, now remove them all.

soininen commented 11 months ago

So I think remove_items should fetch too, until it finds the removed items.

Yes, indeed. I was going to make DatabaseMappingRemoveMixin.purge() fetch all before purging but it does make more sense to fetch in remove_items(). Not all but just enough to get the removal done.

But also, maybe it is a mistake to implement purge_items based on remove_items, maybe purge should be its own operation?

This is such a no-brainer optimisation, right? :) It was the same before the entity changes when we fetched all ids just to say now remove them all.

manuelma commented 11 months ago

it does make more sense to fetch in remove_items()

I am looking at the code and it seems like we already attemp to fetch in there... not sure what's missing then.

soininen commented 11 months ago

I am looking at the code and it seems like we already attemp to fetch in there... not sure what's missing then.

Can you point me to the line where we fetch? I cannot seem find the place.

manuelma commented 11 months ago

First line in db_cache_base._TableCache.remove_item() calls find_item(), and find_item calls fetch_ref()

soininen commented 11 months ago

First line in db_cache_base._TableCache.remove_item() calls find_item(), and find_item calls fetch_ref()

Thanks! The problem is that when Asterisk is passed as id to DatabaseMapping.remove_items() -- as we do when purging -- we replace it by ids in the cache which may not contain all possible ids. So in the case of Asterisk we need to fetch all before we resolve the ids.