Reconstruction SQL:
```sql
SELECT SubjectResource.name, SubjectResource.type,
       ObjectResource.name, ObjectResource.type,
       Statement.added_count, Statement.removed_count, Statement.local_operation,
       Peer.public_key, Peer.added_at,
       StatementOp.operation, StatementOp.clock, StatementOp.signature,
       StatementOp.updated_at, StatementOp.auto_generated
FROM StatementOp
INNER JOIN Peer ON StatementOp.peer = Peer.id
INNER JOIN Statement ON StatementOp.statement = Statement.id
INNER JOIN Resource AS SubjectResource ON Statement.subject = SubjectResource.id
INNER JOIN Resource AS ObjectResource ON Statement.object = ObjectResource.id;
```
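Note that the column list avoids the old row ids on purpose: every result tuple is self-describing. As a minimal sketch, the rows could be fetched with plain `sqlite3` like this (`RECONSTRUCTION_SQL` and the database path are placeholders, not existing Tribler names):

```python
import sqlite3

RECONSTRUCTION_SQL = "SELECT SubjectResource.name, ..."  # the full query above

con = sqlite3.connect("old_tribler.db")  # placeholder path to the old database
for row in con.execute(RECONSTRUCTION_SQL):
    (subj_name, subj_type, obj_name, obj_type,
     added_count, removed_count, local_operation,
     peer_public_key, peer_added_at,
     operation, clock, signature, updated_at, auto_generated) = row
    # Each tuple fully describes one StatementOp without referencing old ids.
```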
```sql
SELECT ChannelNode.infohash, ChannelNode.size, ChannelNode.torrent_date,
       ChannelNode.tracker_info, ChannelNode.title, ChannelNode.tags,
       ChannelNode.metadata_type, ChannelNode.reserved_flags, ChannelNode.origin_id,
       ChannelNode.public_key, ChannelNode.id_, ChannelNode.timestamp,
       ChannelNode.signature, ChannelNode.added_on, ChannelNode.status,
       ChannelNode.xxx, ChannelNode.tag_processor_version,
       TorrentState.seeders, TorrentState.leechers, TorrentState.last_check,
       TorrentState.self_checked, TorrentState.has_data
FROM ChannelNode
INNER JOIN TorrentState ON ChannelNode.health = TorrentState.rowid;
```
A simple re-insert is possible here (the `url` field is UNIQUE, so duplicates are automatically rejected):
```sql
SELECT TorrentState.infohash, TrackerState.url
FROM TorrentState_TrackerState
INNER JOIN TorrentState ON TorrentState_TrackerState.torrentstate = TorrentState.rowid
INNER JOIN TrackerState ON TorrentState_TrackerState.trackerstate = TrackerState.rowid;
```
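A minimal sketch of that re-insert, leaning on the constraint via `INSERT OR IGNORE` (`TRACKER_SQL`, the paths, and the reduced column list are placeholders; the real `TrackerState` table may carry more columns):

```python
import sqlite3

TRACKER_SQL = "SELECT TorrentState.infohash, ..."  # the full query above

old_db = sqlite3.connect("old_tribler.db")  # placeholder paths
new_db = sqlite3.connect("new_tribler.db")
for infohash, url in old_db.execute(TRACKER_SQL):
    # The UNIQUE constraint on TrackerState.url turns duplicate inserts into no-ops.
    new_db.execute("INSERT OR IGNORE INTO TrackerState (url) VALUES (?)", (url,))
    # The TorrentState_TrackerState link row can then be rebuilt by looking up
    # the fresh rowids for this infohash/url pair.
new_db.commit()
```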
Our upgrade script uses `iterdump` to get the schema and content of an existing database:
https://github.com/Tribler/tribler/blob/b8b2563898194b601ddc9d05f92da41a009e8168/src/tribler/upgrade_script.py#L104
Then, if an existing database is found, the lines of the dump are sequentially executed to inject the old data into the new database:
https://github.com/Tribler/tribler/blob/b8b2563898194b601ddc9d05f92da41a009e8168/src/tribler/upgrade_script.py#L110-L114
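Conceptually, that loop boils down to the following sketch (simplified: file names are placeholders and the real script's error handling may differ):

```python
import sqlite3

old_db = sqlite3.connect("old_tribler.db")  # placeholder paths
new_db = sqlite3.connect("new_tribler.db")

for statement in old_db.iterdump():  # yields the dump one SQL statement at a time
    try:
        new_db.execute(statement)
    except sqlite3.DatabaseError:
        pass  # statements that conflict with existing schema or data are skipped
new_db.commit()
```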
This does exactly that. However, the content included in this dump also contains the rows' primary-key ids.
This is where the approach breaks down: if an id already exists in the new database, the imported row is silently dropped. It gets worse: if a row in another table that references this id does get imported, it may end up pointing at a completely different object that happens to carry the same id. I believe this causes errors like #8111.
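To make the clash concrete, here is a self-contained illustration (table layout and values are invented):

```python
import sqlite3

new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE Resource (id INTEGER PRIMARY KEY, name TEXT)")
new_db.execute("INSERT INTO Resource VALUES (1, 'ubuntu')")  # already in the new db

# The old dump also used id 1, but for a different resource:
try:
    new_db.execute("INSERT INTO Resource VALUES (1, 'linux')")
except sqlite3.IntegrityError:
    pass  # the old 'linux' row is silently dropped

# Any imported row from another table that still references id 1 now points
# at 'ubuntu' instead of 'linux'.
```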
In short, I think we need to change our approach and reconstruct objects from the existing data. Instead of copying the data verbatim, we would create actual new objects (in the ORM) from the old data. The downside is that this is much slower than our blind `iterdump` approach.
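For the statement data above, that reconstruction could look roughly like this. This is only a sketch: the entity names mirror the tables in the reconstruction SQL, but the exact Pony ORM constructor arguments and the `db` handle are assumptions:

```python
from pony.orm import db_session

@db_session
def import_statement_ops(rows, db):
    """Re-create ORM objects from the tuples produced by the reconstruction
    SQL, letting the ORM assign fresh ids instead of reusing old ones."""
    for (subj_name, subj_type, obj_name, obj_type,
         added_count, removed_count, local_operation,
         peer_public_key, peer_added_at,
         operation, clock, signature, updated_at, auto_generated) in rows:
        subject = (db.Resource.get(name=subj_name, type=subj_type)
                   or db.Resource(name=subj_name, type=subj_type))
        obj = (db.Resource.get(name=obj_name, type=obj_type)
               or db.Resource(name=obj_name, type=obj_type))
        statement = (db.Statement.get(subject=subject, object=obj)
                     or db.Statement(subject=subject, object=obj,
                                     added_count=added_count,
                                     removed_count=removed_count,
                                     local_operation=local_operation))
        peer = (db.Peer.get(public_key=peer_public_key)
                or db.Peer(public_key=peer_public_key, added_at=peer_added_at))
        db.StatementOp(statement=statement, peer=peer,
                       operation=operation, clock=clock, signature=signature,
                       updated_at=updated_at, auto_generated=auto_generated)
```

The get-or-create lookups per object are exactly where the slowdown relative to the blind dump replay would come from.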