graphsense / graphsense-REST

A REST service for accessing cryptocurrency data stored in Apache Cassandra.
MIT License

Improve robustness of entity tags #64

Closed myrho closed 2 years ago

myrho commented 2 years ago

ISSUE COPIED FROM FORKED REPO

FORMER AUTHOR: myrho CREATED AT: 2021-09-14T10:57:54Z

Entity tags are tied to the entity id, which equals the smallest internal id of the addresses the entity contains. The entity id therefore changes as soon as another of its addresses becomes the one with the smallest id, and the relation to the entity tag is then lost.
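To make the failure mode concrete, here is a minimal sketch. The address ids, the tag label and the `tags_by_entity` dict are made up for illustration and are not the actual GraphSense schema.

```python
def entity_id(address_ids: set[int]) -> int:
    """Entity id as described above: the smallest internal address id."""
    return min(address_ids)


# Hypothetical address ids, for illustration only.
cluster = {17, 42, 99}

# The tag is stored under the entity id valid at tagging time (17).
tags_by_entity = {entity_id(cluster): ["exchange"]}

# An address with a smaller id becomes part of the entity.
cluster.add(5)
new_id = entity_id(cluster)

# A lookup under the new entity id no longer finds the tag.
orphaned = tags_by_entity.get(new_id, [])
```

The tag is still stored under 17, but the entity is now known as 5, so the link between the two is gone.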

Four use cases are affected by such an entity id change:

  1. given a label, retrieve entity tags (/tags?label={label}), given the entity id of these tags, retrieve the entity
  2. given a local entity tag for an entity id, retrieve the entity for this entity id
  3. /entities/{entity}/tags: given an entity id, retrieve its tags
  4. given an entity id, retrieve local tags

1 and 2 can be recovered as follows: if the lookup for the given entity id fails (because the entity id of the tagged entity changed), look up the new entity id by querying the address table with the tag's entity id, which is also an address id. Hence any address id contained in an entity serves as a reference to the entity, not just the smallest one.
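A rough sketch of that fallback, using in-memory dicts as stand-ins for the entity and address tables. The names `entities`, `address_to_entity` and `resolve_entity` are hypothetical and not part of the existing code base.

```python
from typing import Optional

# Hypothetical stand-ins for the real tables.
# entity id -> address ids contained in that entity
entities: dict[int, set[int]] = {5: {5, 17, 42, 99}}
# address id -> id of the entity the address currently belongs to
address_to_entity: dict[int, int] = {a: 5 for a in entities[5]}


def resolve_entity(tag_entity_id: int) -> Optional[int]:
    """Return the current entity id for an id stored in a tag.

    First try the id as an entity id. If that fails, treat it as an
    address id and look up the entity that address belongs to now.
    """
    if tag_entity_id in entities:
        return tag_entity_id
    return address_to_entity.get(tag_entity_id)
```

With this data, `resolve_entity(17)` falls back to the address lookup and yields 5, while an id that matches neither an entity nor an address yields `None`.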

3 can be recovered by reassigning tags to the changed entity ids during the transformation if possible.
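One way the reassignment step could look, again as a sketch over plain dicts. `reassign_tags` and its arguments are hypothetical names, and the real transformation may be structured quite differently.

```python
def reassign_tags(
    tags_by_entity: dict[int, list[str]],
    address_to_entity: dict[int, int],
) -> tuple[dict[int, list[str]], dict[int, list[str]]]:
    """Re-key entity tags by the current entity id.

    Each stored entity id is interpreted as an address id and mapped to
    the entity that address belongs to now. Tags whose id cannot be
    mapped are returned separately instead of being dropped.
    """
    reassigned: dict[int, list[str]] = {}
    unresolved: dict[int, list[str]] = {}
    for old_id, tags in tags_by_entity.items():
        new_id = address_to_entity.get(old_id)
        if new_id is None:
            unresolved[old_id] = list(tags)
        else:
            reassigned.setdefault(new_id, []).extend(tags)
    return reassigned, unresolved
```

Returning the unresolved tags separately covers the "if possible" part: they can be logged or reviewed rather than silently lost.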

4 does not seem recoverable, I fear. Maybe we could keep a history of entity ids per entity? Users could then query their local tags for each historic entity id. Since entity ids probably do not change very often, this list should grow slowly.
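A sketch of how such a history might be used on the client side. The `id_history` mapping (current entity id to its former ids) is an assumption about what the service would expose, and `local_tags_for_entity` is a hypothetical helper.

```python
def local_tags_for_entity(
    current_id: int,
    id_history: dict[int, list[int]],
    local_tags: dict[int, list[str]],
) -> list[str]:
    """Collect local tags stored under the current or any former entity id."""
    found: list[str] = []
    for eid in [current_id, *id_history.get(current_id, [])]:
        found.extend(local_tags.get(eid, []))
    return found
```

For example, with `id_history = {5: [17]}` and a local tag stored under 17, querying entity 5 still returns that tag.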

Thoughts, @graphsense/core-developers?