lilactown / pyramid

A library for storing and querying graph data in Clojure
Eclipse Public License 2.0
228 stars 10 forks source link

Data loss in p/add with nested maps #14

Closed smichal closed 2 years ago

smichal commented 2 years ago

Hi, I've noticed that in some cases pyramid normalisation replaces an entity with a reference but losses data of this entity. Example:

(p/add {} {:a/id 1 :b [{:c {:d/id 1 :d/txt "a"}}]})
=>  #:a{:id
      {1 {:a/id 1, :b [{:c [:d/id 1]}]}}}
lilactown commented 2 years ago

I can reproduce this. The issue is dual pronged, starting with this line in the normalize function

https://github.com/lilactown/pyramid/blob/main/src/pyramid/core.cljc#L93-L94

basically, normalization doesn't descend into collections of maps where the maps are not entities. The map inside of the vector doesn't contain a key whose name is id, which is why it's not considered an entity. This means normalization misses the entity identified by :d/id.

However, we do a second pass to replace entities with their references. This does descend into collections of maps even if they are not entities

https://github.com/lilactown/pyramid/blob/main/src/pyramid/core.cljc#L70-L71

I think what I'd like to do is replace these two functions with a single pass process that both adds entities to the db and replaces them with references, so that we can ensure the way that each collection is visited is the same so that we don't run into issues like this where we lose data.

Thank you for the report! I've added a failing test here: https://github.com/lilactown/pyramid/commit/972ecebbdfa80d73b724168cfb8ec25cdda78fbd

lilactown commented 2 years ago

Fixed in https://github.com/lilactown/pyramid/pull/19