Changing to elementId from Id causes a more than 6x increase in query traffic

gliwka commented 1 week ago

We've updated our Spring Data Neo4j version from 7.0.0 to the latest version 7.3.4. After the update we've noticed network issues, connection read timeouts and a CPU-saturated AuraDB Enterprise instance on regular repository.findById() query on an entity with many nested relationships.

Further investigation showed, that we suddenly had more than 6x outgoing traffic to our Neo4j instance. A few megabytes per second suddenly turned into 60-80MB/s. That's outgoing traffic (queries) - not the data we fetch, which is much lower.

Enabling logging on the driver level revealed, that the sent queries are indeed multiples larger. The issue seems to be with the queries SDN generates once it wants to fetch a graph with a structure possibly containing cycles, thus SDN falling back on the cascading / N+1 query generation for each level as documented here.

In our case those queries often contain thousands of IDs. Those IDs before (7.0.0) were 64-Bit Longs, occupying only ~8 Bytes - i.e. "2023663". With the introduction of the elementId (i.e. "4:2da1ca6d-a09f-41be-8dff-230d14780c2b:2023663"), which is a string with ~45 bytes on average, the queries are significantly larger since the hundreds of ids are now the large part of the payload. (no. of bytes is just a naive approximation, the bolt protocol level might handle those differently, but the same principle applies).

The bad thing is that all ids for all nodes in our database start with the prefix "4:2da1ca6d-a09f-41be-8dff-230d14780c2b" - so this information is highly redundant in our query.

We could furthermore verify this by changing the Dialect from "Dialect.NEO4J_5" to "Dialect.NEO4J_4" to trigger the logic here switching back to the Id function:

https://github.com/spring-projects/spring-data-neo4j/blob/fef71a289fd00aad94e9febbbe4aaabf3abe2f45/src/main/java/org/springframework/data/neo4j/core/mapping/SpringDataCypherDsl.java#L39-L43

This has the desired effect, our bandwidth usage is now more than 6x lower and the database and network connection is happy again.

Here's one of the more problematic queries: https://gist.github.com/gliwka/96415da55e69429f9c4a5033c54ab124

I understand, that id() is deprecated, however the elementId() in combination with the N+1 query model for potentially circular data models (ours certainly is an DAG, but that's another issue - https://github.com/spring-projects/spring-data-neo4j/issues/2840) and the resulting transfer of hundreds or thousands of elementsIds, that are now multiple times larger, is a bad combination, that does not bring any additional value to us.

For the meantime we've downgraded to the Neo4j 4 dialect and are investigating the use of projections to fetch the data in a different way to be able to upgrade back to Neo4j v5, however we're thankful for any other pointers.

michael-simons commented 1 week ago

Hello. We will answer here in more detail soon, but right now just a couple of things

I hear you in terms of traffic and overhead. It’s terrible, but I can’t change it
id() is deprecated and as such you bring yourself into a terrible situation using the old dialect. We from SDN can’t and won’t give any guarantees how long it will be available in the database still. Neo4j next will soon be out
Another solution is using externally generated UUIDs, SDN provides a generator for that. They can be used as String or UUID on the model. I would go with them. It’s generally the way better idea than using the database physical id

Please be aware that your log will be swamped now with depreciation warnings. You might wanna adapt the level for those notifications https://docs.spring.io/spring-data/neo4j/reference/appendix/logging.html

gliwka commented 1 week ago

Looking forward to it, let me know if any more details would be helpful 😄

We're already using external generated UUIDs on our entities as identifiers.

The id/elementId discussion is just part of the generated query by SDN for those cascading queries whenever we try to load the entity using the finder methods. They're not part of our business logic or entities - there we use exclusively UUIDs. However using our UUIDs for the same purposes would pose the same challenge.

Staying on the older dialect is not a long term solution, just one that allows us to update our Spring Boot Version and also get the fix for cascading updates=false on the Relationships to solve another problem we have with SDN + Optimistic Locking (currently we have a lot of projections in our codebase for the write path to stop cascading updates).

We're trying to migrate to projections for the read path and/or custom queries get those generated queries under control, if no other solution should become available to switch the dialect back to 5 as soon as possible, however we would appreciate another solution.

spring-projects / spring-data-neo4j

Changing to elementId from Id causes a more than 6x increase in query traffic #2957