Open porunov opened 4 months ago
Currently I'm stuck on the following two TinkerPop tests:
Error: Failures:
Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_E_propertiesXweightX_drop
Error: Run 1: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1>
Error: Run 2: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1>
[INFO]
Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_V_properties_propertiesXstartTimeX_drop
Error: Run 1: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<1>
Error: Run 2: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<2>
The first test checks that drop()
works for Edge properties. The second one checks that drop()
works for meta properties.
For some reason those two TinkerPop tests are failing with this PR. However, I verified that the removal operation works in this PR by adding testMetaPropertiesDrop
and testEdgePropertiesDrop
tests (similarly as in those TinkerPop tests).
It can be seen that these added tests are passing. For now the reason for these two TinkerPop tests to fail is unknown to me.
Additional pair of eyes would me much appreciated.
TinkerPop tests for inmemory database can be triggered with:
mvn clean install --projects janusgraph-inmemory -DskipTests=true --batch-mode --also-make
mvn verify --projects janusgraph-inmemory -Dtest.skip.tp=false -Dtest=org.janusgraph.blueprints.inmemory.process.InMemoryMultiQueryJanusGraphProcessTest -pl janusgraph-inmemory
One of the observations is that if I change the default barrier step size from the default 2500
to 1
(query.batch.limited-size=1
) then all tests passes.
I.e. it means that the fact that we store those properties in the barrier before we actually remove them changes the result of those two tests.
Currently I'm stuck on the following two TinkerPop tests:
Error: Failures: Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_E_propertiesXweightX_drop Error: Run 1: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1> Error: Run 2: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1> [INFO] Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_V_properties_propertiesXstartTimeX_drop Error: Run 1: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<1> Error: Run 2: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<2>
The first test checks that
drop()
works for Edge properties. The second one checks thatdrop()
works for meta properties. For some reason those two TinkerPop tests are failing with this PR. However, I verified that the removal operation works in this PR by addingtestMetaPropertiesDrop
andtestEdgePropertiesDrop
tests (similarly as in those TinkerPop tests). It can be seen that these added tests are passing. For now the reason for these two TinkerPop tests to fail is unknown to me. Additional pair of eyes would me much appreciated.TinkerPop tests for inmemory database can be triggered with:
mvn clean install --projects janusgraph-inmemory -DskipTests=true --batch-mode --also-make mvn verify --projects janusgraph-inmemory -Dtest.skip.tp=false -Dtest=org.janusgraph.blueprints.inmemory.process.InMemoryMultiQueryJanusGraphProcessTest -pl janusgraph-inmemory
I noticed, I am not sure if it is related, but you might have missed adding JanusGraphDropStep
here:
https://github.com/JanusGraph/janusgraph/blob/a2e3521d28de37ca01d08166777d77c0943a9db5/janusgraph-backend-testutils/src/main/java/org/janusgraph/graphdb/database/LazyLoadGraphTest.java#L40
I noticed, I am not sure if it is related, but you might have missed adding JanusGraphDropStep here:
JanusGraphDropStep
is replaced as part of JanusGraphLocalQueryOptimizerStrategy
. So I believe it should be fine.
I feels that those two TinkerPop tests don't like that I'm storing properties for removal in the barrier step for some reason.
However, I can't figure out "why" yet.
After a bit of debugging I found the root cause of the problem.
When removing Edge properties I see that each edge property is wrapped into B_O_Traverser
which is passed to NoOpBarrier
step. That traverser is added to the TraverserSet
here.
After I loaded the same graph as in TinkerPop tests locally I noticed that 4 out of 6 edge properties were removed, but 2 didn't because they were considered the "same" properties as were already added into the TraverserSet
(thus their Traverser's merged).
The root cause is that relation
is not checked on equality for SimpleJanusGraphProperty
. This bug was introduced 9 years ago here: https://github.com/JanusGraph/janusgraph/commit/795d7b0131c1e62fdf354e0a8e4718d6627f1418
The fix should be a simple one and I will add a separate test case with barrier()
+ drop()
of edge properties with the same key
and value
.
After a bit of debugging I found the root cause of the problem. When removing Edge properties I see that each edge property is wrapped into
B_O_Traverser
which is passed toNoOpBarrier
step. That traverser is added to theTraverserSet
here. After I loaded the same graph as in TinkerPop tests locally I noticed that 4 out of 6 edge properties were removed, but 2 didn't because they were considered the "same" properties as were already added into theTraverserSet
(thus their Traverser's merged). The root cause is thatrelation
is not checked on equality forSimpleJanusGraphProperty
. This bug was introduced 9 years ago here: 795d7b0The fix should be a simple one and I will add a separate test case with
barrier()
+drop()
of edge properties with the samekey
andvalue
.
Huh. Seems I was a bit wrong. If I fix that drop
tests then dedup
tests start to fail. Hmm. Now I'm confused where the fix should happen.
Seems I'm missing how barrier
step suppose to work. It could be that we actually don't have any bug which I described above and it's actually intended to be like that (at least I verified that the behavior is consistent between TinkerGraph and JanusGraph, so I guess we are fine there).
However, I think it makes sense to double check with the TinkerPop community regrading barrier
and dedup
behavior. Thus, I opened a thread here: https://discord.com/channels/838910279550238720/1241219665376706603
If that is confirmed to be the intended behavior then I will try to think about a bit different solution.
One of the potential solutions is to make out own NoOpBarrierStep
which skips anything except Vertex
and insert that barrier step as part of multi-query optimization instead of the standard NoOpBarrierStep
.
However, there are many places in code logic where we need to check and sometimes cast a step to NoOpBarrierStep
. It would make sense to extend NoOpBarrierStep
and overwrite part of the logic, but NoOpBarrierStep
is currently final
and the necessary properties are private
which makes things a bit challenging.
Potentially, when the following PR is merged and the next TinkerPop version is released it should be easier to extend logic: https://github.com/apache/tinkerpop/pull/2612
Both TinkerPop PRs are now merged (https://github.com/apache/tinkerpop/pull/2612 and https://github.com/apache/tinkerpop/pull/2609). This will allow us to extend NoOpBarrierStep
to skip everything except vertices. Thus should be also an improvement by itself because there is no reason for multi-query strategy to inject barrier
steps which trigger batching for anything except edges because we multi-query step registers Vertices
only as for now, and thus anything else is just a potential waist of operations.
When we replace it with our own barrier step it will allow us:
drop()
step.Now we need to wait for TinkerPop 3.7.3 to proceed without doing major hacks.
The PR is now ready for the review. All tests passed as can be seen from my branch (here).
Notes:
master
branch which makes tp-tests for berkeleydb fail often. This problem appeared after merging the following PR #4425 .
DropStep
drop()
stepBenchmarks:
Conclusion: same as with direct usage of multi-query the usage of Gremlin
drop()
step benefits from re-using multi-query capabilities. Performance ofdrop
query processing improved more than 13 times.TODO:
TinkerPop
3.7.3-SNAPSHOT to 3.7.3 when it's released.Related https://github.com/apache/tinkerpop/pull/2609 Related https://github.com/apache/tinkerpop/pull/2612
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
master
)?For code changes:
For documentation related changes: