JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org

Range query is not supported for date #1254

Closed pra527vin closed 3 years ago

pra527vin commented 6 years ago

I have been using JanusGraph in remote server mode. For the data I inserted, I have the following queries:

Query 1:

g.V().hasLabel('NewsDocument').has('publishedDate', between(begin.getTime, end.getTime)).
  outE('belongsTo').inV().hasLabel('NewsPaper').has('identifier', 'xyz').
  inE('belongsTo').outV().hasLabel('NewsDocument').unique()

Query 2:

g.V().hasLabel('NewsPaper').has('identifier', 'xyz').
  inE('belongsTo').outV().hasLabel('NewsDocument').
  has('publishedDate', between(begin.getTime, end.getTime))

begin.getTime() and end.getTime() return the date and time as a long. 'Query 1' and 'Query 2' are equivalent, but 'Query 1' works and 'Query 2' does not.

Query 2 fails with a "long cannot be cast to Date" error.

pluradj commented 6 years ago

Please provide more details on how you defined the data types, and provide some data that shows how you reached the error.

pra527vin commented 6 years ago

This is how I defined the data types:

mgmt = graph.openManagement()
mgmt.makeVertexLabel('NewsPaper').make()
mgmt.makeVertexLabel('NewsDocument').make()
mgmt.makeEdgeLabel('belongsTo').make()
name = mgmt.makePropertyKey('name').dataType(String.class).cardinality(Cardinality.SINGLE).make()
publishedDate = mgmt.makePropertyKey('publishedDate').dataType(Date.class).cardinality(Cardinality.SINGLE).make()
identifier = mgmt.makePropertyKey('identifier').dataType(String.class).cardinality(Cardinality.SINGLE).make()
mgmt.buildIndex('identifierIndex', Vertex.class).addKey(identifier).buildCompositeIndex()
mgmt.buildIndex('publishedDateIndex', Vertex.class).addKey(publishedDate).buildCompositeIndex()
mgmt.buildIndex('nameIndex', Vertex.class).addKey(name).buildCompositeIndex()
mgmt.commit()

This is what my data looks like:

g.addV('NewsDocument').property('identidier','xyz').property('publishedDate',1536838704820).addE('belongsTo').to(addV('NewsPaper').property('identifier','abc').property('name','Kantipur'))

pluradj commented 6 years ago

@pra527vin Thanks for the details. It makes it much easier to reproduce the error when bug reports contain these details.

I ran this in the Gremlin Console:

graph = JanusGraphFactory.open('inmemory')

mgmt = graph.openManagement()
publishedDate = mgmt.makePropertyKey('publishedDate').dataType(Date.class).cardinality(Cardinality.SINGLE).make()
identifier = mgmt.makePropertyKey('identifier').dataType(String.class).cardinality(Cardinality.SINGLE).make()
mgmt.buildIndex('identifierIndex', Vertex.class).addKey(identifier).buildCompositeIndex()
mgmt.commit()

g = graph.traversal()
g.addV().property('identifier','xyz').property('publishedDate',1536838704820).as('doc').
  addV().property('identifier','abc').as('paper').
  addE('belongsTo').from('doc').to('paper')
g.tx().commit()

d = g.V().has('identifier', 'xyz').values('publishedDate').next().getTime()
begin = new Date(d - (1000 * 60 * 60 * 24))
end = new Date(d + (1000 * 60 * 60 * 24))

g.V().has('publishedDate', between(begin, end)) // uses Date
g.V().has('publishedDate', not(between(begin, end))) // uses Date
g.V().has('publishedDate', between(begin.getTime(), end.getTime())) // uses long
g.V().has('publishedDate', not(between(begin.getTime(), end.getTime()))) // uses long

The final query throws this exception:

gremlin> g.V().has('publishedDate', not(between(begin.getTime(), end.getTime()))) // uses long
java.lang.Long cannot be cast to java.util.Date
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Date
    at java.util.Date.compareTo(Date.java:131)
    at org.apache.tinkerpop.gremlin.process.traversal.Compare$5.test(Compare.java:150)
    at org.apache.tinkerpop.gremlin.process.traversal.P.test(P.java:72)
    at org.apache.tinkerpop.gremlin.process.traversal.util.OrP$OrBiPredicate.test(OrP.java:91)
    at org.apache.tinkerpop.gremlin.process.traversal.P.test(P.java:72)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testValue(HasContainer.java:118)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.test(HasContainer.java:94)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testAll(HasContainer.java:180)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.HasStep.filter(HasStep.java:50)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:38)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)

Comparing the explain() for the last 2 traversals, it seems that the HasStep is not able to automatically coerce the long time into a Date object, but the JanusGraphStep does.

Final Traversal                             [JanusGraphStep([],[publishedDate.gte(1536752304820), publishedDate.lt(1536925104820)])]

Final Traversal                             [JanusGraphStep(vertex,[]), HasStep([publishedDate.or(lt(1536752304820), gte(1536925104820))])]
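
The cast failure in the stack trace can be reproduced outside JanusGraph with plain Java. The sketch below is only an illustration: the class LongVsDateCompare and its lessThan helper are hypothetical names, and the assumption (taken from the stack trace, not from the TinkerPop source) is that the predicate compares the stored value and the query value through the raw Comparable interface without coercing types first.

```java
import java.util.Date;

public class LongVsDateCompare {
    // Hypothetical stand-in for an uncoerced comparison predicate: it calls
    // compareTo through the raw Comparable interface, so the argument type
    // is only checked at runtime inside Date.compareTo.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean lessThan(Object stored, Object queried) {
        return ((Comparable) stored).compareTo(queried) < 0;
    }

    public static void main(String[] args) {
        Date stored = new Date(1536838704820L); // property value typed as Date
        long bound = 1536925104820L;            // query bound passed as a long

        // Date compared with Date works as expected.
        System.out.println(lessThan(stored, new Date(bound)));

        // Date compared with a boxed Long fails inside Date.compareTo.
        try {
            lessThan(stored, bound);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

This matches the two explain() outputs: when the condition is folded into JanusGraphStep the long is converted to a Date, whereas a condition left in a plain HasStep reaches the comparison as a long.
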

RaulGuo commented 6 years ago

@pluradj I tried your script in my own environment, but no exception is reported. Which version of JanusGraph are you using? I am using version 3.0 on Linux, and no exception is reported; no result is returned either, which is correct.

gremlin> g.V().has('publishedDate', not(between(begin.getTime(), end.getTime())))
15:18:24 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [((publishedDate < Wed Sep 12 19:38:24 CST 2018 OR publishedDate >= Fri Sep 14 19:38:24 CST 2018))]. For better performance, use indexes
gremlin>

pra527vin commented 6 years ago

@pluradj I am using JanusGraph 0.2.0. Please try this query also:

g.V().hasLabel('NewsPaper').has('identifier', 'xyz').
  inE('belongsTo').outV().hasLabel('NewsDocument').
  has('publishedDate', between(begin.getTime, end.getTime))

pluradj commented 5 years ago

This appears to be fixed in the JanusGraph 0.3.x releases, but not in the 0.2.x releases. The specific fix must be getting pulled in via the TinkerPop 3.3.x version.

FlorianHockmann commented 4 years ago

Reopened as this seems to be only partially fixed as reported by @pra527vin in our Gitter chat:

This query is working fine

g.V().hasLabel('NewsDocument').has('publishedDate',gt(1563214500000))

This query is not working

g.V().has('identifier','00ec7d90c8944a84-a2dbe22c09e0d46c').
    in('belongsTo').has('publishedDate',gt(1563214500000)).count()

I upgraded from version 0.2.0 to 0.3.2 and then to 0.4.0, keeping the existing data in Solr and Cassandra, and I still get the same exception in 0.4.0.

Stack trace:

java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Date
    at java.util.Date.compareTo(Unknown Source)
    at org.apache.tinkerpop.gremlin.process.traversal.Compare$3.test(Compare.java:92)
    at org.apache.tinkerpop.gremlin.process.traversal.P.test(P.java:72)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testValue(HasContainer.java:118)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.test(HasContainer.java:94)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testAll(HasContainer.java:180)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.HasStep.filter(HasStep.java:50)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:38)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
    at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.test(TraversalUtil.java:84)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.TraversalFilterStep.filter(TraversalFilterStep.java:46)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:38)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:82)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
    at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.fill(IteratorUtils.java:62)
    at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.list(IteratorUtils.java:85)
    at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.asList(IteratorUtils.java:382)
    at org.apache.tinkerpop.gremlin.server.handler.HttpGremlinEndpointHandler.lambda$channelRead$1(HttpGremlinEndpointHandler.java:245)
    at org.apache.tinkerpop.gremlin.util.function.FunctionUtils.lambda$wrapFunction$0(FunctionUtils.java:36)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:272)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

mad commented 4 years ago

A simple workaround is to wrap your long value in a Date: new Date(yourLongValue)
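
As a rough illustration of that workaround in plain Java: the sketch below wraps epoch-millisecond bounds in Date before comparing, so both sides of the comparison have the same type. The class EpochToDate and its between helper are hypothetical names, not part of JanusGraph or TinkerPop; the half-open range (lower bound inclusive, upper bound exclusive) mirrors the gte/lt pair shown in the explain() output earlier in this thread.

```java
import java.util.Date;

public class EpochToDate {
    // Hypothetical helper: checks begin <= value < end, with both bounds
    // supplied as epoch milliseconds and wrapped in Date before comparing.
    static boolean between(Date value, long beginMillis, long endMillis) {
        Date begin = new Date(beginMillis);
        Date end = new Date(endMillis);
        return value.compareTo(begin) >= 0 && value.compareTo(end) < 0;
    }

    public static void main(String[] args) {
        Date published = new Date(1536838704820L);
        long day = 1000L * 60 * 60 * 24;
        // One day either side of the published timestamp.
        System.out.println(between(published, published.getTime() - day, published.getTime() + day)); // true
    }
}
```

In a traversal, the equivalent is to pass Date objects to the predicate, as in the has('publishedDate', between(begin, end)) example above where begin and end are Date values.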

lionelfleury commented 3 years ago

Looks more like a schema issue.

property:

created_at_ts | SINGLE | class java.util.Date |

and query:

gremlin> g.V().has('created_at_ts', lte(1622678401000)).count()
==>116

Hence, works here.

/close

FlorianHockmann commented 3 years ago

Thanks @mad and @lionelfleury. Looks like I overlooked that the user who reported the problem on Gitter didn't actually use the Date type but used a long instead. So, this issue shouldn't have been reopened as it was fixed in the 0.3 versions as @pluradj mentioned above.