davebshow / goblin

A Python 3.5 rewrite of the TinkerPop 3 OGM Goblin

traverse using gt vs. equalities, and connect to specific graph #67

Closed John-Boik closed 7 years ago

John-Boik commented 7 years ago

Hi Dave. I'm trying to do a few simple tasks with goblin, but am running into two problems. First, while I can traverse a graph using .has() for equalities, I get errors when I use something like .has('count', P.gt(2)). That error message is:

aiogremlin.exception.GremlinServerError: 500: 500: Value [{operator=gt, other=null, value=2}] is not an instance of the expected data type for property key [num1] and cannot be converted. Expected: class java.lang.Integer, found: class java.util.LinkedHashMap

Second, I'm not sure how to alter my code to connect to a specific JanusGraph graph, 'etrg', for a session, rather than the default graph, 'g'. Previously, I used Goblin.open(translator = GroovyTranslator('etrg'), ...) but it seems that GroovyTranslator has recently been removed from gremlin_python and I'm not sure how to proceed. I want to use the 'etrg' graph with a session.

My test code is:

import asyncio
from goblin import element, Goblin
import goblin
#this fails:  from gremlin_python.process.translator import GroovyTranslator
from gremlin_python.process.traversal import P
from gremlin_python import statics
from goblin import DriverRemoteConnection
from goblin.session import bindprop

# ======================================
def get_hashable_id(val):
  #Use the value "as-is" by default.
  result = val
  if isinstance(val, dict) and "@type" in val and "@value" in val:
    if val["@type"] == "janusgraph:RelationIdentifier":
      result = val["@value"]["value"]
  return result
# ======================================

#translator = GroovyTranslator('etrg')  # previously, this worked. 
# I want to connect the session somehow to ('ws://localhost:8182/gremlin', 'etrg')

# Set up event loop and app
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop,
  get_hashable_id=get_hashable_id, translator=None))

# define a vertex class
class Event (goblin.Vertex):
  num1 =  goblin.Property(goblin.Integer)
  def alterNum(self):
    self.num1 += 5
    print("\n  new = {}\n".format(self.num1))

# add a new attribute  
setattr(Event, 'num2', goblin.Property(goblin.Integer))

# Register the models with the app
app.register(Event)

Session = loop.run_until_complete(app.session())

evt = Event()
evt.num1 = 5
evt.num2 = 20

Session.add(evt)
loop.run_until_complete(Session.flush())

# get existing vertex, this works
result = loop.run_until_complete(Session.g.V().hasLabel('event').toList())
for v in result:
  print("    id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))

# traversal with bindprop, this works
bound_name = bindprop(Event, 'num1', 5, binding='v1')
v = loop.run_until_complete(Session.traversal(Event).has(*bound_name).next())
print("\n1:  id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))

# traversal w/o bindprop, on Event vertex, this works
v = loop.run_until_complete(Session.traversal(Event).has('num1', 5).next())
print("\n2:  id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))

# traversal w/o bindprop, any vertex, this works
v = loop.run_until_complete(Session.g.V().has('num1', 5).next())
print("\n3:  id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))

# this causes an exception:
v = loop.run_until_complete(Session.g.V().has(Event.num1, P.gt(2)).next())

# change the value of num1 via the class method, this works
v.alterNum()
loop.run_until_complete(Session.update_vertex(v))
v = loop.run_until_complete(Session.g.V().has('num1', 10).next())
print("\n4:  id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))

# remove all vertices, this works
for v in result:
  loop.run_until_complete(Session.remove_vertex(v))
result = loop.run_until_complete(Session.g.V().hasLabel('event').toList())
print("\nresult2 = ", result)

loop.run_until_complete(app.close())

davebshow commented 7 years ago

I'm travelling today but I can look into this a bit more tomorrow. Groovy translator was removed from gremlin-python some time ago...what version of Goblin are you currently using? What version of gremlin-python? Is aiogremlin installed (dep for newer versions of goblin)?

Regarding the aliasing that you used to do with the groovy translator: translator = GroovyTranslator('etrg')

Here you are creating an alias to 'etrg'. You can do the same with modern gremlin-python based implementations by passing a map of aliases to pretty much any top-level object you use with the graph. For goblin, pass the aliases directly to the app object:

app = loop.run_until_complete(Goblin.open(loop,
  get_hashable_id=get_hashable_id, aliases={'g': 'etrg'}))
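
For context, an alias map like this is sent to Gremlin Server as part of each request. The sketch below is a simplified illustration only, not Goblin's or aiogremlin's actual code: `build_eval_request` is a hypothetical helper, and the field layout is assumed from the Gremlin Server request format for script (`eval`) requests.

```python
import json
import uuid


def build_eval_request(gremlin, aliases):
    """Hypothetical helper: build a script-style request as a plain dict."""
    return {
        "requestId": str(uuid.uuid4()),
        "op": "eval",
        "processor": "",
        "args": {
            "gremlin": gremlin,
            "language": "gremlin-groovy",
            # Maps the name used in the script ('g') to the server-side
            # traversal source it should resolve to ('etrg').
            "aliases": aliases,
        },
    }


request = build_eval_request("g.V().count()", {"g": "etrg"})
print(json.dumps(request["args"], indent=2))
```

With the alias in place, client code keeps writing traversals against `g`, and the server resolves that name to `etrg`.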

I'll have to fire up my box to check out the other error. I should be able to get back to you first thing tomorrow.

John-Boik commented 7 years ago

Thanks Dave. I'm using goblin 2.0.0, gremlin-python 3.2.5, aiogremlin 3.2.4.

The use of aliases={'g': 'etrg'} worked. Thanks!

On a somewhat different note (I can submit a new issue if you prefer), can you suggest how I would use timestamps with goblin OGM? In the gremlin console, I might use something like: g.addV("User").property("createdDate",System.currentTimeMillis())

and also like:

gremlin> import java.util.concurrent.TimeUnit
gremlin> import com.thinkaurelius.titan.core.attribute.Timestamp
...
gremlin> g = TitanFactory.open("conf/titan-cassandra-es.properties")
==>titangraph[cassandrathrift:[127.0.0.1]]
gremlin> v1 = g.addVertex(null)
==>v[256]
gremlin> v2 = g.addVertex(null)
==>v[512]
gremlin> v1.addEdge("knows", v2)
==>e[dc-74-1lh-e8][256-knows->512]
gremlin> g.commit()
==>null
gremlin> yesterday = System.currentTimeMillis() - 1000 * 60 * 60 * 24
==>1420758191198
gremlin> g.V().bothE().has('$timestamp', Compare.GREATER_THAN_EQUAL, new Timestamp(yesterday, TimeUnit.MILLISECONDS))
==>e[dc-74-1lh-e8][256-knows->512]
==>e[dc-74-1lh-e8][256-knows->512]

For this, the following line is added to janusgraph-cassandra-es.properties:

storage.meta.edgestore.timestamps=true

But how might I do all of this using goblin OGM?
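
For reference, the millisecond values in the console session above can be computed on the Python side as sketched below. This covers only the arithmetic; `current_time_millis` is a hypothetical helper name, and how such a value should be stored or queried through goblin is not shown here because the thread does not settle it.

```python
import time

MILLIS_PER_DAY = 1000 * 60 * 60 * 24


def current_time_millis():
    """Hypothetical helper mirroring Java's System.currentTimeMillis()."""
    return int(time.time() * 1000)


now = current_time_millis()
yesterday = now - MILLIS_PER_DAY
print("now = {}, yesterday = {}".format(now, yesterday))
```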

davebshow commented 7 years ago

Yes let's copy that to a new issue as other users may want to do something similar.

John-Boik commented 7 years ago

Dave, I see there is an issue with setting new attributes to a vertex class as I have done. In the code above I used:

class Event (goblin.Vertex):
  # stuff here

# add a new attribute  
setattr(Event, 'num2', goblin.Property(goblin.Integer))

This works in the initial session in which I set values for the vertex properties num1 and num2, but when I rerun the script to traverse the graph (without adding new vertices), the traversal returns num1 as an integer, as expected, but num2 is of type goblin.properties.Property object.

I would like to programmatically add properties. I've tried the following, all of which cause problems:

#setattr(Event, 'num2', goblin.Property(goblin.Integer))  # problem described above

# this series results in the error:  AttributeError: 'Property' object has no attribute 'to_db'  
Event.num2 = goblin.Property(goblin.Integer)

Event.__dict__['__mapping__'].ogm_properties.update({'num2': ('num2', goblin.Property(goblin.Integer, db_name='num2'))})

Event.__dict__['__mapping__'].db_properties.update({'num2': ('num2', goblin.Property(goblin.Integer, db_name='num2'))})

Event.__dict__['__properties__'].update({'num2': ('num2', goblin.Property(goblin.Integer, db_name='num2') )})

I'm sure there is an easy way to programmatically add properties to a vertex, but I'm not seeing how it's done. Any suggestions?

davebshow commented 7 years ago

Goblin is not designed to have properties added dynamically to model classes. Goblin properties are exchanged for descriptors by a metaclass when the model class is created. It seems to me the point of an OGM is to create consistent mappings between an object and its graph representation, and is therefore inherently less flexible than the pure property graph approach to application design. If you want to add properties on the fly, I suggest that you use "regular" vertices with a GLV library (aiogremlin or gremlin-python), as this will give you much more flexibility.
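
The descriptor swap described above can be illustrated in plain Python. This is a minimal sketch of the general technique, not Goblin's implementation: `Property`, `PropertyDescriptor`, and `ModelMeta` are stand-in names invented for the example.

```python
class Property:
    """Declaration placeholder: only records the intended data type."""

    def __init__(self, data_type):
        self.data_type = data_type


class PropertyDescriptor:
    """Descriptor that stores the coerced value on the instance."""

    def __init__(self, name, prop):
        self._attr = "_" + name
        self._data_type = prop.data_type

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._attr, None)

    def __set__(self, obj, value):
        setattr(obj, self._attr, self._data_type(value))


class ModelMeta(type):
    """Swap each Property in the class body for a PropertyDescriptor."""

    def __new__(mcls, name, bases, namespace):
        for key, value in list(namespace.items()):
            if isinstance(value, Property):
                namespace[key] = PropertyDescriptor(key, value)
        return super().__new__(mcls, name, bases, namespace)


class Event(metaclass=ModelMeta):
    num1 = Property(int)


# Added after the class exists, so the metaclass never processes it.
setattr(Event, "num2", Property(int))

evt = Event()
evt.num1 = "5"   # goes through the descriptor and is coerced to int
evt.num2 = 20    # merely creates an ordinary instance attribute

fresh = Event()  # an instance on which num2 was never assigned
print(type(evt.num1).__name__)
print(type(fresh.num2).__name__)
```

In this sketch, assigning `num2` on one instance appears to work because the instance attribute shadows the class attribute, while any instance that never had `num2` assigned still exposes the raw `Property` object. That matches the symptom reported above, although Goblin's internals may differ in detail.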

Haven't gotten to the greater than issue yet, but it is on my list.

John-Boik commented 7 years ago

Hi Dave. If it is of help, here is streamlined code to demonstrate the gt() error. I also tried changing the property type of num1 to string, and then I did not receive the error, but the traversal would not return any vertex. So, that did not work either. I did a search on Google for info on the error, but couldn't find anything to help me make progress. I hope you will be able to make progress. Could the problem be related in some way to using Janusgraph-ES-Cassandra?

import asyncio
from goblin import element, Goblin
import goblin
from gremlin_python.process.traversal import P

from gremlin_python import statics
from goblin.session import bindprop
statics.load_statics(globals())

# ======================================
def get_hashable_id(val):
  #Use the value "as-is" by default.
  result = val
  if isinstance(val, dict) and "@type" in val and "@value" in val:
    if val["@type"] == "janusgraph:RelationIdentifier":
      result = val["@value"]["value"]
  return result
# ======================================

# Set up event loop and app
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop,
  get_hashable_id=get_hashable_id))

class Event (goblin.Vertex):
  num1 =  goblin.Property(goblin.Integer)

app.register(Event)
Session = loop.run_until_complete(app.session())

# remove all vertices
result = loop.run_until_complete(Session.g.V().toList())
for v in result:
  loop.run_until_complete(Session.remove_vertex(v))

evt1 = Event()
evt2 = Event()
evt1.num1 = 5
evt2.num1 = 10
Session.add(evt1, evt2)
loop.run_until_complete(Session.flush())

# get existing vertex, this works
result = loop.run_until_complete(Session.g.V().hasLabel('event').toList())
print("\nresult1 = ", result)
for v in result:
  print("\n1:  id= {}, num1= {}".format(v.id, v.num1))

# traversal, this works
v = loop.run_until_complete(Session.g.V().has('num1', 10).next())
if v:
  print("\n2:  id= {}, num1= {}".format(v.id, v.num1))

# this causes an exception:
v = loop.run_until_complete(Session.g.V().has('num1', P.gt(2)).next())
if v:
  print("\n3:  id= {}, num1= {}".format(v.id, v.num1))

loop.run_until_complete(Session.flush())
loop.run_until_complete(app.close())

The error message is:

Traceback (most recent call last):
  File "goblin_test_01b.py", line 62, in <module>
    v = loop.run_until_complete(Session.g.V().has('num1', P.gt(2)).next())
  File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/home/john/.virtualenvs/ETR/lib/python3.5/site-packages/aiogremlin/gremlin_python/process/traversal.py", line 80, in next
    return await self.__anext__()
  File "/home/john/.virtualenvs/ETR/lib/python3.5/site-packages/aiogremlin/gremlin_python/process/traversal.py", line 46, in __anext__
    self.last_traverser = await self.traversers.__anext__()
  File "/home/john/.virtualenvs/ETR/lib/python3.5/site-packages/aiogremlin/driver/resultset.py", line 66, in __anext__
    msg = await self.one()
  File "/home/john/.virtualenvs/ETR/lib/python3.5/site-packages/aiogremlin/driver/resultset.py", line 16, in wrapper
    "{0}: {1}".format(msg.status_code, msg.message))
aiogremlin.exception.GremlinServerError: 500: 500: Value [{operator=gt, other=null, value=2}] is not an instance of the expected data type for property key [num1] and cannot be converted. Expected: class java.lang.Integer, found: class java.util.LinkedHashMap

davebshow commented 7 years ago

Interestingly, when I ran your code on my box I got a different error, maybe something is wrong with my environment.

Regardless, I believe that this problem is my fault, as the aiogremlin versioning and releases have been off schedule. I can run your original code against both TinkerGraph and JanusGraph by changing the gremlin_python objects you use to the ones included in the version of gremlin_python bundled with aiogremlin 3.2.4:

Change:

from gremlin_python.process.traversal import P
from gremlin_python import statics

to

from aiogremlin.gremlin_python.process.traversal import P
from aiogremlin.gremlin_python import statics

Please make these changes and verify that this fixes the issue for the time being. In the near future I will be able to bring releases and versioning, as well as gremlinpython dependencies, up to date.
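
One plausible reading of why the import path matters, offered as an assumption rather than a confirmed diagnosis: if a serializer picks its handler by the class of the object, a same-named `P` class from a different package is not recognized and may be written out as a generic map, which would match the `{operator=gt, other=null, value=2}` value in the server error. The sketch below uses stand-in classes (`BundledP`, `OtherP`) and a hypothetical `serialize` function; it is not the aiogremlin serializer.

```python
class BundledP:
    """Stand-in for the P class the driver's serializer knows about."""

    def __init__(self, operator, value):
        self.operator = operator
        self.value = value
        self.other = None


class OtherP:
    """Stand-in for a same-shaped P class imported from another package."""

    def __init__(self, operator, value):
        self.operator = operator
        self.value = value
        self.other = None


def serialize(obj):
    """Hypothetical serializer that dispatches on the exact class of obj."""
    handlers = {
        BundledP: lambda p: {
            "@type": "g:P",
            "@value": {"predicate": p.operator, "value": p.value},
        },
    }
    handler = handlers.get(type(obj))
    if handler is not None:
        return handler(obj)
    if hasattr(obj, "__dict__"):
        # Unknown classes fall back to a plain map of their attributes.
        return dict(vars(obj))
    return obj


print(serialize(BundledP("gt", 2)))
print(serialize(OtherP("gt", 2)))
```

Under this assumption, the first call produces a typed predicate the server can interpret, while the second produces an untyped map that the server then tries, and fails, to store as an Integer.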

John-Boik commented 7 years ago

Great! That solved my problem with the P.gt().

davebshow commented 7 years ago

Well, since the timestamp has its own issue now, I will go ahead and close this. If I missed something feel free to reopen.