davebshow / goblin

A Python 3.5 rewrite of the TinkerPop 3 OGM Goblin

Question: Extent to which OO modeling is supported in Goblin #93

Closed aanastasiou closed 6 years ago

aanastasiou commented 6 years ago

I am trying to find an OGM with certain characteristics and I came across Goblin. Goblin requires a considerable "investment" because it means moving away from the current state which is based on Neo4J. I do not mind too much about the actual backend and Goblin is supposed to work with anything that is based on Tinkerpop, so that's an extra check point.

But, I first need to confirm that Goblin can handle a specific modelling case:

  1. The system I am developing can apply a set of algorithms to data that is structured in a specific way with small differences here and there. For this reason, I am setting up a base schema that is specialised (via the use of inheritance) depending on the use case.

  2. The schema looks more or less like this:

    ---BASE--------------------------------------------

    class commonVertexFunc(Vertex):
        [. . .]

    class Item(commonVertexFunc):
        someProperty = str()
        someOtherProperty = str()
        someRelationship = ItemRelationship(anotherItem, ZeroOrMore)
        [. . .]

    class ItemRelationship(Edge):
        [. . .]

    class anotherItem(commonVertexFunc):
        [. . .]

    ---Specific------------------------------------------------

    class specificItem(Item):
        specificAdditionalProperty = str()

    class specificOtherItem1(anotherItem):
        [. . .]

    class specificOtherItem2(anotherItem):
        [. . .]

  3. Now, what I expect with this particular setting is:

    a. specificItem already has a someRelationship. That should not be too big of a problem.

    b. More importantly, someRelationship will accept ANY object that is of type anotherItem.

Will Goblin be able to handle this?

davebshow commented 6 years ago

Hi @aanastasiou. I think the important thing to note here is that Goblin has an extremely simple and flexible data model. Goblin does not treat relationships like "properties" of a vertex. Instead, a goblin.element.Edge is a top-level element. Out of the box, Goblin does not enforce the type of a source/target vertex when creating an edge, and an edge is in no way bound to any goblin.element.Vertex classes. Edges are completely flexible in that they can accept any vertex as a source or target.

If you want edges to validate vertex type, this is easy to implement by creating a custom __init__ method for your edge class. Furthermore, a graph db like Janus allows you to specify the structural properties of your graph using a schema.
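As a rough illustration of the custom __init__ idea, here is a minimal sketch. The Vertex and Edge classes below are plain-Python stand-ins (an assumption, so the snippet runs without Goblin or a Gremlin server); with Goblin installed you would presumably subclass goblin.element.Vertex and goblin.element.Edge instead. The source_type and target_type attributes are hypothetical names made up for this example, not part of Goblin.

```python
# Stand-ins for goblin.element.Vertex / goblin.element.Edge (assumption:
# the real Edge also takes optional source and target vertices).
class Vertex:
    pass


class Edge:
    def __init__(self, source=None, target=None):
        self.source = source
        self.target = target


# Class names follow the schema in the question.
class Item(Vertex):
    pass


class anotherItem(Vertex):
    pass


class specificItem(Item):
    pass


class specificOtherItem1(anotherItem):
    pass


class ItemRelationship(Edge):
    # Hypothetical attributes naming the allowed endpoint types.
    source_type = Item
    target_type = anotherItem

    def __init__(self, source=None, target=None):
        # isinstance() also accepts subclasses, so any specialised
        # anotherItem is a valid target.
        if source is not None and not isinstance(source, self.source_type):
            raise TypeError("source must be an Item, got %s" % type(source).__name__)
        if target is not None and not isinstance(target, self.target_type):
            raise TypeError("target must be an anotherItem, got %s" % type(target).__name__)
        super().__init__(source, target)


# A specialised target is accepted...
edge = ItemRelationship(specificItem(), specificOtherItem1())

# ...while a vertex of the wrong family is rejected.
try:
    ItemRelationship(specificItem(), specificItem())
except TypeError as exc:
    print(exc)
```

Because the check relies on isinstance, subclasses of the edge can narrow source_type or target_type without repeating the validation logic.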

All of that said, inheritance should work normally with Goblin elements. If it does not, I will look into making any necessary fixes.

More info on creating Goblin elements can be found here

aanastasiou commented 6 years ago

Hi @davebshow . Thank you very much for your response.

This looks promising. At the moment, I have Janus up and running with HBase and elasticsearch, a very simple node defined in Goblin and I am simply trying to submit that from python and then query it from gremlin. Unfortunately, I am getting a "Gremlin Server is not configured with a serializer for the requested mime type [application/vnd.gremlin-v3.0+json]" error and at the moment I am battling that one. (Think I may have spotted why.)

Do I have to specify the schema beforehand? (I noticed that you are developing a tool to do this (?).)

The point of the OGM is to be able to do things like Person.knows.all() for example. In other words, have the object hierarchy mapped in a "transparent" way between the Python objects and the backend. It is good for the OGM to be loose but not too loose because then it does not enforce validation.

davebshow commented 6 years ago

First of all, the Server Error is most likely due to the gremlinpython version you are using (I assume 3.3.0). The versioning is a bit funky still with these libraries, but it all should be resolved with the 3.2.7/3.3.1 TinkerPop releases. Try this:

$ pip uninstall gremlinpython
$ pip uninstall tornado
$ pip install gremlinpython==3.2.6 --no-deps

About the schema, Janus does not require that the user specify a schema; it will automatically generate any required schema based on input. However, it is strongly recommended that you define a schema for all elements in production. So, for experimentation it isn't required, but yes, you will need a schema definition when you decide to use Janus "for real".

I understand that one of the benefits of an OGM is to give some structure to the inherently "loose" graph db datamodel, and Goblin helps provide this through datatype validation and element definition, however, we wanted to retain a lot of the flexibility that comes with a graph db. That in mind, Goblin classes are easily extensible, and you can make them as rigid as you need them to be.

Finally, don't expect any Django style queries (Person.knows.all()) from Goblin. Traversals are written using the aiogremlin GLV. The above query would look more like:

await session.g.V().out('knows').toList()

davebshow commented 6 years ago

Also, make sure to use an id hasher, like this: https://github.com/davebshow/goblin-janus-examples/blob/master/examples/app.py#L9-L23
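For readers who cannot open the link, an id hasher is roughly a function that turns whatever id the server returns into something hashable. The sketch below is written from memory of that example and may differ from it. It assumes Janus returns edge ids as a GraphSON dict tagged janusgraph:RelationIdentifier with the usable id nested under "@value" and "value"; check what your server actually sends before relying on that layout.

```python
def get_hashable_id(val):
    """Return a hashable form of an element id.

    Assumption: Janus edge ids arrive as a dict such as
    {"@type": "janusgraph:RelationIdentifier", "@value": {"value": "..."}},
    which cannot be used as a dictionary key as-is.
    """
    # Use the value unchanged by default (e.g. integer vertex ids).
    result = val
    if isinstance(val, dict) and "@type" in val and "@value" in val:
        if val["@type"] == "janusgraph:RelationIdentifier":
            # Unwrap the nested identifier.
            result = val["@value"]["value"]
    return result


print(get_hashable_id(4136))
print(get_hashable_id({
    "@type": "janusgraph:RelationIdentifier",
    "@value": {"value": "odxc8-38w-2dx-6c8"},
}))
```

In the linked example the function is handed to Goblin when the app is opened (as far as I recall, through a get_hashable_id keyword argument), so treat that wiring as something to verify against the Goblin docs.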

aanastasiou commented 6 years ago

Thank you, this is extremely useful.

davebshow commented 6 years ago

No problem! Good luck and let me know how things go.

davebshow commented 6 years ago

I am going to close this as inactive. Please reopen if you feel it is necessary.