neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License
Chain associations / class methods on QueryProxy #380

Closed cheerfulstoic closed 10 years ago

cheerfulstoic commented 10 years ago

Check out the query_experiment branch. For good examples see the query_spec:

https://github.com/andreasronge/neo4j/blob/query_experiment/spec/e2e/query_spec.rb

The code is a bit hairy right now, but I'm really excited about this approach. It takes over some of the conceptual ground of QuickQuery without the .qq method, it introduces a more ActiveRecordy has_many association, and it allows for calling class methods on QueryProxy objects (which are also now returned from has_many associations)

This is aping a lot of ActiveRecord, but I think it's taking it a step beyond what ActiveRecord is capable of doing. I'm really excited about it ;)

Some things that would need to be done:

cheerfulstoic commented 10 years ago

Ok, so I think I've figured out a solution that will please. Below is an example to explore the core problem. The second part is a proposal for how to deal with it. I believe it will work mostly like neo4j 2.x, except the relationships just have #association and not Model#association as the type. That seemed prudent to sidestep problems with class inheritance, but happy to hear arguments.

class Person
  include Neo4j::ActiveNode

  has_many :enemies, model: Person
  has_many :friends, model: Person
end

class SuperHuman < Person
  include Neo4j::ActiveNode
end

class SuperHero < SuperHuman
  include Neo4j::ActiveNode

  has_many :powers
end

class SuperVillain < SuperHuman
  include Neo4j::ActiveNode

  has_many :powers
end

class Power
  include Neo4j::ActiveNode

  has_many :super_humans
end

spidey = SuperHero.create(name: 'Spiderman')
mj     = Person.create(name: 'Mary Jane')
doc_oc = SuperVillain.create(name: 'Doctor Octavius')

web_slinging = Power.create(name: 'Web Slinging')
spidey_sense = Power.create(name: 'Spidey Sense')
extra_arms   = Power.create(name: 'Extra Arms')

spidey.powers << web_slinging   # CREATE spidey-[:`#powers`]->webSlinging
spidey.powers << spidey_sense   # CREATE spidey-[:`#powers`]->spideySense
doc_oc.powers << extra_arms     # CREATE docOc-[:`#powers`]->extraArms

extra_arms.super_humans.to_a
# MATCH extraArms-[:`#super_humans`]-(result:SuperHuman)  # DOESN'T WORK
# MATCH extraArms-[:`#powers`]-(result:SuperHuman)        # WORKS, SHOULD WE EVEN TRY TO FIGURE THIS OUT?
# MATCH extraArms--(result:SuperHuman)                    # WORKS

spidey.friends << mj          # CREATE spidey-[:`#friends`]->mj
spidey.enemies << doc_oc      # CREATE spidey-[:`#enemies`]->docOc

spidey.friends.to_a
# MATCH spidey-[:`#friends`]-(result:Person)  # WORKS
# MATCH spidey--(result:Person)               # TOO MANY RESULTS

mj.friends.to_a
# MATCH mj-[:`#friends`]-(result:Person)      # WORKS

# Proposal on various forms of has_many

# This declaration implies that we want all associated Name nodes, regardless of relationship type
has_many :names
# CREATE  -[:`#names`]->(:Name)
# MATCH   --(:Name)

has_many :names, from: :bar
# CREATE  <-[:`bar`]-(:Name)
# MATCH   <-[:`bar`]-(:Name)

has_many :names, via: :bar
# CREATE  -[:`bar`]->(:Name)
# MATCH   -[:`bar`]->(:Name)

has_many :names, with: :bar
# CREATE  -[:`bar`]->(:Name)
# MATCH   -[:`bar`]-(:Name)

# Using the :model key changes the label.  Examples:
has_many :names, model: Foo, from: :bar
# CREATE  <-[:`bar`]-(:Foo)
# MATCH   <-[:`bar`]-(:Foo)

# IMPORTANT CASE:
# When we specify model, but not the relationship, we should query on the relationship name based off of the association name
has_many :names, model: Foo
# CREATE  -[:`#names`]->(:Foo)
# MATCH   -[:`#names`]-(:Foo)

# Since the association name resolves to the same thing as the model, consider to be the same as has_many :names
has_many :names, model: Name
# CREATE  -[:`#names`]->(:Name)
# MATCH   --(:Name)
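The rules in the proposal above can be sketched in plain Ruby, without a database. This is only an illustration of the proposed behaviour: `association_patterns` and `default_label` are hypothetical helpers (not part of neo4j.rb or the query_experiment branch), and the label derivation is a naive singularization that just strips one trailing "s".

```ruby
# Hypothetical helper: derives a label from an association name.
# Naive on purpose: :names => "Name", :super_humans => "SuperHuman".
def default_label(association)
  association.to_s.chomp('s').split('_').map(&:capitalize).join
end

# Hypothetical helper: returns the CREATE and MATCH pattern fragments that
# the proposal describes for a given has_many declaration.
def association_patterns(association, options = {})
  label = (options[:model] || default_label(association)).to_s
  node  = "(:#{label})"

  if options[:from]      # inbound, same pattern for create and match
    rel = "<-[:`#{options[:from]}`]-#{node}"
    { create: rel, match: rel }
  elsif options[:via]    # outbound, same pattern for create and match
    rel = "-[:`#{options[:via]}`]->#{node}"
    { create: rel, match: rel }
  elsif options[:with]   # created outbound, matched in either direction
    { create: "-[:`#{options[:with]}`]->#{node}",
      match:  "-[:`#{options[:with]}`]-#{node}" }
  else
    generated = '#' + association.to_s   # auto-generated relationship type
    match = if label == default_label(association)
              "--#{node}"                  # label alone identifies the targets
            else
              "-[:`#{generated}`]-#{node}" # the "important case" with model:
            end
    { create: "-[:`#{generated}`]->#{node}", match: match }
  end
end

p association_patterns(:names)
p association_patterns(:names, with: :bar)
p association_patterns(:names, model: :Foo)
```

The last call shows the "important case": because the model does not match the association name, the generated `#names` type is used for matching as well as for creation.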

subvertallchris commented 10 years ago

This looks great! I'm gonna have to look at it closely tomorrow but it looks like it'll cover the bases. Can you use the callbacks I added the other day? Very simple but easy and very functional. I have a gripe with has_one, but that's something we can deal with when you start working on that code.

cheerfulstoic commented 10 years ago

Just pushed changes to support the proposal. I think it really could use some more thorough and organized specing, but I'm going to wait until I hear from both of y'all before carving it in stone, so to speak. Feedback welcome, as always, though I'm offline Saturday

andreasronge commented 10 years ago

I'm back. Great work! @cheerfulstoic I like your proposal. I like skipping the model part of the relationship type (Model#association). I believe that the most usual thing a developer wants to do is to declare an outgoing relationship from one class and connect it with an incoming relationship on another class (see @subvertallchris example above). Our API should make it easy to understand what the resulting graph would look like.

What do you think about the following example ?

class Show
  has_many :bands, dir: :outgoing, type: :bands
  has_one :venue, dir: :outgoing, type: :venue
end

class Band
  has_many :shows, dir: :incoming, type: :bands
end

class Venue
  has_many :shows, dir: :incoming, type: :venue
end

This will also make the API more consistent since we already use these argument names in the neo4j-core api, see Neo4j::Node.rels, e.g. node_a.rels(type: :friends, dir: :outgoing)

Person.has_many :friends, dir: :outgoing, type: :friends
# the above only allows creating relationships between already existing nodes, since it does not know which Ruby class it should create

Person.has_many :friends, dir: :outgoing, type: :friends, model: Person
# this does allow creating both a new relationship and new Person node, e.g.
Person.friends.build(a_hash)

I think we should also reimplement the outgoing and incoming methods so that we can work on undeclared relationships as well. Neo4j 2.x outgoing method, e.g.

p.outgoing(:friends) << b

@cheerfulstoic It is interesting how the model parameter also is used as a Label in your example. This is nice since we can make sure that the relationship only contains node objects of a certain type (this was not possible in 2.x version). But in 3.x version a model can have any number of labels, so it might not work well in some unusual models having several "label mixins".

How about simply declaring that a relationship is only allowed to have a given set of labels? E.g.

has_many :bands, dir: :outgoing, type: :bands, labels: [:band]

# or let the Band class tell which labels it has
has_many :bands, dir: :outgoing, type: :bands, labels: Band

Maybe this is unnecessary.

The advantage of my proposal is that it is easier to understand if you have a Neo4j background. The disadvantage is that it is harder to understand how the model classes are related (which was better in the 2.x api). Have to do some more thinking ...

andreasronge commented 10 years ago

I've done some more thinking. Since we often want to declare a two-way relationship between two classes, why not introduce a third class instead of having a circular dependency?

Example

class ShowBand
  include Neo4j::ActiveRelationship

  # not sure about the name of these methods (many_outgoing_on)
  many_outgoing_on Show, :shows
  many_incoming_on Band, :bands # it will create a bands method on the band class

  # we can support several outgoing and incoming classes !

  rel_type :bands

  property :since # relationship property
end

Then we can remove all the has_many from the Show and Band classes. The has_many method can still be implemented in case we don't want to use an ActiveRelationship class. The has_many method is also used by the ActiveRelationship module to generate the has_n methods on the ActiveNode classes.

It also gives us another way to create relationships:

ShowBand.create(Show.create, Band.create, since: 2008)

I don't think we need to use labels in the cypher query when we retrieve all the has_many relationships. E.g. show.bands will simply traverse the relationship bands regardless of what the end nodes are.

subvertallchris commented 10 years ago

Love this. It answers the question of whether Band or Show is responsible for managing relationships between the two, which is something I've struggled with and worked around with separate management classes. I imagine that you'd also be able to wrap relationship objects in this class after the fact to get easy access to getting/setting properties? The CypherResponse object being returned right now has some problems, so something else is needed.

andreasronge commented 10 years ago

I imagine that you'd also be able to wrap relationship objects in this class after the fact to get easy access to getting/setting properties?

Yes, indeed. Maybe even reuse things like validation from the ActiveNode. Not sure if we should support unpersisted nodes and relationships, e.g. ShowBand.new(Show.new, Band.new, since: 2008). If there is a not too complicated solution for this then I think we should implement it (in another github issue).

Callbacks should be supported by just delegating this to the has_many method, which I think you already implemented.

cheerfulstoic commented 10 years ago

I definitely like users being able to understand what the graph will look like, but I don't know that I like the verbosity of specifying the direction and relationship type separately. Why not my syntax of:

class Show
  has_many :bands, via: :bands
  has_one :venue, via: :venue
end

class Band
  has_many :shows, from: :bands
end

class Venue
  has_many :shows, from: :venue
end

I don't know that I'm completely happy with the words via / from / with representing outbound / inbound / bidirectional, but I like combining the concepts. Maybe both styles could be supported. I don't know that we need to make the API consistent though because I feel the point of ActiveNode is to provide an ORM abstraction.

I very much like the idea of being able to specify multiple labels and/or relationship types.

It may be my ActiveRecord mindset, but I like has_many better than the ActiveRelationship. I like being able to see a model and understand its relationships without needing to search for where it's been referenced in a bunch of files. But maybe you could show an example of when it would be more useful?

Also for the example: ShowBand.create(Show.create, Band.create, since: 2008)

The way that you could do that right now in the query_experiment branch would be:

show.bands.associate(band, since: 2008)

Sorry, I'm not sure I'm thinking this through completely right now, taking a break to write this while the kid chats with the grandparents on Skype ;p

subvertallchris commented 10 years ago

What about a hybrid approach, similar to the relationship method in 2.3? You'd optionally reference the ActiveRelationship (or whatever) class in your models. If it is present, that class handles additional logic beyond the basic node-to-node connection. It also lets you create relations using that class, if you choose. Most importantly (for me, at least ;-) ) you'd have a wrapping class specified if you want to pluck a rel and work with it later.

I do think that it might get confusing if the ActiveRel class was responsible for creating the traversal methods on the ActiveNode classes, so leaving it as an optional management feature might be a better balance.

class Band
  include Neo4j::ActiveNode
  has_n(:admins, via: 'Band#admins', from: User)
  has_many(:shows, class: ShowBand)
end

class ShowBand
  include Neo4j::ActiveRelationship
  validates_presence_of :start_time

  incoming :shows
  outgoing :bands
  type 'Show#bands'

  property :start_time
end

andreasronge commented 10 years ago

I don't understand the has_many :bands, via: :bands and it's not obvious which direction it has. If there was a parameter called from then I would assume there was a parameter called to.

I don't think we should hide the neo4j-core api. It could be powerful to work on both undeclared properties and undeclared relationships. For example, to implement search trees where relationship names are the keys.

A question: why are relationship names sometimes prefixed with a #, e.g. #names, and sometimes not prefixed in your examples?

I'm not sure it is easier to find your way in the code if all the relationship stuff goes into a relationship class, as @subvertallchris mentions and as described in my example, it is just a new idea.

Btw, show.bands.associate(band, since: 2008) should also be supported with my proposal.

@subvertallchris interesting. It is good that something is declared in the model classes so that we get a better overview of how things are related in the graph. But I don't understand the incoming and outgoing; is it for defining the start and end nodes of the relationship object?

I think that by looking at the Ruby class you should understand the name and direction of the relationships that are used for creating and traversing nodes, without reading any documentation.

@cheerfulstoic: How about this less verbose example by combining the outgoing and type parameters with to and from parameters ?

class Show
  has_many :bands, to: :bands
end

class Bands
  has_many :shows, from: :bands
end

### Also supported

class Show
  has_many :bands, to: :bands, class: ShowBand
end

class Bands
  # does not have a mapping to ShowBand traversing incoming relationships in this example
  has_many :shows, from: :bands
end

class ShowBand
   property :start_time
end

I think it sometimes can be useful to prefix relationship names, e.g. Person#friends as done in 2.x. It was implemented since we needed a way to distinguish incoming relationships with same name but from different classes. @cheerfulstoic has a solution to this using labels as described in his examples. I think we should support both, since using prefixed relationship can be faster to use/traverse and easier to understand (by looking at the graph), but using labels has other advantages.

Example, using prefix

Similar to 2.x but we instead rely on conventions of naming relationships instead of magic.

class Show
  has_many :bands, to: 'Show#bands'
end

class Bands
  has_many :shows, from: 'Show#bands'
end

Example, using labels

Based on @cheerfulstoic solution

class Show
  has_many :bands, to: :bands, model: :Bands
end

class Bands
  has_many :shows, from: :bands, model: :Show
end

If the model parameter is used then it can be used in the rails build method for creating new model classes.

cheerfulstoic commented 10 years ago

to/from is probably fine, it just seemed that since the hash key is referring to the relationship the "to" is the node at the other end of the relationship whereas "via" refers to the relationship itself. Definition of via: "by way of; through". So to me Show.has_many :bands, via: :has_scheduled seems right.

I also think that when you say has_many :bands, to: :bands, model: :Bands, I'm not sure what the relationship is. It seems to me that the to is specifying the node class. I would expect the relationship type to be to: :has_scheduled or even from: :played_at (not sure how to deal with past/present tense here...)

The reason that the relationship names are sometimes prefixed with # is because those relationships aren't specified by the model creator in the has_many association and thus a relationship name is auto-generated by neo4j.rb. That's based on the Model#association_name format for relationship types that @subvertallchris mentioned from 2.x, but I thought that removing the Model part would be good to avoid class inheritance issues (though I haven't thought very carefully about that part, to be honest). When specified with a via/from/with in my examples the relationship name given is used.

I'm feeling like there are a lot of things going on in this thread and I'm having trouble feeling like we're accomplishing anything. I wonder if we might try some text chat solution? I've just created a neo4jrb room on freenode, but I'm open to other options.

cheerfulstoic commented 10 years ago

Another thought: What if we took a reasonably complex app (or two) and modeled it using our proposals? I'm not sure I'm completely getting what y'all are saying and I'm not 100% sure you're getting what I'm saying. So I think it might be useful both for communicating our ideas with each other and because I feel like my proposal might change as I face real design challenges.

There seems to be lots of open source rails apps out there with DB models. This one seems relatively compact:

https://github.com/hotsh/rstat.us/tree/master/app/models

But we could also take out a subset of these:

https://github.com/hacketyhack/hackety-hack.com/tree/master/app/models
https://github.com/ari/jobsworth/tree/master/app/models
https://github.com/discourse/discourse/tree/master/app/models

Also, to be clear I don't think we'll want to model all of the methods in these models, but it would be good to model out the basic models / associations and maybe give examples of how they would be queried or examples of how we might define some of the methods which are most useful.

Thoughts?

andreasronge commented 10 years ago

It is a good idea, but I think we should start writing the documentation first. I do like your proposal. You have some questions in your examples that need an answer (e.g. WORKS, SHOULD WE EVEN TRY TO FIGURE THIS OUT?). Also, we need better names for the arguments (as you also mention). I suggest we start writing documentation on a new Wiki page before testing it on a new application.

andreasronge commented 10 years ago

I've started writing a new wiki page. Feel free to edit it or create a new one. https://github.com/andreasronge/neo4j/wiki/Declared-Relationships

cheerfulstoic commented 10 years ago

Seems good, though to be clear I wasn't suggesting trying to create running code, but using a real application's data modeling as fodder for our examples.

The wiki page is definitely a good idea. I'll work there too. Regarding the questions, I actually think those questions are resolved by my proposal. That part of the example was a bit train-of-thought.

andreasronge commented 10 years ago

Ok. The API just feels a bit too magic. But maybe I'll get used to it and it might be very convenient.

andreasronge commented 10 years ago

Sorry, I think I missed reading some of your previous posts. You do explain how things work!

I don't know that I'm completely happy with the words via / from / with representing outbound / inbound / bidirectional, but I like combining the concepts.

How about incoming/outgoing/bidirectional? It will make it easier to remember which is which, I think.

has_many with one parameter

I'm a bit worried about using has_many with just one parameter, example:

has_many :names
# CREATE  -[:`#names`]->(:Name)
# MATCH   --(:Name)

Problem 1)

Performance, as discussed above. But maybe it is enough to warn about this in the documentation and suggest that one should use the via parameter (or whatever it will be called) if you expect that a node will have lots of relationships.

Problem 2)

Unexpected behaviour since it is not obvious that has_many by default is bidirectional.

class TreeNode
   has_many :children, model: TreeNode
   has_one :parent, model: TreeNode, from: '#children' # this is hard to understand
end

parent = TreeNode.create
child = TreeNode.create
parent.children << child
grand_child = TreeNode.create
child.children << grand_child

child.children.to_a # => [parent, grand_child]  !!!
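The surprise in Problem 2 can be reproduced without a database. The following is a minimal in-memory sketch: `TinyGraph` and `Edge` are hypothetical stand-ins for illustration only (they are not neo4j-core classes), and nodes are plain symbols. Matching without a direction, as `MATCH -[:`#children`]-` would, returns the parent as well as the grandchild.

```ruby
# Hypothetical in-memory stand-in for a graph: a list of directed, typed edges.
Edge = Struct.new(:from, :type, :to)

class TinyGraph
  def initialize
    @edges = []
  end

  # Adds a directed edge from -> to with the given relationship type.
  def connect(from, type, to)
    @edges << Edge.new(from, type, to)
    self
  end

  # Nodes reachable from `node` over `type`; dir is :outgoing, :incoming or :both.
  def neighbors(node, type, dir = :both)
    @edges.select { |e| e.type == type }.flat_map do |e|
      found = []
      found << e.to   if e.from == node && dir != :incoming
      found << e.from if e.to == node && dir != :outgoing
      found
    end
  end
end

graph = TinyGraph.new
graph.connect(:parent, '#children', :child)       # parent.children << child
graph.connect(:child, '#children', :grand_child)  # child.children << grand_child

p graph.neighbors(:child, '#children')            # => [:parent, :grand_child]
p graph.neighbors(:child, '#children', :outgoing) # => [:grand_child]
p graph.neighbors(:child, '#children', :incoming) # => [:parent]
```

Only the directed traversals separate children from parents, which is why a default that ignores direction is easy to get wrong for tree-like models.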

Problem 3)

Declaring the matching incoming relationship in the above example is strange (#children). There must be an easier way to do this.

However, I like your other examples when the via and model parameters have been used.

to and from is not a good parameter name

I also think that when you say has_many :bands, to: :bands, model: :Bands, I'm not sure what the relationship is. It seems to me that the to is specifying the node class.

Ok, I agree, to and from are maybe not good names.

Some other ideas

Create several relationships in one go

Why not support creating several relationships?

has_many :names, via: [:names, :address], from: :named_by
# CREATE  -[:`#names`]->(:Name)
# CREATE  -[:`#address`]->(:Name)
# MATCH   -[:names|:address]->(:Name)

User defined cypher

has_many :names, match: '-[:my_friends]->(:Friends)', via: :my_friends
# or 
has_many :names, match: [:friends, :knows, :my_friends], via: :my_friends

I don't think we should implement this now, but we can maybe start thinking about it.

andreasronge commented 10 years ago

Btw, @subvertallchris example does not work any longer (it did work before the introduction of prefixing relationship with #)

class Show
  has_many :bands
end

class Band
  has_many :shows
end

class Venue
  has_many :shows
end

Band.create.shows << Show.create 
# This is not same as
Show.create.bands << Band.create

But it works if one uses via and from parameter (as in @cheerfulstoic example). I think it's too easy to get this wrong. I wonder when it is useful with has_many with only one parameter, if ever ?

How about let has_many :friends mean the same as has_many :friends, via: :friends, to: :any ?

Also, I think if there is no model parameter then it will have the default value :any

subvertallchris commented 10 years ago

This was part of the debate Brian and I had when trying to figure out exactly what that first parameter actually means. He proposed that it be more like ActiveRecord, where the first symbol refers to the incoming/outgoing model, not the relationship type. In that implementation, you'd have:

class Show
  has_many :bands
  #MATCH (n1:Show)--(n2:Band) WHERE ID(n1) = 123 RETURN n2
end

I was uneasy with it at first because I'm used to the 2.3 style, where the first param is the relationship type, but I've come around to see it from the ActiveRecord perspective. If an AR user was jumping into Neo4j.rb, I think that's the behavior they would expect. @cheerfulstoic, am I expressing this correctly? Apologies if I am misrepresenting or misinterpreting your position.

I'm not sure how useful this would be on its own. Personally, I'd never use it but I think it would make sense as an option. I want my relationships to have types every time and I will always end up writing this:

class Show
  has_many :bands, via: 'Band#shows'
end

class Band
  has_many :shows, from: 'Band#shows'
  #or whatever the proper way of expressing an inbound relationship of type Band#shows
end

That accomplishes the same thing as:

class Show
  has_n(:shows).to(Band)
end

It's just much more explicit, less magic, and easier to understand.

If you want any relationship, you'd do:

class User
  has_many :likes, from: 'likes', model: :any
  #or words to that effect
end

Frankly, I'd like it to be even more explicit, from_rel/to_rel, in_rel/out_rel, from_type/to_type, or something like that. It's a few more characters, not as nice looking, but nobody will ever wonder what from_rel means like they might wonder what via or from refers to.

class Show
  has_many :bands, to_rel: 'Show#bands'
end

class Band
  has_many :shows, from_rel: 'Show#bands'
end

class User
  has_many :managed_objects, to_rel: 'User#manages', class: UserManages, model: :any
end

cheerfulstoic commented 10 years ago

I like explicit, but I wouldn't personally use the auto-generation [Model]#association syntax. I get the appeal, though and I was trying to accommodate it. I don't think that people should be using to/from/via/whatever to specify auto-generated names. Much better that if you're going to specify a relationship type in an association that it's done throughout. I think @andreasronge and I have the same preferences toward having complete control (and forcing complete specification) of the relationship types, but I don't entirely dislike the auto-generation (though I get the feeling it might be one of those details you like being swept under the carpet at first and then later regret because you need to deal with it sooner or later. Interested to hear the opinions of @subvertallchris here)

When I was talking about the has_many :bands, to: :bands, model: :Bands example, I wasn't thinking that to/from were bad names as much as I was confused about the bands relationship type being used.

Your TreeNode example is a good one, but I think I've already addressed that case. That was part of my revelation! ;) It's a bit weird, but follow me for a minute: If you specify a model which doesn't match what the association name would have auto-generated, then both creation AND querying use the relationship type (that is, we don't let any relationship type match. Just the auto-generated one). So I believe in your TreeNode example as the code of query_experiment is currently written it would work the way you would hope it would. See this test in query_spec.rb: https://github.com/andreasronge/neo4j/blob/query_experiment/spec/e2e/query_spec.rb#L123

So to give an example of what I mean:

class TreeNode
  has_many :tree_nodes
  # CREATE -[:`#tree_nodes`]->(parent:`TreeNode`)
  # MATCH --(parent:`TreeNode`)

  has_many :tree_nodes, model: TreeNode # Works the same as above
  # CREATE -[:`#tree_nodes`]->(parent:`TreeNode`)
  # MATCH --(parent:`TreeNode`)

  has_many :parents, model: TreeNode
  # CREATE -[:`#parents`]->(parent:`TreeNode`)
  # MATCH -[:`#parents`]-(parent:`TreeNode`)

  has_many :children, model: TreeNode
  # CREATE -[:`#children`]->(parent:`TreeNode`)
  # MATCH -[:`#children`]-(parent:`TreeNode`)
end

Maybe it's somewhat magical, but I think it's the magic that everybody would expect.

andreasronge commented 10 years ago

Interesting. But I don't see how your TreeNode example can work. I understand your rspec that it works. In your example

parent = TreeNode.create
child = TreeNode.create
grand_child = TreeNode.create
parent.children << child  # creates (parent)-[:`#children`]->(child)
child.parents.to_a # => [] ! since it queries the #parents relationship

# Using the tree_nodes does not work either since it returns both parent and children, maybe that is ok.

So how can I declare a parent child tree structure ?

What I would like is specifying what type of relationship it is: outgoing, incoming, or bidirectional. E.g.

class Person
   has_many :friends, bidirectional: :friends  # if you and me are friends then me and you are friends

   has_many :knows, outgoing: :knows  # If I know somebody does not mean he knows me

   has_many :known_by, incoming: :knows  # everybody knowing me
end

Is it possible to rename your via, from and with parameters so they become more obvious?

andreasronge commented 10 years ago

Thanks @subvertallchris for explaining.

I'm not sure how useful this would be on its own. Personally, I'd never use it but I think it would make sense as an option.

Yes, this is the question, where does it make sense ? If it is not useful then this option should not be possible (or at least not as default when only one argument is given).

andreasronge commented 10 years ago

Maybe the main question is how similar the API should be to Active Record. I think the graph way of modelling is superior compared to Active Record/RDBMS, since using the graph way (by drawing incoming and outgoing relationships on a whiteboard) is more natural for modelling.

subvertallchris commented 10 years ago

I'd say a similarity to ActiveRecord should only be important if the methods and its keys look like AR but behave differently. Most AR users would expect has_many :users, to be an association to class User, and that seems reasonable. But with the right combination of defaults, keys to override those defaults, and documentation to explain those keys, it will all work out.

@cheerfulstoic, I've come around to the idea that relationship type should be defined explicitly by the user whenever possible, but I agree that if it is to be created automatically, it should be predictable. In other words, magic relationship types should match the method used to create or traverse the rel. My use of the old autogeneration syntax in my examples is irrelevant, I was just using them to demonstrate how a user could specify their own types.

I think Brian's most recent examples get the job done, but some simple changes could make it easier and satisfy all concerns.

First, I'd suggest that when you don't specify a relationship type, relationships are created going TO the passed object. That means that for reciprocal relationships, one of your classes must be a bit more specific about what relationship type it expects.

Second, I think we should use from_rel and to_rel to set relationship types. It makes everything extremely clear, I don't see how anyone could ever see from_rel: 'shows' and wonder how it works.

Third, an optional :bidirectional key could make traversals omit a direction and return all objects found over that relationship type. This is independent of the direction set when a relationship is created, and rel creations would still follow the same rule: omit to_rel/from_rel and it defaults to to.

Fourth, passing model: :any could skip the label during traversal and just follow the relationship type, returning all nodes at the other end.

class Band
  has_many :shows
end

class Show
  has_many :bands, from_rel: 'shows'
end

Band.first.shows
# match (n1)-[r1:`shows`]->(n2:Show) where ID(n1) = 1 return n2
Show.first.bands
# match (n1)<-[r1:`shows`]-(n2:Band) where ID(n1) = 2 return n2

class User
  has_many :buddies, to_rel: 'friends', model: User, :bidirectional
  # method is :buddies, the rest is as you'd expect. :bidirectional key makes it omit direction during traversal

  has_many :likes, model: :any, :bidirectional
  has_many :liked_by, model: :any, from_rel: 'likes'
  has_many :things_liked, model: :any, to_rel: 'likes'
end

u1 = User.create
u2 = User.create
u1.buddies << u2
# u1.buddies == u2.buddies because ":bidirectional" makes it omit direction in traversal

u1.likes
# match (n1)-[r1:`likes`]-(n2) where ID(n1) = 1 return n2

u1.liked_by
# match (n1)<-[r1:`likes`]-(n2) where ID(n1) = 1 return n2

u1.things_liked
# match (n1)-[r1:`likes`]->(n2) where ID(n1) = 1 return n2

Throw in an optional :class key to specify the ActiveRel wrapper class and I don't think there's anything you couldn't do.
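The four rules above can be sketched as a small pattern builder in plain Ruby. This is a sketch under stated assumptions, not the library's implementation: `traversal_pattern` is a hypothetical helper, the label is derived by naively stripping one trailing "s" from the association name, `bidirectional:` is written as a keyword argument, and raising when both to_rel and from_rel are given is just one arbitrary way to handle that case.

```ruby
# Hypothetical helper: builds the MATCH pattern for a has_many declaration
# following the four suggested rules.
def traversal_pattern(association, to_rel: nil, from_rel: nil, model: nil, bidirectional: false)
  raise ArgumentError, 'specify only one of to_rel and from_rel' if to_rel && from_rel

  # Rule 1 and 2: the relationship type defaults to the association name.
  type = (to_rel || from_rel || association).to_s

  # Rule 4: model: :any drops the label from the target node.
  label  = model || association.to_s.chomp('s').split('_').map(&:capitalize).join
  target = model == :any ? '(n2)' : "(n2:#{label})"

  # Rule 3: bidirectional only affects traversal and omits the direction.
  left, right =
    if bidirectional then ['-', '-']
    elsif from_rel   then ['<-', '-']
    else                  ['-', '->']   # default direction is TO the target
    end

  "(n1)#{left}[r1:`#{type}`]#{right}#{target}"
end

puts traversal_pattern(:shows)
puts traversal_pattern(:bands, from_rel: 'shows')
puts traversal_pattern(:likes, model: :any, bidirectional: true)
puts traversal_pattern(:liked_by, model: :any, from_rel: 'likes')
```

The printed patterns correspond to the match clauses in the examples above, minus the where/return parts.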

andreasronge commented 10 years ago

@subvertallchris I really like this ! It removes the magic generated relationships and it is very easy to understand. I like the parameter names, to_rel and from_rel (but maybe there is an even better name). Having bidirectional as a separate parameter is also great since it only says how the relationships are traversed (and not created which does not make sense unless we want to create two relationships one outgoing and one incoming in one go - don't think this is a good idea).

Btw, would it be possible to specify both to_rel and from_rel ? Example User.has_many :buddies, to_rel: 'friends', from_rel: 'friends' to create two relationships ?

I think this will work in inherited classes since they have labels from both super and subclasses.

class Person
  has_many :vehicles
end

class Vehicle
end

# Car nodes have both Car and Vehicle labels
class Car < Vehicle
end

Person.vehicles << Car.create << Vehicle.create  # this will work !
Person.vehicles.to_a # => [car, vehicle objs] This will also work
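The reasoning about inherited labels can be sketched in plain Ruby. This assumes, as described above, that a node gets one label per class in its inheritance chain; `labels_for` and `matches_association?` are hypothetical helpers for illustration, not neo4j.rb methods.

```ruby
class Vehicle; end
class Car < Vehicle; end
class Bicycle; end

# Hypothetical helper: one label per class in the chain, stopping before Object.
def labels_for(klass)
  klass.ancestors
       .take_while { |ancestor| ancestor != Object }
       .select { |ancestor| ancestor.is_a?(Class) }   # skip included modules
       .map { |ancestor| ancestor.name.to_sym }
end

# Hypothetical helper: would a node of node_class be returned by a traversal
# that matches on the label of target_class?
def matches_association?(node_class, target_class)
  labels_for(node_class).include?(target_class.name.to_sym)
end

p labels_for(Car)                      # => [:Car, :Vehicle]
p matches_association?(Car, Vehicle)   # => true
p matches_association?(Bicycle, Vehicle) # => false
```

Under that assumption a Car node is found by a traversal on the Vehicle label, while an unrelated class is not.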

Are there any other cases that I have missed regarding subclassing ? What about error handling, for example the class does not exist

class Person
   has_many :friends  
end

# There is no Friend class.
# What will happen now. Do we get an error, if so when will we get it ?

subvertallchris commented 10 years ago

I think that if you do has_many :friends while there is no Friend class and do not pass a valid model: key, it should raise an exception of Uninitialized Constant Friend. If someone does both from_rel and to_rel, it should either raise an exception or pick either the first or last declared.
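A minimal sketch of that lookup in plain Ruby, assuming the class name is derived from the association name by naively stripping one trailing "s". `resolve_model` is a hypothetical helper, not an existing neo4j.rb method. `Object.const_get` is standard Ruby and raises NameError for a missing constant.

```ruby
class Band; end

# Hypothetical helper: returns the explicit model if given, otherwise looks up
# a class named after the association.
def resolve_model(association, model: nil)
  return model if model

  class_name = association.to_s.chomp('s').split('_').map(&:capitalize).join
  Object.const_get(class_name)  # raises NameError if the constant is missing
end

p resolve_model(:bands)               # => Band
p resolve_model(:friends, model: Band) # => Band, explicit model wins

begin
  resolve_model(:friends)             # there is no Friend class
rescue NameError => e
  puts e.message
end
```

Whether the error appears when has_many is declared or only on first use depends on when this lookup runs. Calling it at declaration time fails early but requires the target class to be loaded first, which is awkward for classes that reference each other.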

cheerfulstoic commented 10 years ago

Sorry, there's so much stuff to respond to and I feel like I have so little time ;(

You're right, my examples wouldn't work in the TreeNode example. It's a good counter example, though I feel like it's more extreme and could easily be fixed by specifying relationship types. Maybe we could check for the situation and raise an error if there is no way to distinguish the relationships.

I like the idea of choosing one side to be the canonical side and then referencing the association from the other end, but I think that most of the time the association name is going to be different than what you'd want the relationship name to be. For your example:

class Band
  has_many :shows
end

class Show
  has_many :bands, from_rel: 'shows'
end

The relationship is (band)-[:shows]->(show), but I would expect the relationship to be like (band)-[:played_at]->(show) or (band)<-[:hosted]-(show). That's actually the main reason why I liked the prefixed "#": it made it clear that the relationship was created by an association, so it's not such a big deal if the wording isn't what you would have chosen had you created the relationship outside of ActiveNode.

For your other example, if we're talking about requiring a relationship type / direction every time (though I'm still a bit fuzzy on that point), then I think it makes sense to require it as an argument and not as a hash option, though I don't think that you can make a third argument like that after a hash. Maybe something like:

class User
  has_many :buddies, :outbound, 'friends', query_bidirectional: true

  has_many :likes, :bidirectional
  has_many :liked_by, :inbound, :likes, model: :any
  has_many :things_liked, :outbound, :likes, model: :any
end

I guess the thing that I really liked about the has_many :things_liked, via: :likes is that it reads like a sentence, which is a very Ruby-ish thing.

Also, from_rel: 'shows' actually would give me pause. My thought process is something like: "From the relationship? No, that doesn't make sense. From the subject node, or is it 'has many X from the target node'?" I like from/to better for the same reason as above, that it reads more like a sentence.

I feel like I could probably say more, but I want to keep it relatively simple so that this conversation doesn't explode any further. I'm anxiously awaiting being able to be in a timezone where we will overlap more and be able to Skype (though I'd be up for trying to Skype this Sunday if y'all are available)

andreasronge commented 10 years ago

I agree that Band.has_many :shows is a bad choice of relationship name.

@cheerfulstoic Ok, I understand why you wanted has_many with only one argument to generate a prefixed relationship. It could be useful if you are only interested in the outgoing relationship, will never follow its incoming relationship, and will never have more than one relationship to the other class (which is possible, but it will be confusing why some relationships have a name and others do not). I would rather let the developer type those extra characters in order to avoid future problems. I would also like the developer to think about how the two classes are related by giving the relationship a name, which I think should be the same as the first parameter of the has_many method.

I think it is a bad idea not specifying the relationship name since it is more important than which class it is related to. The relationship name gives meaning but the model only gives you a safety net (make sure the relationship only contains models of given type) and a convenience method for creating new models.

I'm against via since it does not go together with from and it is not a very Neo4j-ish name. I still prefer outgoing and incoming since that is what the Neo4j documentation uses.

Maybe the problem is the name of the method has_many. Has many what ? We have to make it obvious that it is the relationship we are talking about. Maybe has_many is not a good name.

subvertallchris commented 10 years ago

True story: I used playing_shows when I first started working on the app and those extra 8 characters got annoying when I had to type them over and over again, so I dropped it back to shows and decided to keep all of my relationship names as concise as possible. That particular relationship type only feels awkward if you're doing a lot of cypher, which might not happen in a Rails app as often as you think. I think there's one pure Cypher query in my app, everything else is done through declared rels in the models, so this worked for me. That's the strength of the API:

class Show
  has_many :bands, from_rel: 'playing_shows'
end

class Band
  has_many :shows, to_rel: 'playing_shows'
end

Best of both worlds. In Rails, band.shows.each; in Cypher, -[:playing_shows]->(n2:Show). If someone chooses to omit the type declaration, that's on them; they end up with a crappy type for their Cypher matches. If it's something you're really worried about, we can make a type declaration required and give best practices for naming.

from_rel and to_rel is pretty clear to me but it's totally ugly. inbound and outbound are much better and convey the same thing. I'd prefer them like this:

class User
  has_many :buddies, outbound: 'friends', query_bidirectional: true

  has_many :likes, :bidirectional
  has_many :liked_by, inbound: 'likes', model: :any
  has_many :things_liked, outbound: 'likes', model: :any
end

But that just feels cleaner to me and doesn't really matter all that much.

@andreasronge, if you want to get away from has_many, we could be more relationship-centric and differentiate the syntax from ActiveRecord by using inbound or outbound as the methods. Using the above as an example, you could achieve the results with:

class User
  outbound :buddies, type: 'friends', query_bidirectional: true

  outbound :likes, query_bidirectional: true
  inbound :liked_by, type: 'likes', model: :any
  outbound :things_liked, type: 'likes', model: :any
end

class Band
  outbound :playing_shows, model: Show
  #type defaults to "playing_shows"
end

I actually like this better, it feels more appropriate for Neo4j relationships and reads nicely.

subvertallchris commented 10 years ago

Since the type would match the method by default, you could declare the type first and then use as to specify the method:

class Band
  outbound :playing_shows, as: :shows, model: Show
  #synonymous with...
  outbound :shows, type: 'playing_shows', model: Show
end

#gives you
band.shows.each
# (n1)-[r1:`playing_shows`]->(n2:Show) return n2 where...

as could also be method, that might be a bit more explicit but it doesn't read as nicely. In fact, if you really want things to look nice, you make model, to, and from all do the same thing. Feels very Ruby. You could end up with:

class Band
  outbound :playing_shows, as: :show, to: Show
  # synonymous with...
  outbound :playing_shows, method: :show, model: Show
end

class Show
  inbound :bands, type: 'playing_shows', from: Band
end

But all from and to really do is declare a label to be used for matching and make it nicer to read, direction is controlled 100% by the method called.

Using method as a parameter feels like a forbidden word. Is that even allowed? Just spitting ideas out..

cheerfulstoic commented 10 years ago

So to clarify again I think we should be clear about using the words association vs. relationship (type). In has_many :bands I think that bands makes for a bad relationship type but a great association name. I think Chris also put it well regarding this issue.

I'm down for playing with the names and I think the inbound / outbound idea brings up a question I've had in my mind: Do we want has_one-type functionality? In neo4j everything is has_many-ish, so maybe people should depend on the first method. I was avoiding thinking about how to deal with has_one in an association chaining context, but if it's all one-to-many then that makes things a lot easier.

I really like the has_many, though (again because of the sentence-like structure of it). And I also really like making the method that you're calling required as the first argument (I want to be able to see what methods I can call by scanning down the model file). What about this?

class Band
  has_many_outbound :playing_shows, type: :playing
end

class Show
  has_many_inbound :bands, type: :playing
end

Of course there would be a has_many_bidirectional too. Though thinking about it, I don't like the fact that the first arguments don't line up nicely because the methods aren't the same length. Minor concern, I suppose, but again it is nice to be able to scan through your models. Another idea:

class Show
  inbound do
    has_many :bands, type: :playing

    # ...or, if we're going to always require the relationship type
    has_many :bands, :playing
  end
end

cheerfulstoic commented 10 years ago

Also, honestly, I keep thinking about how I really like my original proposal. If you say has_many :bands and don't specify a relationship or a direction, that's a really straightforward way to say that you want to query bidirectionally. If you don't specify a model and the association name doesn't match up with a model, then it's fine to just match on any model. If users of ActiveNode get the wrong results, they'll realize it quickly and learn that in those cases they need to specify a relationship type/direction/model when they want to be specific.

cheerfulstoic commented 10 years ago

(though I'm still totally happy to require relationship types, but I don't think you should have to specify direction / model)

subvertallchris commented 10 years ago

Definitely a good point about being careful about associations VS relationships! I'll keep that in mind.

has_one has always been very helpful for me. I scanned through my projects and found that I probably use it about 75% of the time, though the User model always seems to be mostly has_many, for whatever reason. I'd prefer to not rely on first but I certainly won't push if it's for the greater good! It is sort of... unnatural in a Neo4j context, huh?

andreasronge commented 10 years ago

Got some inspiration from your posts.

Why not make it similar to the property class method, and the core api ?

Similar to @subvertallchris outbound and @cheerfulstoic has_many_outbound methods above.

class Person
  rels :knows   # same as rels :knows, type: :knows, dir: :outgoing, to: :any

  # or specifying which class as well as direction
  rels :knows, to: self  # (or Person)
  # same as rels :knows, type: :knows, dir: :outgoing, node_class: :Person

  rels :known_by, from: :Person, type: :knows

  # for a single relationship, which the Java API supports directly
  rel :best_friend

  property :name
end

Or maybe we should call it relationships and relationship ?
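
As a rough illustration of the defaults described in the comments of the rels example, the sketch below normalizes the options into one hash. `normalize_rels` is a hypothetical name, and treating `from:` as implying an incoming direction is an assumption read off the `rels :known_by, from: :Person, type: :knows` line rather than something stated outright.

```ruby
# Hypothetical normalizer for the proposed rels options; not part of neo4j-core.
def normalize_rels(name, type: name, to: nil, from: nil)
  raise ArgumentError, 'give either to: or from:, not both' if to && from

  if from
    # Assumption: from: means the relationship points at this node.
    { name: name, type: type, dir: :incoming, node_class: from }
  else
    { name: name, type: type, dir: :outgoing, node_class: to || :any }
  end
end

p normalize_rels(:knows)
p normalize_rels(:knows, to: :Person)
p normalize_rels(:known_by, from: :Person, type: :knows)
```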

andreasronge commented 10 years ago

@cheerfulstoic regarding the original has_many proposal. Maybe if I can see an example application using it and compare it with for example the rels proposal above, I would be more positive about it. It is maybe a bit too clever for my taste and tries to hide away the fact that we are using a graph database.

subvertallchris commented 10 years ago

So! Brian and I just did a video chat to see if we could iron out some details. Hope you don't mind, we just wanted to get on the same page, and since there are so many little details, a conversation seemed like the best idea. Here is our grand proposal! (All examples apply to has_one, but has_many is used.)

has_many makes sense as a method

ActiveRecord uses it, Mongoid uses it, and it's a sensible way of envisioning related objects. But...

...all has_many/has_one methods begin with declaration of direction

Those can be :incoming, :outgoing, or :bidirectional.

:bidirectional comes with the caveat that the relationship created will be outgoing but matches will omit the direction.

Direction is too important to overlook because Neo4j relationships are so different from ActiveRecord, it should be at the forefront.

class Show
  has_many :incoming, :bands
end

Second parameter is a symbol that sets the method and defaults: relationship type and expected object class

class Show
  has_many :incoming, :bands
end

Automatically generated relationship types are prefixed with "#"

In the above example, the type is "#bands". The reason it is that and not just "bands" is because it acts as a reminder that this relationship type was created by ActiveNode. In practice, the idea is that if you come across this relationship through a match, you don't have to wonder what it is or where it came from, since you will not have specified it explicitly anywhere. It signifies that it was automatically generated.
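
A minimal sketch of that default, assuming an explicitly declared type always wins over the generated one. `relationship_type` is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper: pick the relationship type for an association.
def relationship_type(association_name, type: nil)
  # "##{...}" is a literal "#" followed by the interpolated association name.
  type ? type.to_s : "##{association_name}"
end

puts relationship_type(:bands)                            # generated type
puts relationship_type(:bands, type: 'performing_bands') # explicit type
```

The first call yields the generated `#bands` type, the second the declared `performing_bands` type.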

Polymorphic matches and overriding the model

By default, the second parameter tells the association to look for a class of that name. Use the model option to specify a target.

class Show
  has_many :incoming, :bands
  # looks for Band

  has_many :incoming, :people_playing_music
  # Uninitialized Constant PeoplePlayingMusic

  has_many :incoming, :people_playing_music, model: Band
  # overrides default
end

To define a polymorphic association, pass model: false.

class User
  has_many :outgoing, :managed_objects, model: false
end

class Show
  has_many :incoming, :admins, type: '#managed_objects', model: User
end

class Band
  has_many :incoming, :admins, type: '#managed_objects', model: User
end
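
One way to picture the effect of `model: false` is that the target node simply gets no label in the match. The sketch below is an assumption about how that could look in Cypher; `association_pattern` is a hypothetical helper, not the gem's implementation.

```ruby
# Hypothetical helper: build the match pattern for an association.
# model is a class name (or anything responding to to_s), or false for polymorphic.
def association_pattern(direction, type, model)
  target = model ? "(n2:#{model})" : '(n2)' # no label when polymorphic
  rel = "[r1:`#{type}`]"
  case direction
  when :outgoing then "(n1)-#{rel}->#{target}"
  when :incoming then "(n1)<-#{rel}-#{target}"
  else raise ArgumentError, "unknown direction: #{direction.inspect}"
  end
end

# User.has_many :outgoing, :managed_objects, model: false
puts association_pattern(:outgoing, '#managed_objects', false)

# Show.has_many :incoming, :admins, type: '#managed_objects', model: User
puts association_pattern(:incoming, '#managed_objects', 'User')
```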

Declaring relationship type

By default, it bases the relationship type on the symbol passed. Use "type" to declare a specific type.

class Show
  has_many :outgoing, :people_playing_instruments, type: 'performing_bands', model: Band
end

ActiveRel wrapped classes

Use the relationship_class keyword to specify a class wrapper. This could do validations and callbacks, act as a wrapper for later retrieval.

class Show
  has_many :incoming, :bands, relationship_class: ShowStatus
end

I think that covers it. It hits all bases: easy to use, easy to read, familiar for ActiveRecord but not so identical that it makes Neo4j a second-class citizen. It has logical defaults that are all easy to override.

All documentation would encourage the use of the "type" option. The prefixing of relationship types with "#" is just a default to keep things simple, but I think we should always encourage users to declare types.

What do you think?

subvertallchris commented 10 years ago

Forgot one example, just added it! (Mentioning this in case anyone's reading through email, you won't be notified of the change.)

cheerfulstoic commented 10 years ago

^ My name is @cheerfulstoic and I approve this message ^

andreasronge commented 10 years ago

Looks very good. I guess I just have to get used to it. It does have some advantages over my proposed rels above (which also has been updated), but also some disadvantages.

What I don't like is that one has to specify the incoming relationship type with a hash prefix, e.g. Show.has_many :incoming, :admins, type: '#managed_objects', model: User. But I guess, as you say, we should try to encourage people to use the type parameter. It feels like the hash-prefixed generated relationship type is something that we should try not to expose to the user, unless she wants to do some advanced stuff, e.g. writing Cypher queries without the DSL. In 2.x I also generated the relationship type, but the developer did not need to know the name of the generated relationship in order to create an incoming relationship (with has_n(:known_by).from(Person, :knows)).

What I do like about it is that one works with end node model classes more than thinking about the relationship itself, which probably is what most people want to do.

I also like that it is less magical than the first proposed version.

A question: how should we access the relationships? In 2.x I simply generated a method postfixed with _rels, which returns relationship objects instead.

A detail: I don't like the parameter name model and relationship_class because they are not consistent. Either model_class and relationship_class or just model and relationship or something else.

How about having your proposed has_many method simply delegate to my rels method above, which generates the method? The rels class method could simply delegate to the rels and nodes methods of neo4j-core with almost the same arguments. That means that very little documentation is needed for the rels methods.

andreasronge commented 10 years ago

Is it possible to fix it so that the type of an incoming relationship does not need to be prefixed with '#'?

subvertallchris commented 10 years ago

Awesome! Great points.

If we want to hide types from the user, we could add a ref method that takes the association name specified in the reciprocal class.

class Show
  has_many :outgoing, :bands
  has_one  :outgoing, :headliner, model_class: Band, type: 'headlining_band'
end

class Band
  has_many :incoming, :shows, ref: :bands
  # interpreted as, "refer to `bands` association on model `Show` to determine type"

  # or if you want a different method and an explicit model...
  has_many :incoming, :things_where_music_is_played, model_class: Show, ref: :bands

  has_many :incoming, :headlining_slots, model_class: Show, ref: :headliner
  # this would be useful even when you are defining a type on one side
end

In my examples, I specify the type in the outgoing side but it wouldn't need to be that way. Presence of type enforces type, presence of ref looks to class and association specified to determine the type.

Two big benefits to this:

First, you'll be able to look at some models and know that the relationships will work as expected from either model. The presence of a ref option and the lack of an exception will indicate that your relationship is set up correctly.

Second, you change the nature of the dependency between models in a good way. You no longer have to remember that changing the relationship type in Show requires an update in Band. Band still depends on Show, but at least changing the association name in Show would result in an exception in Band. By comparison, changing the type would silently stop matching without anything being visibly broken, so no error would be raised. This would hide the magic of automatic relationship types, which I think has pros and cons, but this would make all reciprocal relationships easier to set up and configure.
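
A minimal sketch of that lookup, assuming each model keeps a table of its declared associations and their types. `AssociationRegistry`, `declare`, and `type_from_ref` are hypothetical names for illustration, not the gem's API.

```ruby
# Hypothetical registry mapping model name -> association name -> relationship type.
class AssociationRegistry
  def initialize
    @types = Hash.new { |hash, model| hash[model] = {} }
  end

  # Record an association; the type defaults to the "#"-prefixed association name.
  def declare(model, name, type: nil)
    @types[model][name] = type || "##{name}"
  end

  # Resolve the type by looking at the reciprocal association on the other model.
  def type_from_ref(target_model, ref)
    @types[target_model].fetch(ref) do
      raise ArgumentError, "#{target_model} has no association named #{ref.inspect}"
    end
  end
end

registry = AssociationRegistry.new
registry.declare(:Show, :bands)
registry.declare(:Show, :headliner, type: 'headlining_band')

puts registry.type_from_ref(:Show, :bands)     # generated type from Show
puts registry.type_from_ref(:Show, :headliner) # explicit type from Show

begin
  registry.type_from_ref(:Show, :gigs) # renamed or missing association
rescue ArgumentError => e
  puts e.message
end
```

The last call shows the second benefit described above: a stale reference fails loudly instead of silently matching nothing.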

Does that work for you?

I agree about the whole relationship_class/model thing. Explicit is always better than implicit, so I vote rel_class (I think the abbreviation should be fine?) and model_class.

I don't exactly understand what you have in mind about delegation to rels, but I think that's a shortcoming of my own experience with Ruby. :-) @cheerfulstoic has been the one implementing these changes, so he can probably give feedback about that.

subvertallchris commented 10 years ago

ref is a pretty crappy keyword. origin might be nice?

cheerfulstoic commented 10 years ago

I like rel_class, model_class, and origin.

I'm not familiar with the rels method. Could you point me to a file/line?

One thing that I don't think @subvertallchris covered is that to access the relationship object you could do this:

band.shows(:show, :show_band).pluck(:show_band)

The idea is that when you're doing association chaining in order to give a variable to the end node you give a first argument, in order to give a variable to the relationship you give a second argument.
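
To show what naming the variables buys you, here is a rough sketch that builds the Cypher such a call might produce. `AssociationQuery` and `cypher_for_pluck` are hypothetical; the real pluck would run the query and return results, whereas this only returns the query string, and the `playing_shows` type is borrowed from the earlier examples.

```ruby
# Hypothetical stand-in for one association step in a query chain.
AssociationQuery = Struct.new(:type, :node_var, :rel_var) do
  # Return the Cypher that plucking the given variable would run.
  def cypher_for_pluck(var)
    unless [node_var, rel_var].include?(var)
      raise ArgumentError, "unknown variable #{var.inspect}"
    end
    "match (band)-[#{rel_var}:`#{type}`]->(#{node_var}) return #{var}"
  end
end

# band.shows(:show, :show_band)
query = AssociationQuery.new('playing_shows', :show, :show_band)

puts query.cypher_for_pluck(:show_band) # returns the relationship
puts query.cypher_for_pluck(:show)      # returns the end node
```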

I also have another proposal. I know this thread has gone on long enough so feel free to shoot it down, but a thought I had on the train this morning. First a quick ActiveRecord refresher:

class Post
  has_many :comments
  has_one :comment
end

class Comment
  belongs_to :post
end

In this example the comments table has a post_id column. The has_one association behaves like belongs_to in the sense that it just returns one object, but it's actually just getting the first object from the list of potential comments (and probably deleting any other comments when you say post.comment = Comment.create).

So I dig the need for has_one in ActiveNode, but has_one has always felt a little weird to me and I've avoided it when I can (though I still definitely find uses for it).

What about the following for ActiveNode (equivalent in current proposal in comment)

has_one :comment # has_one :outbound, :comment
belongs_to :comment # has_one :inbound, :comment
relates_to :comment # has_one :bidirectional, :comment

has_many :comments # has_many :outbound, :comments
belongs_to_many :comments # has_many :inbound, :comments
relates_to_many :comments # has_many :bidirectional, :comments

To make it more consistent we might change has_one to just has
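
The six macros above could be thin aliases over the direction-first form. The sketch below only records what each macro would expand to; `FriendlyAssociations` and `declared_associations` are hypothetical names, and nothing here touches the database.

```ruby
# Hypothetical module: each friendly macro expands to [cardinality, direction].
module FriendlyAssociations
  ALIASES = {
    has_one:         [:has_one,  :outbound],
    belongs_to:      [:has_one,  :inbound],
    relates_to:      [:has_one,  :bidirectional],
    has_many:        [:has_many, :outbound],
    belongs_to_many: [:has_many, :inbound],
    relates_to_many: [:has_many, :bidirectional]
  }.freeze

  # Declarations recorded on the class that extends this module.
  def declared_associations
    @declared_associations ||= []
  end

  ALIASES.each do |macro, (cardinality, direction)|
    define_method(macro) do |name, **options|
      declared_associations << [cardinality, direction, name, options]
    end
  end
end

class Post
  extend FriendlyAssociations

  has_many :comments  # has_many :outbound, :comments
  belongs_to :author  # has_one :inbound, :author
end

p Post.declared_associations
```

Because every macro reduces to a cardinality plus a direction, both spellings could coexist without duplicating any logic.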

Also BTW, I've implemented the has_many , , syntax on the query_experiment branch

cheerfulstoic commented 10 years ago

I also just realized that we could alternatively go with has_one, belongs_to_one, relates_to_one

Really, I'm just trying to get this thread to 100 comments ;)

andreasronge commented 10 years ago

It really gets better and better. Sorry, I'm in a rush, so I just scanned your posts. I have also thought about belongs_to and relates_to today.

I guess it is also related to cascade delete? When you delete a post, all its comments should be deleted as well if they belong to the post. If they only relate to the post, they are not deleted.

band.shows(:show, :show_band).pluck(:show_band)

Beautiful !

ref is a pretty crappy keyword. origin

Really great. I think I prefer origin.

I'm not familiar with the rels method. Could you point me to a file/line?

http://www.rubydoc.info/github/andreasronge/neo4j-core/Neo4j/Node#rels-instance_method

(Regarding the rels delegation: it is an implementation detail that could maybe be useful for end users as well, a sort of two-layer architecture. I'm not 100% sure it is a good idea; I need to play around with the code, I guess.)

has_one

We must have it - it is really useful. It is even supported in the neo4j api directly (Java API).

Btw, we should probably also support unique (maybe for 3.1 release) - having only unique relationships.

cheerfulstoic commented 10 years ago

I agree that has_one is useful! No more dissent from me there ;)

But I want to make sure you understood my suggestion that we not use inbound, outbound, bidirectional anymore and instead use belongs_to_(one|many) for inbound, has_(one|many) for outbound, and relates_to_(one|many) for bidirectional.

Also curious to hear @subvertallchris's view

subvertallchris commented 10 years ago

It certainly reads better and feels like a great way of expressing what's happening, but I also like just having two very flexible methods over six specialized methods. Master has_many/has_one methods offer a certain directness that appeals to me, but I see why they wouldn't appeal to everyone.

Could we compromise by offering has_(many|one) with direction for dinosaurs like me, and the friendlier methods for people who are more comfortable with the abstraction?

On Thursday, July 24, 2014, Brian Underwood notifications@github.com wrote:

I agree that has_one is useful! No more dissent from me there ;)

But I want to make sure you understood my suggestion that we not use inbound, outbound, bidirectional anymore and instead use belongs_to_(one|many) for inbound, has_(one|many) for outbound, and relates_to_(one|many) for bidirectional.

Also curious to hear @subvertallchris's view

subvertallchris commented 10 years ago

Err... I sent that from my phone and then edited without realizing that you want to still use has_(many|one) but change the meaning. I'm not so sure about that... but I won't object if you guys outvote me. I can adapt, I just like the directness of defining direction with two master methods. Feels like less of a learning curve, too.

cheerfulstoic commented 10 years ago

Yeah, let's just keep has_(one|many) :(inbound|outbound|bidirectional). It has the major advantage that it's already implemented ;)

I might implement (has|belongs_to|relates_to)_(one|many) in my own projects to try it out, though ;)

P.S.: 100 comments! This page is getting slower and slower for me. Y'all too?