neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License

Chain associations / class methods on QueryProxy #380

Closed cheerfulstoic closed 9 years ago

cheerfulstoic commented 9 years ago

Check out the query_experiment branch. For good examples see the query_spec:

https://github.com/andreasronge/neo4j/blob/query_experiment/spec/e2e/query_spec.rb

The code is a bit hairy right now, but I'm really excited about this approach. It takes over some of the conceptual ground of QuickQuery without the .qq method, introduces a more ActiveRecord-like has_many association, and allows calling class methods on QueryProxy objects (which are now also returned from has_many associations).

This is aping a lot of ActiveRecord, but I think it's taking it a step beyond what ActiveRecord is capable of doing. I'm really excited about it ;)

Some things that would need to be done:

cheerfulstoic commented 9 years ago

ATTN: @subvertallchris & @andreasronge

subvertallchris commented 9 years ago

Dude, this is great! RIP QuickQuery. I hope its API was at least helpful! I just looked through the specs, but I won't have a chance to dig through the code and play around with it until tomorrow. In the meantime, some questions/requests:

Can we specify which object in the chain to return? For instance:

# I'd expect this to return students who match the criteria
othmar.lessons_taught.students.where(age: 16).to_a
# Can we return the lessons?
othmar.lessons_taught(:l).students.where(age: 16).return(:l).to_a

What happens if through is not specified on a has_many relation?

cheerfulstoic commented 9 years ago

QuickQuery was definitely helpful. I feel like I've been mainly taking other people's ideas and tweaking them (ActiveRecord/QuickQuery ;)

I think that's a great idea. Could you give a shot at trying to implement that? I'm not online today, but I have some other thoughts for tomorrow.

Thanks! I'm excited about our progress!

Brian ;p

andreasronge commented 9 years ago

It looks fantastic. I really like the method chaining and the use of the declared relationships, like:

othmar.lessons_taught.students.where(age: 16).to_a.should == [sandra]

subvertallchris commented 9 years ago

@cheerfulstoic Didn't get a chance to work on it yesterday, I spent some time on those Rails compatibility methods, but I'm gonna check it out today.

cheerfulstoic commented 9 years ago

No problem. I've not had much time either, but hopefully I can get a bit done during the kid's nap ;) I might throw up some specs covering your use case as well as some I've thought of.

Regarding missing :through keys, at first I was thinking that it might make sense to auto-generate a relationship name based on the association, but actually I'm thinking it makes more sense to just do the query without any specified relationship name. That way people could set up associations to any node if they wanted to. I was also thinking it would make sense to provide arrays to the :through key so that you could do -[:REL1|REL2]->. The same could work for the to/from keys to match multiple labels, but it seems like we'd need to resort to a WHERE clause:

https://stackoverflow.com/questions/20003769/how-to-match-with-2-or-more-labels-in-neo4j-2-0-0

cheerfulstoic commented 9 years ago

Oh, and in case you didn't catch it, I introduced the ability to do this in neo4j-core to support association chaining:

  describe 'merging queries' do
    let(:query1) { Neo4j::Core::Query.new.match(p: Person) }
    let(:query2) { Neo4j::Core::Query.new.match(c: :Car) }

    it 'Merging two matches' do
      (query1 & query2).to_cypher.should == 'MATCH (p:`Person`), (c:`Car`)'
    end

    it 'Makes a query that allows further querying' do
      (query1 & query2).match('(p)-[:DRIVES]->(c)').to_cypher.should == 'MATCH (p:`Person`), (c:`Car`), (p)-[:DRIVES]->(c)'
    end
  end
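For readers without neo4j-core at hand, the idea behind the & operator in the spec above can be sketched in plain Ruby. This is a toy under stated assumptions, not the neo4j-core implementation: ToyQuery and its methods are hypothetical names, and only MATCH clauses are handled.

```ruby
# Toy illustration only: ToyQuery is a hypothetical stand-in,
# not Neo4j::Core::Query.
class ToyQuery
  attr_reader :matches

  def initialize(matches = [])
    @matches = matches.dup.freeze
  end

  # Accepts a hash such as { p: :Person } or a raw pattern string.
  # Returns a new query so that existing queries stay reusable.
  def match(arg)
    patterns =
      if arg.is_a?(Hash)
        arg.map { |var, label| "(#{var}:`#{label}`)" }
      else
        [arg.to_s]
      end
    ToyQuery.new(matches + patterns)
  end

  # Merging concatenates the MATCH patterns of both queries.
  def &(other)
    ToyQuery.new(matches + other.matches)
  end

  def to_cypher
    "MATCH #{matches.join(', ')}"
  end
end

query1 = ToyQuery.new.match(p: :Person)
query2 = ToyQuery.new.match(c: :Car)

merged = query1 & query2
puts merged.to_cypher
puts merged.match('(p)-[:DRIVES]->(c)').to_cypher
```

Because every call returns a new object, query1 and query2 are left untouched by the merge, which is what makes chaining from a shared starting point safe.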
cheerfulstoic commented 9 years ago

I was putting in a spec but then I realized it would be really easy to do, so I did it. So it's there now... ;)

I'm going to write up some more specs and see where I get by the end of naptime

cheerfulstoic commented 9 years ago

Oh, and you can now pluck on QueryProxy objects so that you don't need to query_as in these situations:

othmar.lessons_taught(:lesson).students.where(age: 16).pluck(:lesson)

cheerfulstoic commented 9 years ago

Never mind, a bad spec and overconfidence mean it's not implemented ;)

subvertallchris commented 9 years ago

If skipping :through makes it match without a relationship type, wouldn't this happen?

class Student
  include Neo4j::ActiveNode
  has_many :teachers, to: Teacher
  has_many :teachers_who_tried_to_have_them_assassinated, to: Teacher
end

s.teachers == s.teachers_who_tried_to_have_them_assassinated

I'm assuming it would generate MATCH (n:Student)--(n2:Teacher) WHERE ID(n) = 25 RETURN n2, right? Think about how awkward that would be on the first day of class! (Would also be a weird school to begin with, I guess.)

If you want to do a generic match to any class, it seems like you'd leave off the :to key, but you'd still need a relationship type. That's the way it works at the moment.

Maybe I'm in the minority here but I really like the way the gem currently handles relationship naming. By doing Class#rel_type, you can ensure the same relationship type isn't defined in more than one class and it makes things very predictable. I'm certainly not against the option to define the type explicitly, of course. But I am putting in one vote for this behavior to remain intact as an option:

class Student
  include Neo4j::ActiveNode

  has_many :teachers, to: Teacher
  has_many :teachers_who_tried_to_have_them_assassinated, to: Teacher
end

class Teacher
  include Neo4j::ActiveNode

  has_many :students, from: Student, as: :teachers
  has_many :students_who_survived_assassination_attempts, from: Student, as: :teachers_who_tried_to_have_them_assassinated
end

Although I guess I could achieve the same thing by just using :through. Perhaps these could be equivalent:

  #sending class
  has_many :teachers, to: Teacher

  #receiving class
  has_many :students, from: Student, as: :teachers
  #OR
  has_many :students, from: Student, through: 'Student#teachers'

subvertallchris commented 9 years ago

RE: Bad spec and overconfidence... I am so glad I'm not the only one who does that.

cheerfulstoic commented 9 years ago

The problem I see with that is that while I think it makes sense that student.teachers should return all teachers associated with the student, teachers_who_tried_to_have_them_assassinated should have an association specified, like: (s:Student)<-[:TRIED_TO_ASSASSINATE]-(t:Teacher)

I'm not sure I know what Class#rel_type does. Can you give an example?

cheerfulstoic commented 9 years ago

My bias is probably from ActiveRecord BTW. I often see (and do) things like this:

class Teacher < ActiveRecord::Base
  has_many :students
  has_many :students_on_whom_assassination_was_attempted,
           -> { where("students_teachers.assassination_attempted = 'true'") },
           class_name: 'Student'
end

(That would be on has_and_belongs_to_many I think)

subvertallchris commented 9 years ago

Maybe they should specify an association, but I don't think someone should have to if they don't want to. It's only really useful to know the type if you plan on doing cypher, which might not happen that often in a Rails app.

At the moment, it names relationship types using Class_name#defined_relationship_name. So if you do this:

# class Student
has_one :teacher
# rel is Student#teacher

has_n :lessons
# rel is Student#lessons

To link it explicitly to another class...

has_n(:lessons).to(Lesson)
# and in the Lesson class...
has_n(:students).from(Student, :lessons)

But the relationship is defined by the sender, so it is Student#lessons
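The naming rule described here can be written down as a small function. This is a sketch of the convention as stated in the thread, not code from the gem; rel_type_for and its keyword arguments are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper, not part of the gem: derives "Class#association"
# relationship type names as described in the comment above.
def rel_type_for(declaring_class, association, from: nil, as: association)
  # The sending side owns the name. A receiving side (from:) points back
  # at the sender's class and association instead of inventing its own type.
  sender = from || declaring_class
  name   = from ? as : association
  "#{sender}##{name}"
end

puts rel_type_for('Student', :teacher)
puts rel_type_for('Student', :lessons)
puts rel_type_for('Lesson', :students, from: 'Student', as: :lessons)
```

Both the declaration on Student and the reverse declaration on Lesson resolve to the same type, which is why the same relationship type never ends up defined by two classes.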

subvertallchris commented 9 years ago

My bias is probably from using Neo4j.rb 2.3 waaay more than ActiveRecord. I might just be attached cause I'm used to it. Not at all saying that what you're suggesting is worse, I'm just really comfortable with my has_n. :-D

subvertallchris commented 9 years ago

I guess my point is really that if I'm specifying a relationship with a dedicated from/to class, I'd expect a dedicated relationship type. Leaving off :through would really just signify that I either don't plan on writing Cypher or am OK using the relationship type that's defined by ActiveNode.

cheerfulstoic commented 9 years ago

Ah, interesting. Personally I think I'd prefer explicitly named relationships to the auto-formatted, but honestly thinking about it now I can see why it could be nice to not have to worry about them... I was actually thinking of auto-generating the names like ActiveRecord does where you singularize/transform capitals/underscores and whatnot, but I hadn't thought of a good way yet. I think I actually like the Class#method format better.

Ok, so what about something like has_many :teachers, through_any: true which would specify a relationship like -->? Or maybe to: :any / from: :any?

cheerfulstoic commented 9 years ago

Gah, sorry, don't know why I thought to: :any / from: :any would work ;)

subvertallchris commented 9 years ago

I'm a big fan of letting things be omitted when there's a possibility for a reasonable assumption about what they want. What about just not specifying a to/from key?

has_many :likes

You'd just generate MATCH (n1:User)-[:`User#likes`]-(n2) RETURN n2.

Or if you want to change the name for some reason...

has_many :likes, through: :likes_things

MATCH (n1:User)-[:likes_things]-(n2) RETURN n2

That'd get the job done, right? You could go anywhere and you don't need to specify the direction anyway.

cheerfulstoic commented 9 years ago

Yeah, I'm not disagreeing with that, I'm saying sometimes you'll want an association which is through any relationship type. So I think both of your examples are good, but also we should support:

has_many :likes, through_any: true

Which generates:

MATCH (n1:User)--(n2) RETURN n2
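The three cases discussed so far (auto-generated type, explicit :through, and :through_any) can be lined up in one sketch. The helper name association_match is hypothetical, and the backticks around labels and types are an assumption borrowed from the neo4j-core output shown earlier in the thread; the gem's real query generation may differ.

```ruby
# Sketch under assumptions: association_match is a hypothetical helper,
# not the gem's API.
def association_match(owner_label, name, through: nil, through_any: false)
  rel =
    if through_any
      ''                              # match any relationship type
    elsif through
      "[:`#{through}`]"               # explicitly named type
    else
      "[:`#{owner_label}##{name}`]"   # auto-generated Class#association type
    end
  "MATCH (n1:`#{owner_label}`)-#{rel}-(n2) RETURN n2"
end

puts association_match('User', :likes)
puts association_match('User', :likes, through: :likes_things)
puts association_match('User', :likes, through_any: true)
```

Making through_any an explicit opt-in keeps the default predictable: leaving options off never silently widens the match to every relationship.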

subvertallchris commented 9 years ago

Ooh I missed that part! Yes, that is really a fantastic idea!

cheerfulstoic commented 9 years ago

Great, implemented! Naptime's over!

cheerfulstoic commented 9 years ago

Ok, now with somewhat more confidence I can say that mid-association chaining is now possible. Very cool idea and I'm excited to have it ;) I'm not entirely happy with the code, but the specs pass ;)

subvertallchris commented 9 years ago

Hey @cheerfulstoic, Should specs be passing? I'm finally trying to mess around with it but I'm getting some errors when I try to use it.

cheerfulstoic commented 9 years ago

No, the specs aren't currently passing. I was focusing on spec/e2e/query_spec.rb, but I've done a lot of refactoring and I've also added one failing spec on purpose ("allows association with properties"). The other one in query_spec.rb is failing because of a bug, I think. I haven't tried to fix bugs in other spec files yet.

I should actually use this opportunity to run something I've been thinking about by you. You suggested the following:

othmar.lessons_teaching(:lesson).students.where(age: 16).pluck(:lesson)

So that you can refer to the lesson from the middle of the chain. The problem is that we'll also want to limit on relationships I think. What I'm thinking is this:

# Gets lesson nodes
othmar.lessons_teaching.as(:lesson).students.where(age: 16).pluck(:lesson)

# Gets relationships between teachers and lessons.
othmar.lessons_teaching(:r).students.where(age: 16).pluck(:r)

# Filter on relationship between teachers and interests
monster_trucks.interested(intensity: 11).to_set.should == [othmar]
monster_trucks.interested(r: 'r.intensity < 5').to_set.should == [samuels]

These are actually examples from the spec and I believe I was most of the way there to implementing them.

cheerfulstoic commented 9 years ago

The specs should be in a better place now. I fixed most of them. A few spec errors only happen when you run the full suite and are because we used the same classes in quick_query_spec.rb and query_spec.rb ;)

There's one last spec which is failing because it hasn't been implemented yet (query_spec.rb:146 "allows association with properties")

subvertallchris commented 9 years ago

Awesome! Glad they were expected to fail and it wasn't just that I somehow messed up my local branches.

That syntax looks REALLY good. There are a few little things that I'd like to request, though. You may have already considered them and feel free to disagree, this is just my little syntactical wishlist.

To steal directly from the QuickQuery API, I think it would be helpful to pass search parameters in the traversal methods as much as possible. The as and where methods, while good options, feel like they should be optional. (You'll always need a leading call to as if you want to search on a class and return that starting point, e.g. Student.as(:s).lessons.pluck(:s).)

# Gets lesson nodes
othmar.lessons_teaching.as(:lesson).students.where(age: 16).pluck(:lesson)
# OR
othmar.lessons_teaching(:lesson).students(age: 16).pluck(:lesson)

# Gets relationships between teachers and lessons.
othmar.lessons_teaching(:r).students.where(age: 16).pluck(:r)
# OR
othmar.lessons_teaching(:r).students(age: 16).pluck(:r)

# Filter on relationship between teachers and interests
monster_trucks.interested(:user, rel_as: :r).where(r: { intensity: 11 })
# OR
monster_trucks.interested(:user, rel_as: :r).where(:r, 'intensity > 10')
# OR
monster_trucks.as(:m).interested(:user, rel_as: :r, rel_where: { intensity: 11 }).pluck(:m)
# OR...
monster_trucks.interested(:user, rel_as: :r, rel_where: 'intensity < 5').pluck(:user)

# in other words:
starting_object.as(:first_identifier).declared_rel1(:destination_node_identifier1, node_property1: value, node_property2: value, rel_as: :relationship_identifier, rel_where: 'rel_property > value').pluck(:relationship_identifier)

The biggest difference is that I think it's more useful for parameters passed to traversal methods to implicitly refer to the destination class properties, not the relationship. Matching criteria for the relationship itself could use rel_where as a key in the method or the where method with an identifier (relationship or node) later on. The reason for this is really just that I think matching based on class properties is a better default than searching based on relationship properties. Every node has properties, but more often than not, the presence of the relationship is enough to perform a basic match.

But as far as I can tell, that's really the only thing I'd like to see changed! Might have more for you once I dig into it, though. ;-) It looks really great.

I meant to get into it over the weekend but I fell into this total feedback loop of working on Rails usability features and just couldn't break away from it. I got a copy of my production data from my Neo4j.rb 2.3 app working in the latest release, so that revealed all sorts of things that needed fixes, tweaks, and review. Lots of performance considerations, gave me a lot to think about. There's a whole other conversation there, though!

cheerfulstoic commented 9 years ago

I think I'm in agreement about using the association's argument to match on the node. I actually hadn't even thought about the possibility of doing Student.as(:s).lessons.pluck(:s) but it makes a lot of sense (I suspect it wouldn't work now, but probably wouldn't take too much).

My head gets a bit dizzy with all the possibilities ;) I was trying to avoid the rel_where call because I didn't like the idea of having two wheres and being able to call the rel_where after the where (see below). I think I'm starting to understand your concerns and have some suggestions:

Want to generate something equivalent to:

MATCH (interest:Interest)<-[rel]-(person) WHERE ID(interest)=123 AND rel.intensity < 5 AND person.name = 'Ms. Othmar'
# QuickQuery way
monster_trucks.interested(:person).rel_where(r: 'r.intensity < 5').where(name: 'Ms. Othmar')
monster_trucks.interested(:person).where(name: 'Ms. Othmar').rel_where(r: 'r.intensity < 5')

# My idea
monster_trucks.interested(rel: 'rel.intensity < 5').as(:person).where(name: 'Ms. Othmar')

# Other ideas

monster_trucks.interested(:person, r: 'r.intensity < 5').where(name: 'Ms. Othmar')

# Optional node and relationship variables.  Node always defined first
monster_trucks.interested(:person, :r).where('r.intensity < 5', name: 'Ms. Othmar')
monster_trucks.interested(:person, :r).where('r.intensity < 5').where(name: 'Ms. Othmar')

# Optional node and relationship variables.  Use a hash so that any order can be specified
monster_trucks.interested(node: :person, rel: :r).where('r.intensity < 5').where(name: 'Ms. Othmar')

I think my favorite is the second to last (the "Node always defined first"). But I'm also wondering about the call to as immediately on the class level. Seems maybe odd to do that for just the class level and specify the variables for the associations in the arguments to the association method. Maybe this:

Interest.as(:interest).interested.as(:person, :r).where('r.intensity < 5').pluck(:interest, :person)
# or
Interest.as(:interest).interested.as(node: :person, rel: :r).where('r.intensity < 5').pluck(:interest, :person)

Now that I type it out, maybe it wouldn't be so weird to just have as defined on the class and not available through association chaining
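To make the "node always defined first" option concrete, here is a rough sketch of how optional node and relationship identifiers could feed into the generated Cypher. ChainSketch and hop are hypothetical names (a real QueryProxy would expose the declared association name instead of hop), and hash values are inlined with inspect purely for readability; nothing here is taken from the gem.

```ruby
# Hypothetical sketch; ChainSketch is not the gem's QueryProxy.
class ChainSketch
  def initialize(start_var, start_label)
    @pattern   = "(#{start_var}:`#{start_label}`)"
    @last_node = start_var
    @wheres    = []
    @counter   = 0
  end

  # Node identifier first, relationship identifier second, both optional.
  def hop(node_var = nil, rel_var = nil, direction: :in)
    node_var ||= "n#{@counter += 1}"
    rel   = rel_var ? "[#{rel_var}]" : ''
    arrow = direction == :in ? "<-#{rel}-" : "-#{rel}->"
    @pattern += "#{arrow}(#{node_var})"
    @last_node = node_var
    self
  end

  # Strings pass through verbatim; hashes apply to the most recent node.
  def where(*conditions)
    conditions.each do |cond|
      if cond.is_a?(Hash)
        cond.each { |key, value| @wheres << "#{@last_node}.#{key} = #{value.inspect}" }
      else
        @wheres << cond
      end
    end
    self
  end

  def to_cypher
    cypher = "MATCH #{@pattern}"
    cypher += " WHERE #{@wheres.join(' AND ')}" unless @wheres.empty?
    cypher
  end
end

query = ChainSketch.new(:interest, :Interest)
                   .hop(:person, :r)
                   .where('r.intensity < 5', name: 'Ms. Othmar')
puts query.to_cypher
```

The string condition has to name its identifier (r) explicitly, while the hash condition is attached to the latest node (person), which matches the split proposed above.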

subvertallchris commented 9 years ago

Yes! I like this. Your QuickQuery example is wrong, though -- it actually works more like your example. rel_where is just a parameter of the traversal method, just like rel in the next example. I totally agree, a rel_where method is confusing and unnecessary. The right way to do it in QQ is this:

monster_trucks.interested(:person, name: 'Ms. Othmar', rel_where: 'intensity < 5')
#or... you can't do a comparison with this syntax in QQ, so showing with an exact match
monster_trucks.interested(:person, name: 'Ms. Othmar', rel_as: :r).where(r: { intensity: 5 })
#or
monster_trucks.interested(:person, rel_where: 'intensity < 5').where(name: 'Ms. Othmar')
#or
monster_trucks.interested(:person, rel_where: 'intensity < 5').where(person: { name: 'Ms. Othmar' })

Symbol alone sets the destination node identifier, rel_as hash key to define rel identifier, rel_where key with string value for comparison match, rel_where with hash value to do exact rel property match. Yes, head spins. :-D

I really like your third and fourth-to-last examples best. As odd as it might be to call as once at the beginning, I think calling it multiple times when those same symbols could be passed as params on the traversal method just feels like extra typing. I'd also love to see one little option:

.where('r.intensity < 5')
#or
.where('intensity < 5')

I imagine you're keeping track of the identifiers created as the query is built? If so, you can look at the string and either use the valid identifier or plug one in if it's missing. Something like this:

# "arg" is the string passed, "on_deck_target" is the most recently defined node identifier and what it uses if it can't find a valid identifier, @identifiers is an array of all identifiers set so far
def process_string(arg, on_deck_target)
  @identifiers.include?(arg.split('.').first.to_sym) ? arg : "#{on_deck_target}.#{arg}"
end

As much as I like being able to set match parameters in the traversal call like in QQ, I feel like the rule "You define identifiers in the traversal, you specify match parameters with a separate call to where" is clean and opinionated in a way that will improve readability by forcing consistent use. I don't know why I generally have an aversion to extra methods, I just tend to gravitate towards "EVERYTHING IS A PARAMETER!" How would ActiveRecord do it?

subvertallchris commented 9 years ago

Oh! And I only clarified the QQ syntax to point out that we're pretty much describing the same thing, sorry if that came off like I was being a dick!

And the more I think about it, I guess that having the gem figure out which identifier the user wants during a comparison is kind of unnecessary and could be confusing. If they're already specifying the identifier earlier in the chain, is it really that hard to remember to use it in your where string? I won't push for that. ;-)

cheerfulstoic commented 9 years ago

Cool, I think we've got a good way forward! ;) I definitely wanted to talk about it before deciding because it'll be hard to change it once it's out there.

About the process string stuff, I've thought about it a number of times, but I really don't want to get into that because to do it right I think we'd need to build a partial cypher parser. There are a lot of cases we wouldn't be able to handle like:

.where('5 > intensity')
# and
.where('intensity > 5 AND intensity < 8')

Allowing users to name the relationships/nodes and then letting them use those names in the where clauses seems a lot better.
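Running the process_string heuristic proposed earlier in the thread against these inputs shows the problem. The version below is a standalone adaptation for illustration: the identifiers are passed in as an argument instead of living in an instance variable, which is an assumption made so the snippet runs on its own.

```ruby
# Standalone adaptation of the heuristic suggested earlier in the thread.
# identifiers is passed in explicitly here (an assumption for runnability).
def process_string(arg, on_deck_target, identifiers)
  identifiers.include?(arg.split('.').first.to_sym) ? arg : "#{on_deck_target}.#{arg}"
end

ids = %i[person r]

# Works: a known identifier is left alone, a bare property gets prefixed.
puts process_string('r.intensity < 5', :r, ids)
puts process_string('intensity < 5', :r, ids)

# Breaks: the prefix lands on the literal, or only on the first property.
puts process_string('5 > intensity', :r, ids)
puts process_string('intensity > 5 AND intensity < 8', :r, ids)
```

The last two calls produce invalid or half-qualified Cypher, and handling them correctly would mean tokenizing the expression, which is the partial parser the comment wants to avoid.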

And as far as ActiveRecord goes, it might be something like:

monster_trucks.people.where("interest_people.intensity > 5")
# and

# Use #joins to do an SQL join on the association
othmar.lessons_teaching.joins(:students).where("students.age = 16")

subvertallchris commented 9 years ago

Go team! This looks great. We'll need to figure something out to protect against Cypher injection, right? I like ActiveRecord's handling of it:

Student.where("age > ?", age)

I've never looked at the code behind that, is it doable here?

cheerfulstoic commented 9 years ago

Good point. I think it might be best to stick with cypher params, like:

Student.where("age > {age}").params(age: 14)

That way we get the opportunity to take advantage of query caching as well.
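A small sketch of why parameters help with both concerns: the query text never changes, so user input cannot alter it, and the same text can be reused across values. ParamQuery is a hypothetical class for illustration, not the gem's API. The {age} placeholder is the syntax used in this thread; newer Neo4j versions write parameters as $age.

```ruby
# Hypothetical sketch: ParamQuery is not the gem's API.
class ParamQuery
  attr_reader :cypher, :parameters

  def initialize(cypher, parameters = {})
    @cypher     = cypher
    @parameters = parameters
  end

  # Values travel alongside the query text and are never spliced into it.
  def params(values)
    ParamQuery.new(cypher, parameters.merge(values))
  end
end

base   = ParamQuery.new('MATCH (s:Student) WHERE s.age > {age} RETURN s')
teens  = base.params(age: 14)
adults = base.params(age: 18)
evil   = base.params(age: '14 OR 1=1')

puts teens.cypher == adults.cypher
puts evil.cypher == base.cypher
puts evil.parameters.inspect
```

Since the hostile string stays inside the parameters hash, the server would treat it as a value to compare against rather than as Cypher.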

subvertallchris commented 9 years ago

Oh yeah, that is a great idea!

cheerfulstoic commented 9 years ago

Ok, I think that's all implemented!

Have a look at the neo4j query_spec.rb file to see if the API looks good to you. I'm pretty excited about it ;)

Also, don't forget to pull your neo4j-core. I made some supporting changes there too.

I feel like the changes in neo4j could use refactoring, but I've got to go and best to just let the glow last for a little while longer ;) Still lots more to do!

subvertallchris commented 9 years ago

Ah! You rule. I will check it out tomorrow for sure. Very excited about this. If that's all working, we can pull Quick Query out. I can update the wiki tomorrow.

If you're still around, can you merge my PR on Core? I don't have access to that one. It's a pretty important bug fix for Java and I need it in there before I can submit some changes on this gem.

cheerfulstoic commented 9 years ago

Dig it. I'd definitely like to get this into master and release it so that we can get some feedback. Before that I'd like to finish:

andreasronge commented 9 years ago

Well done, also very excited about this. I've tried the latest and got the following:

class Person
  include Neo4j::ActiveNode
  has_many :friends
end
p = Person.create
p2 = Person.create
p.friends << p2

#=> Neo4j::Core::QueryBuilder::InvalidQueryError: Don't know how to generate a cypher return {:q=>nil}
from /Users/andreasronge/.rvm/gems/ruby-2.1.2/gems/neo4j-core-3.0.0.alpha.16/lib/neo4j-core/query_builder.rb:124:in `cypher_default_return'

Maybe I've done something wrong. I very seldom used prefixed relationship names like Person#friends generated by has_n(:friends).to(Something) because it is less flexible in my neo4j.rb 2.x applications. There is also a problem with inheritance with the relationships if you prefix it with the class name.

subvertallchris commented 9 years ago

You're still running alpha 16 locally! I think he's made changes since alpha 17 that make this work.

I go the other way, I always use has_n(:friends).to(Something) cause I like the predictability of it.

cheerfulstoic commented 9 years ago

Just released an alpha.18 and changed the branch to depend on that. Does that help?

cheerfulstoic commented 9 years ago

FYI, if you specify

has_many :friends

It will create ANY#friends relationships, but when you query on the association it queries without specifying a relationship name. Similarly:

has_many :friends, to: Something

will create Something#friends relationships but not specify a name in the query. It's only if you specify a through key that it queries for that relationship specifically

subvertallchris commented 9 years ago

I still think that behavior is confusing. If you have two declared relationships that go to the same model, they'd both return the same objects unless you specify :through for both of them, right?

cheerfulstoic commented 9 years ago

First off, shall we adopt ActiveRecord's nomenclature of 'associations' to refer to the model-level part?

I think you're right. Here's an example:

class Person
  has_many :friends
  has_many :enemies
end

dr_horrible.friends << moist
dr_horrible.enemies << captain_hammer

dr_horrible.friends # => [moist, captain_hammer]
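The overlap can be reproduced without a database. This is a minimal sketch, assuming relationships are stored with a type at creation time but queried without one, as described above. TinyGraph is a hypothetical in-memory stand-in, and the ANY#friends style names follow the naming mentioned earlier in the thread.

```ruby
# Toy in-memory graph; TinyGraph and its methods are hypothetical.
class TinyGraph
  Rel = Struct.new(:from, :type, :to)

  def initialize
    @rels = []
  end

  def relate(from, type, to)
    @rels << Rel.new(from, type, to)
  end

  # type: nil mimics querying without a relationship name.
  def neighbours(node, type: nil)
    @rels.select { |r| r.from == node && (type.nil? || r.type == type) }
         .map(&:to)
  end
end

graph = TinyGraph.new
graph.relate(:dr_horrible, 'ANY#friends', :moist)
graph.relate(:dr_horrible, 'ANY#enemies', :captain_hammer)

untyped = graph.neighbours(:dr_horrible)
typed   = graph.neighbours(:dr_horrible, type: 'ANY#friends')

puts untyped.inspect
puts typed.inspect
```

The untyped lookup returns both nodes, while filtering on the stored type returns only the friend, so the information needed to tell the two associations apart is already in the graph.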

The reason I did that was for when you have:

class Teacher
  has_many :interests, to: Interest
end
class Student
  has_many :interests, to: Interest
end
class Interest
  has_many :interested
end

danny.interests << reading
bobby.interests << math

math.interested # Should query without specifying a relationship (?)

Probably should have both cases in the specs. Going to need to start organizing those specs better soon, too.

cheerfulstoic commented 9 years ago

Also, FYI, I drew the following distinction which might be helpful:

class Model
  has_many :foos # Auto-generate relationship
  has_many :foos, through: false # No relationship name (how do we create here?)
  has_many :foos, through: :food
end

I was also just thinking that we should do like ActiveRecord does and auto-figure the class from the association name if there is no to/from/with ('with' means bidirectional, BTW), but I see two problems with that:

subvertallchris commented 9 years ago

Ahh, I see why you did it that way now. That first example is my exact concern. I've been coming from the Neo4j.rb 2.3 perspective, where your has_n/has_one parameter is the type.

What you were going for makes sense and got me thinking to suggest doing it more ActiveRecord style, where your first param in has_many/has_one is a class constant and then you use :through to specify the type, but it would get messy as you add models. You'd need to define a method for each rel or one catchall that could go anywhere. (Though I guess it's still better than ActiveRecord's migration madness and join tables. ActiveRecord would force you to do all the SQL crap PLUS define the relationships.)

Overall, I think I still really prefer the 2.3 way of handling it. It just gives you the best combination of flexibility and predictability... though it would definitely be nice to change relationship methods, especially if you're using a pre-existing or shared database.

What about a blend of both? has_one/has_many methods with :through key to override the type. Here's a gist with some pseudo-ActiveNode classes that demonstrates how the syntax would let you change the method name and type. It adapts to the included keys, with only the initial :method name being required. It doesn't use ActiveNode at all and fakes two classes, plus some examples to show how they'd behave relative to each other.

https://gist.github.com/subvertallchris/f3d1f28d0e7e15c866de

I think it's really easy to read and follow, and it should be flexible and easy to use.

subvertallchris commented 9 years ago

This is pretty much in line with your last example, too. I don't have a good answer for specifying direction if to/from is omitted... I think that you'd just need to do direction-agnostic matches when they're used, though I don't know what kind of repercussions that would have, if any.

As for them being polymorphic, wouldn't it only be a major concern if you're trying to chain methods in a query? QueryProxy wouldn't be able to figure out where your next class is and wouldn't give you more traversal methods. In those cases, they'd have to go back to pure Cypher, I guess? Or are there more problematic issues I'm overlooking?

cheerfulstoic commented 9 years ago

I'm still trying to absorb all of this, but I wanted to write down an example of something I've been thinking about. I feel like in saying "to: Model" we're not doing the right thing because the relationship is where we deal with direction. The following potentially takes care of that and also allows us to auto-figure the models based on the association name like ActiveRecord.

class Teacher
  has_many :lessons # (:Teacher)--(:Lesson)
  has_many :lessons, via: :teaches # (:Teacher)-[:teaches]->(:Lesson)
  has_many :lessons, from: :taught_by # (:Teacher)<-[:taught_by]-(:Lesson)

  # In case we have a SchoolLesson model but want to use 'lessons' as an association name
  # Could be used with via or from as well, this would just override the auto-figuring from the association name
  has_many :lessons, model: SchoolLesson
end
class Lesson
  has_many :teachers # (:Lesson)--(:Teacher)
  has_many :teachers, from: :teaches # (:Lesson)<-[:teaches]-(:Teacher)
  has_many :teachers, via: :taught_by # (:Lesson)-[:taught_by]->(:Teacher)
end
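The mapping from these options to Cypher patterns can be sketched as follows. association_pattern is a hypothetical helper written only to illustrate the proposal, and the singularization is deliberately naive (it just drops a trailing "s"); a real implementation would presumably use an inflector.

```ruby
# Hypothetical helper illustrating the proposed via:/from:/model: options.
def association_pattern(owner, name, via: nil, from: nil, model: nil)
  raise ArgumentError, 'use either via: or from:, not both' if via && from

  # Naive singularization: "lessons" -> "Lesson". An explicit model: wins.
  target = (model || name.to_s.sub(/s\z/, '').capitalize).to_s

  if via
    "(:#{owner})-[:#{via}]->(:#{target})"    # outgoing relationship
  elsif from
    "(:#{owner})<-[:#{from}]-(:#{target})"   # incoming relationship
  else
    "(:#{owner})--(:#{target})"              # any type, any direction
  end
end

puts association_pattern(:Teacher, :lessons)
puts association_pattern(:Teacher, :lessons, via: :teaches)
puts association_pattern(:Teacher, :lessons, from: :taught_by)
puts association_pattern(:Teacher, :lessons, model: 'SchoolLesson')
```

Putting the direction on the relationship option (via: versus from:) frees the association name to determine the target model, which is the ActiveRecord-style inference mentioned above.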

I feel like our ActiveRecord / Neo4j brains are conflicting here, but that's probably for the best because we want to make both types of people happy ;)

subvertallchris commented 9 years ago

I think I'm getting the hang of what you're doing here. Like you said, more like ActiveRecord. The big difference in our descriptions seems to be a model-first versus method-first approach. Ultimately, what you're describing would totally work for me, since here's what most of my models look like right now:

class Show
  has_n(:bands).to(Band)
  has_one(:venue).to(Venue)
end

class Band
  has_n(:shows).from(Show, :bands)
end

class Venue
  has_n(:shows).from(Show, :venue)
end

It would look like this with your syntax:

class Show
  has_many :bands
end

class Band
  has_many :shows
end

class Venue
  has_many :shows
end

By just pointing it at the destination class, you make it a lot simpler. I see what you're doing, I concede. Let's maybe get Andreas to chime in before carving it in stone.

My one request is you find a way to use a string as a relationship type. In the interest of backwards compatibility, I need to be able to set types like "Show#band". People connecting to existing databases might have similar type names.

<% if cypher_n00b %> Is there a performance benefit for using a relationship type in a match? I know that they recommend you always use identifiers for all nodes and relationships to take advantage of Cypher caching, but does the server have to inspect all related nodes to find the right labels? Or does the label take care of that? <% end %>

cheerfulstoic commented 9 years ago

I'm glad you like the syntax (again, I think I'm mostly stealing from ActiveRecord). After our revelation about relationship names in creation vs. querying I'm feeling like there's something missing. I think we're at the point where it needs to be explored in specs ;)

I believe there's some performance benefit to using a relationship type. IIRC once neo4j finds a node, it has a reference to the first relationship in the list and then it has to search the relationships via a doubly linked list. So if you specify a relationship type there's a possibility of stopping early, rather than following all relationships to find nodes that match the next part of the query. You might want to check out this book if you haven't already (it's free online): http://graphdatabases.com/

It has a section on the implementation of neo4j's nodes and relationships on disk.