rails-api / active_model_serializers

ActiveModel::Serializer implementation and Rails hooks
MIT License
5.33k stars 1.39k forks source link

[RFC]Thoughts on an explicit way to define nested relationships? #1113

Closed NullVoxPopuli closed 9 years ago

NullVoxPopuli commented 9 years ago

I've been working on sideloading (#1107) for the json adapter, and needed recursive relationship building of the json object.

The problem with doing a strictly recursive approach (independent of sideloading or not), is that if you have enough references on your objects to other objects, a single request (of any kind) is basically going to return the whole database. Example: Post belongs_to an Author which has_many Posts, and each Post has_many Comments, and each Comment belongs_to an Author.... it and would just keep going. The code in #1107 at least doesn't allow for things to be serialized multiple times (like if a post has the same author, that author only gets serialized once).

So, a potential solution that I'm working on prototyping right now is something like this:

PostSerializer < #...
  has_many :comments, include: :author
end

So, instead of recursively serializing the comments, we'd serialize them like we do today, but because we have the author key in the include hash, we'd also serialize each of the coments' authors.

BlogSerializer < #...
  has_many :posts, include: { author: [:bio], comments: [:author]] }
end

resulting json may look like:

{
  blog: {
    posts: [
      {
        author: {
          bio: {...}
        },
        comments: [
          {  author: {...} }
        ]
      }
    ]
  }
}

or

AuthorSerializer < #...
  has_many :blogs, include: { posts: [:author, comments: :author] }
end

resulting json may look like:

{
  author: {
    blogs: [
      {
        posts: [
          {
            author: {..}
            comments: [
              {
                author: {..}
              }
            ]
          }
        ]
      }
    ]
  }
}

this gets in to a problem of potentially having redundant data in the resulting json, so caching would be super important here (which I think is implemented by default? or does a cache key need to be specified in the controller? if not, should there be a default? - or would this kind of cacheing be separate, and more like fancy memoization?)

Note, I found that there was a previous attempt at solving this problem here: #952, but I think it's not clear in #952 how deep the nested relationships go / it could imply that the same recursive problem exists.

iMacTia commented 9 years ago

Hi @NullVoxPopuli, just to answer your last question (I'm the author of #952) the nesting can actually bring the recursive problem, but that's intended. It is actually a very similar feature to your side-loading, as you're actually asking to AMS to side-load all associations without explicitly declaring them. That's allow you to keep your code easier to maintain (if you add a new relationship later, that will be taken automatically). The only situation a recursive loop could happen is when you have 2 serialisers like this:

TeamSerializer < #...
  has_many :players, include_nested_associations: true
end

PlayerSerializer < #...
  has_many :teams, include_nested_associations: true
end

This can actually fire the recursive problem, actually all many-to-many associations share this risk. But again, that's intended, because it depends on how you use the include_nested_associations option. Moreover, remember that you can always define multiple serialisers for the same model to fit your needs. Now, addressing your problem, that how I would solve it using my feature:

BlogSerializer < #...
  has_many :posts, include: include_nested_associations: true
end

PostSerializer < #...
  has_many :comments, include_nested_associations: true
  belongs_to :author, include_nested_associations: true
  # In this case, I want to have more info about the author because I want to display them after the blog post
end

AuthorSerializer < #...
  has_many :blogs, include_nested_associations: true
  has_one :bio
end

CommentSerializer < #...
  belongs_to :author
  # In this case, instead, I don't want anything else apart from basic informations (username, email, etc...) so no need for nested_associations
end

As I said before, you can always make even better defining different serialisers based on your needs. So in this case, could be a good idea to have a BlogAuthorSerializer and a CommentAuthorSerializer :)

NullVoxPopuli commented 9 years ago

thanks for explaining your your technique.

But yeah, I'm trying out this solution to the recursive problem: #1114 So, lets say you have these serializers:

CommentSerializer < #...
  belongs_to :author
end

PostSerializer < #...
  has_many :comments, include: :author
end

BlogSerializer < #...
  has_many :posts, include: [:comments]
end

when rendering a blog, you'd get posts and comments for each post, but not the author for the comments.

I think this is handy, in that you have a more explicit control of how deep the recursion goes

iMacTia commented 9 years ago

Yes, that definitely makes sense, and if you want the author too you just need to change the BlogSerializer like this:

BlogSerializer < #...
  has_many :posts, include: [comments: :author]
end

Recursive loop problem should be fixed as well, the problem is that if you want a standard behaviour (like you always want the comment author when you get a comment), you have to copy-and-paste the same "include" option in all serialisers, where with the include_nested_associations you delegate this to the serializer itself and you just need to manage special cases. Different solutions to the same problem, at the end. Having both of them allows you to choose the one that better fits your needs

NullVoxPopuli commented 9 years ago

Yeah, I don't think either should conflict with each other (functionality-wise anyway, I'm sure there are code conflicts, as I refactored a bit too)

NullVoxPopuli commented 9 years ago

@iMacTia so, lets say someone wants to always include the author on their comments, how do you feel about the following?

CommentSerializer < #....
  belongs_to :author, always_include: true
end

to notate that no matter who is rendering a comment, they also get the comment's author.

This makes for some interesting notation nuances for other seralizers:

BlogSerializer < #...
  has_many :posts, include: [comments: :author]
end

Here, the specification of author on comments is redundant and doesn't matter.

BlogSerializer < #...
  has_many :posts, include: [:comments]
end

I think this is what would be the most ideal for this scenario, even though it's not explicit as to what's being included in the resulting json.

iMacTia commented 9 years ago

@NullVoxPopuli, you actually raised another interesting point. I agree on this as it could actually address situations that both our solutions can't address without a copy-and-paste. I think it could be a nice addition to the set of options for associations. Maybe not needed in the first place, but if we want to have even greater flexibility, we can also later introduce an exclude option to override the always_include option from outside :) But again, that comes later, let's do one thing at a time. My PR for #952 is now obsolete and would need to be redesigned on top of your PR and the 0.10 version (I think it was still 0.9 when I started). I would suggest we go by step and we start improving the gem with your PR, that will introduce the include option. After that, we can start thinking together about all other options:

@joaomdmoura, do you like the schedule? Would really like to hear your thoughts on this guys :)

jfelchner commented 9 years ago

@iMacTia @NullVoxPopuli glad I found this. :grinning: I was just about to open an issue. This is exactly what we need out of AMS right now. I read through the issues. I think that include_nested_associations could be replaced by doing something like:

BlogSerializer < #...
  has_many :posts, include: :all_nested_associations
end

IMO this:

BlogSerializer < #...
  has_many :posts, always_include: true
end

Reads like "always include the posts in this output from the blog serializer". What that really means is "include all of the nested associations specified in this serializer". Also always_include sounds like it shouldn't be able to overridden, however the entire point of this is that the serializer's associations can be overridden.

Cons to this approach would mean that no one could have an association called all_nested_associations which I think is fairly safe.

@joaomdmoura we also need to think about how this would work for applying the JSON API spec which allows nested includes to be specified. See here.

joaomdmoura commented 9 years ago

Hey everyone, I've being following it along and have just read through the last comments. In really in favor of this implementation, really looking forward to get #1127 merged.

@iMacTia the schedule look great, indeed, let's take some baby steps, have the include will be awesome already and solve may of the existing needs. Then we can move forward with the other options.

Indeed @jfelchner, but the good news is that we already support neste associations following json-api conventions :grin:.

I think we all agree that this is aiming for the right direction, at least the first step on #1127. Let's get it merged then close this issue and open a new one to deal with always_include :smile:

beauby commented 9 years ago

One common pattern to avoid infinite loops in this case is to have a "light" serializer for non-primary resources (relevant for Json/FlattenJson, but not JsonApi). We could possibly make it so that primary resources are serialized according to ResourceSerializer, and by default, non-primary resources get serialized according to ResourcePreviewSerializer, if such a serializer is defined, and falls back to ResourceSerializer otherwise, with a warning/exception in case of infinite loop.

An other solution is to define the whole tree of included resources at the adapter level, JsonApi-style.

malroc commented 9 years ago

Including all nested associations was the default option in 0.8.x. All possible problems of this approach come from bad design and bad development practices. So why the interests of the bad developers should be respected more than the interests of the good ones? As @beauby mentioned, there is a good way to prevent unnecessary nesting in 0.8.x approach: create a separate light serializer for the association. In this case, every serializer is responsible only for its direct nodes (unlike include approach), which is better from the single-responsibility principle perspective. For example, we add a new association to a model B that is itself an association of a model A. Using include approach, we have to rewrite ASerializer in that case, which makes no sense (model A itself is unchanged). Explicit include ... is non-semantic. It's no different from using the same syntax in controllers (i.e. model.to_json(include: ...)) Adding even more options (exclude, always_include, etc) makes the syntax overcomplicated, comparing to the clean and simple approach of 0.8.x branch

malroc commented 9 years ago

Also, include approach breaks 'convention over configuration' principle, which stands behind the whole ideology of RoR (and old AMS btw).

malroc commented 9 years ago

Finally, include approach means that we have to handle atomic nodes and associations separately for an associated model: atomic nodes are handled by the serializer of the model itself, and associated nodes are handled by the serializer of the container model. Having in mind that the serializer of the associated model can have its own associations defined, we get a mess in the code base. Please take look at this structure (the simplest possible case):

class ASerializer
  attributes :d, :e, :f
  has_one :b, include: :c
end

class BSerializer
  attributes :g, :h, :i
  has_one :c
end

First of all, the code is duplicated (we include :c in both ASerializer and BSerializer). Second, for someone who is not familiar with the new syntax, it is unclear which piece of code is actually responsible for including c into a (i.e. has_one :c or include: :c). Last, if we include :c explicitly in ASerializer, why we don't do that for the atomic attributes of B (i.e. :g, :h, :i)? If we add an attribute j to class B, we need to rewrite BSerializer (which is quite logical). But if j is an association, not an attribute, then we have to rewrite both ASerializer and BSerializer to get the same result.

NullVoxPopuli commented 9 years ago

@malroc, I understand your concern, and that you may be upset with this change.

There has been quite a bit of discussion on this topic over the past months on various issues. The primary issue being solved is the infinite nesting that comes with always including every relationship.

To counter your argument for creating a separate 'light' serializer, for the association would result in something like this:

Say, we have the objects comments, blogs, authors, and posts, as in the tests for AMS.

If we wanted each association, but associations were deeply evaluated, that means we would need the following serializers for all scenarios - (which may be many people's case)

CommentsSerializer
BlogSerializer
AuthorsSerializer
PostsSerializer
# everybody would at least have the above, regardless of implementation.
# but to have optionally nested associations in the way you describe, we would need:
CommentsWithAuthorSerializer
BlogWithPostsSerializer
PostsWithAuthorSerializer
AuthorsWithCommentsSerializer
AuthorsWithPostsSerializers
AuthorsWithPostsAndCommentSerializer
# and then for nesting associations
BlogWithPostsWithAuthorsAndWithCommentsWithAuthorsSerializer
PostsWithComentsWithAuthorsSerializer

As you can see, this gets out of hand, pretty quickly.

Explicit include ... is non-semantic. It's no different from using the same syntax in controllers (i.e. model.to_json(include: ...)) Adding even more options (exclude, always_include, etc) makes the syntax overcomplicated, comparing to the clean and simple approach of 0.8.x branch

include is the only option on the association that does any sort of manipulating on an association. Additionally, the include options won't have to be on the serializers by the time official 0.10 is released. There will be the ability to set include from the controllers as well

render json: @objects, include: {...associations...}

this is consistent with the JSON API spec and adapter

Also, include approach breaks 'convention over configuration' principle, which stands behind the whole ideology of RoR (and old AMS btw).

have a look at this: http://jsonapi.org/format/#fetching-includes In general, we are following the json api conventions for our adapters - at least for relationships. The JSON API adapter will be the most adherant to specification, and follow all of the conventions set forth by the jsonapi team.

for someone who is not familiar with the new syntax, it is unclear which piece of code is actually responsible for including c into a (i.e. has_one :c or include: :c).

We will be sure to document as much as possible before the release of 0.10

Last, if we include :c explicitly in ASerializer, why we don't do that for the atomic attributes of B (i.e. :g, :h, :i)?

we do include these, see this comment: https://github.com/rails-api/active_model_serializers/pull/1127#issuecomment-138591729

malroc commented 9 years ago

If we wanted each association, but associations were deeply evaluated, that means we would need the following serializers for all scenarios - (which may be many people's case)

@NullVoxPopuli yes, you are right on that, and actually that's the only issue with the old (0.8.x) approach (please compare to the number of issues of the new approach I listed above). But think about it from another perspective: if we need several representations of the same object for different cases, isn't it logical to create several serializers? After all, that's what serializers are for. If we need to include an attribute in one case and don't need it in another, we do exactly the same: create 2 different serializers (in some cases that can be avoided, but in general that is the rule). That's the price we pay for keeping our code readable and not making it a mess. If include syntax were good enough, why would we need serializers at all if we can use the same syntax directly in our controllers?

NullVoxPopuli commented 9 years ago

if we need several representations of the same object for different cases, isn't it logical to create several serializers?

Yes, but doing that for associations isn't the way to go. I'm working on some tests right now to demonstrate using flat serializers, and specifying nesting in the controller.

If include syntax were good enough, why would we need serializers at all if we can use the same syntax directly in our controllers?

valid point -- the reason it's in the seralizers now, is because it was easier - a stepping stone to get to fuller functionality. This feature isn't final, and I plan to implement sideloading before RC4

NullVoxPopuli commented 9 years ago

@malroc here is the relevant commit that should address your issues: https://github.com/NullVoxPopuli/active_model_serializers/commit/5806cb79b9e8d263a58ca96df1b13c9ec8c76883

by default, nested associations are not rendered. So, specifying them in the controller would also allow for associations to be specified upon request from your frontend.

NullVoxPopuli commented 9 years ago

@malroc full diff here: https://github.com/rails-api/active_model_serializers/pull/1127/files

malroc commented 9 years ago

Yes, but doing that for associations isn't the way to go. I'm working on some tests right now to demonstrate using flat serializers, and specifying nesting in the controller.

So we step back to specifying nesting in the controllers? Again, why do we need serializers at all in that case? I can specify all attributes and nesting in the controllers right now, and I don't need serializers for that.

The first line of the AMS documentation states that:

ActiveModel::Serializer brings convention over configuration to your JSON generation.

And the new approach for handling nested associations breaks that.

NullVoxPopuli commented 9 years ago

I believe if we take convention over configuration in the way you want it, we just wouldn't have any serializers, and it would all be generated... -.-

NullVoxPopuli commented 9 years ago

(you can continue to use 0.8, btw - you don't have to upgrade) :-)

malroc commented 9 years ago

The support for 0.8.x is already dropped, so I can't continue using it in the long run.

malroc commented 9 years ago

I believe if we take convention over configuration in the way you want it, we just wouldn't have any serializers, and it would all be generated... -.-

It worked perfectly for 0.8.x, and I see no reason why it can't work now.

beauby commented 9 years ago

I have a design in my head that could be pretty cool for nested JSON stuff: building on 0.8.x, and the idea I mentioned above, we could make it so that when a nested related resource gets serialized, let's say Post > Comment > Author, we would first look for PostCommentAuthorSerializer, then CommentAuthorSerializer, then RelatedAuthorSerializer (which would be some sort of catch-all for non-primary Author resources), and finally AuthorSerializer. Or even better: first PostSerializer::CommentSerializer::AuthorSerializer, then CommentSerializer::AuthorSerializer, then AuthorSerializer, that way people could easily define their serializers in a nested way, and keep all the stuff in the same place. Example:

class PostSerializer < ActiveModel::Serializer
  attributes :id, :title
  has_many :comments

  class CommentSerializer < ActiveModel::Serializer
    attributes :id, :content
    belongs_to :author

    class AuthorSerializer < ActiveModel::Serializer
      attributes :id, :name
    end
  end
end

Thoughts everybody?

beauby commented 9 years ago

To be honest, I see use in both solutions:

malroc commented 9 years ago

@beauby +1 for your solution.

NullVoxPopuli commented 9 years ago

@beauby that is a very interesting idea. Only downside I can think is the one that @malroc mentioned in that you'd have to update attributes in many places if things change.

beauby commented 9 years ago

@NullVoxPopuli Not really, because if your Authors are always serialized the same way, you don't even need to define a nested serializer. And if you did define one, then it was because you had custom needs over the attributes/relationships you wanted to be included. I'll make a PR for the solution sketched above, we'll see whether it gets traction.

NullVoxPopuli commented 9 years ago

@beauby cool

NullVoxPopuli commented 9 years ago

for clarification:

class PostSerializer < ActiveModel::Serializer
  attributes :id, :title
  has_many :comments

  # overrides the default / top level CommentSerializer
  class CommentSerializer < ActiveModel::Serializer
    attributes :id, :content
    belongs_to :author

    # by default - no nested relationships are rendered
    # unless a serializer is specified here?
    # this could also be class AuthorSerializer < ::AuthorSerializer, 
    # if we wanted it to be the same attributes, yeah?
    class AuthorSerializer < ActiveModel::Serializer
      attributes :id, :name
    end
  end
end
beauby commented 9 years ago

Some news:

beauby commented 9 years ago

Let's continue this discussion in #1190.

joaomdmoura commented 9 years ago

Nice, moving then, I'm closing this one then in favor of #1190 :smile: