json-api / json-api

A specification for building JSON APIs
https://jsonapi.org
Creative Commons Zero v1.0 Universal

Resource names vs relationship names #139

Closed jonah-williams closed 10 years ago

jonah-williams commented 10 years ago

I would like to clarify how a JSON API is expected to return related resources, particularly in a compound document.

I think this was alluded to in #23 as a "name-aliased relationship" but may have been read as applying to server-side business object types.

In a compound document, related resources are included under top-level keys that match the name of their relationship rather than their resource type. I think this poses problems for server implementations attempting to avoid sending duplicate data and for client implementations attempting to minimize the cost of parsing responses.

Let's consider the following response (adapted from the compound document example on http://jsonapi.org/format/#url-based-json-api) where multiple relationships exist to the same resource type and references to the same resource appear under multiple relationships.

{
  "links": {
    "posts.author": {
      "href": "http://example.com/people/{posts.author}",
      "type": "people"
    },
    "posts.subject": {
      "href": "http://example.com/people/{posts.subject}",
      "type": "people"
    }
  },
  "posts": [
    {
      "id": "1",
      "title": "Steve seems to have opinions about computers",
      "links": {
        "author": "127",
        "subject": "4"
      }
    },
    {
      "id": "2",
      "title": "Jonah sounds confused",
      "links": {
        "author": "4",
        "subject": "127"
      }
    }
  ],
  "author": [
    {
      "id": "127",
      "name": "@zalambar"
    },
    {
      "id": "4",
      "name": "@steveklabnik"
    }
  ],
  "subject": [
    {
      "id": "4",
      "name": "@steveklabnik"
    },
    {
      "id": "127",
      "name": "@zalambar"
    }
  ]
}

This is not great, but it seems to me to be what the spec currently calls for. On the server side, we're paying the cost of serializing the same object multiple times. On the client side, we're downloading an unnecessarily large payload and parsing more objects than necessary (and building relationships between them, which can be expensive). Additionally, it is not trivial for the client to maintain a list of all known "people" resources. The client may have already received some of these subject or author resources as people, but it is not easy to identify these duplicates.
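To make the client-side cost concrete, here is a minimal sketch of the regrouping a client would have to do with a per-relationship response. It assumes the top-level key for each collection equals the last segment of the corresponding `links` key (as in the example above); `groupByType` is a hypothetical helper, not part of any JSON API library.

```javascript
// Trimmed copy of the response above: two relationships, both pointing at "people".
const doc = {
  links: {
    "posts.author": { href: "http://example.com/people/{posts.author}", type: "people" },
    "posts.subject": { href: "http://example.com/people/{posts.subject}", type: "people" }
  },
  author: [{ id: "127", name: "@zalambar" }, { id: "4", name: "@steveklabnik" }],
  subject: [{ id: "4", name: "@steveklabnik" }, { id: "127", name: "@zalambar" }]
};

// Hypothetical helper: merge per-relationship collections into one Map per resource type.
function groupByType(doc) {
  const byType = {};
  for (const [path, link] of Object.entries(doc.links)) {
    const relationship = path.split(".").pop(); // "posts.author" -> "author"
    if (!byType[link.type]) byType[link.type] = new Map();
    for (const resource of doc[relationship] || []) {
      // A repeated id overwrites the earlier copy, which removes the duplicate.
      byType[link.type].set(resource.id, resource);
    }
  }
  return byType;
}

const people = groupByType(doc).people;
console.log(people.size); // 2
```

Four serialized objects collapse into two people, which is the work the server and client both pay for under the per-relationship layout.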

I think it would be preferable for compound documents to include one collection per resource type rather than one per relationship:

{
  "links": {
    "posts.author": {
      "href": "http://example.com/people/{posts.author}",
      "type": "people"
    },
    "posts.subject": {
      "href": "http://example.com/people/{posts.subject}",
      "type": "people"
    }
  },
  "posts": [
    {
      "id": "1",
      "title": "Steve seems to have opinions about computers",
      "links": {
        "author": "127",
        "subject": "4"
      }
    },
    {
      "id": "2",
      "title": "Jonah sounds confused",
      "links": {
        "author": "4",
        "subject": "127"
      }
    }
  ],
  "people": [
    {
      "id": "127",
      "name": "@zalambar"
    },
    {
      "id": "4",
      "name": "@steveklabnik"
    }
  ]
}
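With collections grouped by type, resolving a relationship could be a plain lookup driven by the top-level `links` entry. The following is only a sketch under that assumption; `resolve` is a hypothetical helper, and the `href` templates and post titles are omitted for brevity.

```javascript
// Trimmed copy of the per-type response above.
const doc = {
  links: {
    "posts.author": { type: "people" },
    "posts.subject": { type: "people" }
  },
  posts: [
    { id: "1", links: { author: "127", subject: "4" } },
    { id: "2", links: { author: "4", subject: "127" } }
  ],
  people: [{ id: "127", name: "@zalambar" }, { id: "4", name: "@steveklabnik" }]
};

// Hypothetical helper: find the resource a relationship points at.
function resolve(doc, collection, resource, relationship) {
  const link = doc.links[`${collection}.${relationship}`];
  if (!link) return undefined; // the document does not describe this relationship
  const id = resource.links[relationship];
  return (doc[link.type] || []).find((candidate) => candidate.id === id);
}

console.log(resolve(doc, "posts", doc.posts[0], "author").name); // "@zalambar"
// Both relationships resolve to the same object, so nothing is duplicated.
console.log(
  resolve(doc, "posts", doc.posts[1], "author") ===
    resolve(doc, "posts", doc.posts[0], "subject")
); // true
```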

Does this seem reasonable? Am I misunderstanding part of the current spec?

jonah-williams commented 10 years ago

If it is not always desirable to add a top-level links entry, I've found it useful to replace bare id values with objects of the form:

{
  "id": 1,
  "type": "people"
}

However, that easily leads to polymorphic relationships, so I imagine it would not be desirable here.
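As a rough illustration of the typed-identifier form, the sketch below resolves a reference using only the `type` and `id` it carries. The document shape and the `resolveRef` helper are assumptions made for this example.

```javascript
// Each reference names its own type, so no top-level links entry is consulted.
const doc = {
  posts: [
    { id: 1, title: "Jonah sounds confused", links: { author: { id: 4, type: "people" } } }
  ],
  people: [{ id: 4, name: "@steveklabnik" }]
};

// Hypothetical helper: look the reference up in the collection named by its type.
function resolveRef(doc, ref) {
  return (doc[ref.type] || []).find((resource) => resource.id === ref.id);
}

console.log(resolveRef(doc, doc.posts[0].links.author).name); // "@steveklabnik"
```

Because nothing stops two references under the same relationship from naming different types, this form admits polymorphic relationships without any further change.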

jonah-williams commented 10 years ago

When a JSON Schema is available to document the API, it also serves as a great way to define these relationship-to-resource mappings, using the sub-schema references available in v4. The approach above also eliminates the need to list every relationship as a top-level key in the schema.

gr0uch commented 10 years ago

@jonah-carbonfive very good point. I had assumed that this was a typo or mistake in the spec :confused:

jonah-williams commented 10 years ago

One more case for your consideration: what should a compound document look like if it includes multiple resources with relationships of the same name that reference different resource types?

In the above example, if posts.subject is of type people, what happens when I want to add a comment resource that includes comment.subject of type post? That seems either impossible or else requires a polymorphic subject collection.
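For comparison, here is a sketch of how the same case might look under the per-type grouping proposed earlier in this thread. The `comments` collection, the `"comments.subject"` links key, and the `typeOf` helper are all hypothetical; the point is only that qualified links keys keep the two `subject` relationships apart.

```javascript
// Two relationships share the name "subject" but target different types.
const doc = {
  links: {
    "posts.subject": { type: "people" },
    "comments.subject": { type: "posts" }
  },
  posts: [{ id: "1", title: "Steve seems to have opinions about computers", links: { subject: "4" } }],
  comments: [{ id: "9", links: { subject: "1" } }],
  people: [{ id: "4", name: "@steveklabnik" }]
};

// Hypothetical helper: which type does collection.relationship point at?
function typeOf(doc, collection, relationship) {
  const link = doc.links[`${collection}.${relationship}`];
  return link ? link.type : undefined;
}

console.log(typeOf(doc, "posts", "subject")); // "people"
console.log(typeOf(doc, "comments", "subject")); // "posts"
```

With collections keyed by relationship name instead, both sets of resources would have to share a single `subject` key, which is the polymorphic collection described above.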

samwgoldman commented 10 years ago

@jonah-carbonfive, regarding your second comment, where two documents refer to associated documents of the same "type" by different aliases, it might be better to rely on some information encoded into the clients.

Imagine the client has received this resource from the server:

{
  "people": [
    {
      "id": 1,
      "name": "Foo",
      "links": {
        "friends": [2],
        "enemies": [3]
      }
    },
    {
      "id": 2,
      "name": "Bar"
    },
    {
      "id": 3,
      "name": "Baz"
    }
  ]
}

Person#friends and Person#enemies should be of type Collection[Person], but the associations have context-specific names.

Instead of encoding this knowledge in the representation of this resource, I think it's fair to assume the client will know where to look. Maybe the client has a model definition like this:

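// Illustrative only: JSONAPI.Model, attr, and hasMany stand in for
// whatever the client's model library provides.
const Person = JSONAPI.Model.extend({
  "name": attr(),
  "friends": hasMany("people"),
  "enemies": hasMany("people")
});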

If we want to admit polymorphic has-many associations (I'd support it), we would then have to encode the type in the link arrays somehow, but that's a separate issue.

Thoughts?
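A minimal sketch of what "the client will know where to look" could mean in practice, using a plain lookup table in place of a model library. `relationshipTypes` and `related` are hypothetical names introduced for this example.

```javascript
// Client-side knowledge: which type each named relationship points at.
const relationshipTypes = {
  people: { friends: "people", enemies: "people" }
};

// The resource from the example above.
const doc = {
  people: [
    { id: 1, name: "Foo", links: { friends: [2], enemies: [3] } },
    { id: 2, name: "Bar" },
    { id: 3, name: "Baz" }
  ]
};

// Hypothetical helper: resolve a to-many relationship via the client's own mapping.
function related(doc, type, resource, relationship) {
  const targetType = relationshipTypes[type][relationship];
  const ids = (resource.links && resource.links[relationship]) || [];
  return ids.map((id) => doc[targetType].find((candidate) => candidate.id === id));
}

console.log(related(doc, "people", doc.people[0], "friends").map((p) => p.name)); // [ 'Bar' ]
console.log(related(doc, "people", doc.people[0], "enemies").map((p) => p.name)); // [ 'Baz' ]
```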

jonah-williams commented 10 years ago

@samwgoldman I've been assuming that one of the design goals for JSON API was to minimize such assumptions and out-of-band communication between client and server (largely based on seeing https://github.com/json-api/json-api/issues/28#issuecomment-17654890 link to Roy Fielding's post). I think we agree in principle, and the example you provided looks good to me. I'd just like to be able to state how the client should learn about such associations. That information doesn't necessarily have to be included or parsed in every response, but it would be nice to have it available somewhere, such as in a schema linked in a content-type header (see http://json-schema.org/latest/json-schema-core.html#anchor33).

I also think polymorphic collections are worth considering, but I'm trying to leave them out of this issue. I see they were rejected in #23, and I don't want that to sidetrack discussion of how to handle other references. Let's see how this resolves before considering re-opening that debate?

Judging from #138, maybe this is just an error in the example? What do we need to do to determine whether that is the case, and what are the implications of applying such a change?

samwgoldman commented 10 years ago

@jonah-carbonfive I don't think it's useful for a client to "learn" about associations like this. Imagine a client that had to learn that a person's friends were people. How would you write useful software? How would you write an algorithm that, say, found the most "evil" person—the person who was an enemy for the most other people?
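The "most evil person" question can be sketched as follows, assuming the client already knows that `enemies` ids refer to other people. The `mostEvil` helper is hypothetical, and the extra `enemies` link on Bar is made-up data added so the count is not a tie.

```javascript
const people = [
  { id: 1, name: "Foo", links: { friends: [2], enemies: [3] } },
  { id: 2, name: "Bar", links: { enemies: [3] } }, // hypothetical extra link
  { id: 3, name: "Baz" }
];

// Hypothetical helper: return the person listed as an enemy by the most other people.
function mostEvil(people) {
  const timesListed = new Map();
  for (const person of people) {
    const enemies = (person.links && person.links.enemies) || [];
    for (const id of enemies) {
      timesListed.set(id, (timesListed.get(id) || 0) + 1);
    }
  }
  let winnerId;
  let winnerCount = 0;
  for (const [id, count] of timesListed) {
    if (count > winnerCount) {
      winnerId = id;
      winnerCount = count;
    }
  }
  // Returns undefined when nobody lists any enemies.
  return people.find((person) => person.id === winnerId);
}

console.log(mostEvil(people).name); // "Baz"
```

The function never inspects a type field; it works only because the knowledge that enemies are people is built into the client code.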

I'm definitely an advocate for REST and Fielding's thesis, but if we ever want to have a finalized JSON API spec, I think we need to keep things practical. In this case, it's practical to encode the association type in the client, because the client needs to do useful work.

I agree that this should not be a discussion about polymorphic collections.

I think #138 is an error in the example. I'm not sure whether any committers are still paying attention, or whose buy-in we'd need to get that PR merged.

jonah-williams commented 10 years ago

@samwgoldman in my current client implementation, this sort of behavior (including types in responses) has been useful. The client certainly encodes association types in models and controllers, but the ability to "discover" them allowed us to write a parser that does not need to be aware of those mappings. That separation from the client's domain model is useful but not critical. I'm hoping we will eventually initialize this client app and its parser with a JSON Schema in order to get some automatic validation checks and eliminate some of these encoded relationships. We're always going to have client code accessing specific properties or relationships on models obtained from an API, though, so I'm not trying to imagine a world in which a client can be written in ignorance of these API resources.

Regarding #138:

The ID-style example (https://github.com/json-api/json-api/blame/gh-pages/format/index.md#L173-L176) behaves as I would expect, but I'm not sure how old the URL-style version (https://github.com/json-api/json-api/blame/gh-pages/format/index.md#L434-L456) is. Let's see what the impact on existing implementations would be.

gr0uch commented 10 years ago

@jonah-carbonfive I haven't implemented compound documents in fortune yet, but nowhere does it name a resource after a relationship. Following a related link like /post/:id/author will still return a top-level people key. Perhaps this deviates from the current spec; I'm not sure yet.

jonah-williams commented 10 years ago

@daliwali thanks for the clarification. I'm hoping that the behavior you have implemented matches the intent of the spec if not the letter of the example. We'll see.

steveklabnik commented 10 years ago

Hey, just letting you know that I dropped by to read this, and a few comments:

  1. It is undeniably true that we won't have the smallest possible representation. This is to keep information in-band, as was alluded to.
  2. Clients can learn things about servers and services. Your web browser knows nothing of Google other than the HTTP, CSS, and JavaScript (etc) specs. Yet it can successfully interact with Google and its API. For JSON-API, imagine an "api browser" app that just knows JSON-API, and nothing of your API, yet can still give you a useful overview of what the service is and how it does it.
  3. That said, "practicality" is 100% a goal. But we value shipping code over even theoretical practice-ness. ;)

Therefore, I believe that

"I think this poses problems for server implementations attempting to avoid sending duplicate data and for client implementations attempting to minimize the cost of parsing responses."

basically articulates the trade-off: you might get some extra data, you might not have the absolute minimum parse. This is made up for elsewhere.

I would really like to hear from @wycats though, as he knows the ember side much more than I do.

Once I'm off my OSS hiatus, I plan on sorting all these kinds of things out fully and writing them up, so the discussion here is valuable regardless, thank you.

dgeb commented 10 years ago

I just talked with @wycats, who confirmed that the original example was in error. That example has been superseded by a new example in PR #143, which introduces the new top-level linked object and correctly groups documents by type and not relationship.

As for discoverability, the links section can clarify the type of each document by relationship. This is covered in the same updated example.

BTW, this is consistent with the expectations of both Ember Data and Active Model Serializers.

jonah-williams commented 10 years ago

Excellent, thanks @dgeb and @wycats. That will let me move forward with an implementation.