apiaryio / mson

Markdown Syntax for Object Notation
MIT License

Null primitive type #26

Closed netmilk closed 8 years ago

netmilk commented 9 years ago

In this Dredd issue, @neonichu has a JSON body key with a null value. There is currently no way to express this in MSON. Can you please add support for a null primitive type in MSON?

zdne commented 9 years ago

@netmilk @neonichu First quick thought: MSON tries to be somewhat abstract to serialization formats. It is not directly JSON in Markdown. It should be possible to represent an MSON document in JSON / XML / YAML / I do not know what. Sure some information may be lost given the fidelity of selected representation (serialization) format. However I am not sure whether introducing null as a base primitive type is the best portable solution.

Please help me understand the need for null here: What is the use case? And if the use case is in JSON – how would the similar use case work in XML?

Thanks!

zdne commented 9 years ago

MSON is – in fact – a strongly typed language. Every value has its type (with string and object being the default). The only situation when a type is undefined is when using generics. I would like to retain this, but that means saying no to a null type. But I am open to discussion.

Another thing to consider is introducing undefined as a value (NOT a type), for the instances where you do want to say the value is not defined (but still of a given type). For example, consider an empty versus an undefined string.

zdne commented 9 years ago

Action point: Remove null as a value of JSONs from MSON documentation.

honzajavorek commented 9 years ago

Let's think about it in terms of use cases.

Real-world data often contains some form of "no value" and the concept is present in virtually every language (like Python/Ruby/... you name it), but MSON isn't a programming language. It's not even a data carrier, such as JSON, YAML or XML. I see MSON as a schema notation.

So we want to be able to describe schema of real-world data structures. What does null actually mean in people's data structures?

If we want to describe properties which are not set, then in JSON output we probably shouldn't generate these properties at all. What if the intention is to have an empty string as a sample value? And what if the property is always present, but sometimes it can be "unset" and in someone's API this is implemented as {"property": null}?

I think we should discuss all these use cases, sort them out and find a way to describe all of them in MSON.

zdne commented 9 years ago

contains some form of "no value"

I am OK with introducing some form of "no value"; however, this issue, to me, is about "no type", or a "null" type to be precise. And as I have stated above, MSON is a typed notation and everything has a value of a certain known type unless it is generic. Therefore null as a type does not make much sense.

Thoughts?

honzajavorek commented 9 years ago

Good point about confusion of "no value" and "no type"!

I think "no type" makes no sense and I don't see any use case for that. I believe the original issue @netmilk referenced is just about being able to describe a structure which contains null, but which, at some point, could gain a value of a valid type. E.g. {"color": null} - if "no value" were present in MSON, such a case could be described as a color property of type string which has "no value". Currently, though, there is no way to describe it. That, I believe, is what the whole issue is about.

I can imagine these situations:

Does it make sense to you?

zdne commented 9 years ago

remember

no type, some value: invalid MSON

is string (unless nested members then it is object)

neonichu commented 9 years ago

Just chipping in really quick as the one who originally created the issue. Removing null as a valid value from MSON would be a huge deal breaker for us, as we are dealing with modelling an existing API here.

honzajavorek commented 9 years ago

@zdne Yeah, I just realized that I forgot about implicit types and I've got stuck thinking about all the implications :)

zdne commented 9 years ago

@neonichu the null as a value was never really there. It was just incorrectly stated in the README. Are you saying that you need to specify cases where a value of an object of a certain type (e.g. string) is null in the JSON representation?

zdne commented 9 years ago

@neonichu

Removing null as a valid value

Would the property in question be of a null (undefined) type or will it just have null (undefined) value?

neonichu commented 9 years ago

@zdne Just a null value

zdne commented 9 years ago

seems this is a very interesting topic for many: http://stackoverflow.com/questions/21120999/representing-null-in-json

zdne commented 9 years ago

Given an object with myCount property of a number type what to render if value is absent?

{}

vs.

{
    "myCount": null
}

vs.

{
    "myCount": 0
}

Good luck getting this right :)
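
The three candidate renderings can be put side by side in a small sketch. The `render` helper and its `style` names are hypothetical and only illustrate the choice a serializer has to make; nothing here is defined by MSON.

```python
import json


def render(my_count, style):
    """Render an object whose `myCount` number may be absent (None)."""
    if my_count is not None:
        return json.dumps({"myCount": my_count})
    if style == "omit":      # leave the key out entirely
        return json.dumps({})
    if style == "null":      # keep the key, emit JSON null
        return json.dumps({"myCount": None})
    if style == "default":   # keep the key, fall back to a zero value
        return json.dumps({"myCount": 0})
    raise ValueError(f"unknown style: {style}")


for style in ("omit", "null", "default"):
    print(style, render(None, style))
```

All three outputs are valid JSON, which is exactly why the data description alone cannot pick one.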

zdne commented 9 years ago

Interestingly enough this feels like it makes sense:


Edit: Albeit I would say Objects with no value should be stated thus:

honzajavorek commented 9 years ago

Good point about empty objects and arrays:

This is also something which is implemented in a different way every time and we should take it in account when thinking this through.

robbinjanssen commented 9 years ago

Adding some more information here, we're currently building an API that follows the JSON API. The JSON API format has some rules about this as well; see http://jsonapi.org/format/#document-resource-object-linkage

For example, in our documentation I've defined

+ related_object: null (object, required) - The details of the related object, is null if no relation is defined.

The API returns this if there is a relation:

{
  "related_object": {
    "type": "related_objects",
    "id": 12345
  }
}

And if there's no relation:

{
  "related_object": null
}

zdne commented 9 years ago

+ related_object: null (object, required) - The details of the related object, is null if no relation is defined.

My issue with this is that it is way too JSON-specific. How would this work with XML / YAML ?

This seems to boil down, for me, to one thing – a format-specific section explaining some serialization nuances such as "if some property is not set, its key's value is null" vs. "if some property is not set, its key (and thus the whole property) is not present".

hobofan commented 9 years ago

@zdne I don't think that it is too JSON-specific:

XML:

<book>
    <title></title> // Bad since it is similar to a title of ""
</book>

<book>
    <title xsi:nil="true"/> // The official way
</book>

YAML:

# both represent an empty value
key1: null
key2: ~

I don't see a viable way to represent "there should/could be a value under this key, but there is none" in MSON without a null type and I don't think that optional keys accurately represent it either.

I don't think that the statement "Markdown Syntax for Object Notation (MSON), a Markdown syntax compatible with describing JSON" (as seen on top of the README) is true without a null type.
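
To make the XML half of this concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `title_of` helper is made up for illustration; it only shows that an empty element and an `xsi:nil` element can be told apart, much like "" and null in JSON.

```python
import xml.etree.ElementTree as ET

# Namespace that defines the xsi:nil attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def title_of(xml_text):
    """Return the book title, or None when the element is marked xsi:nil."""
    title = ET.fromstring(xml_text).find("title")
    if title is None:
        raise KeyError("title element is missing")
    if title.get(f"{{{XSI}}}nil") == "true":
        return None           # explicit "no value"
    return title.text or ""   # an empty element reads as an empty string


empty = "<book><title></title></book>"
nil = f'<book xmlns:xsi="{XSI}"><title xsi:nil="true"/></book>'

print(repr(title_of(empty)))
print(repr(title_of(nil)))
```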

jayniz commented 9 years ago

+1 to what @robbinjanssen said.

Do I understand it correctly: optional attributes (as in null/nil/non values) in responses can't be expressed in MSON?

{
  "name": "Jimmy",
  "linked_account": null
}

I would have expected that I could just make email optional:


+ Response 200 (application/json)
  + Body
    + Attributes
      - name: Mila (string, required) - Given name
      - email: hello@max-and-mila.com (string, optional) - User's email

robbinjanssen commented 9 years ago

@jayniz Your current response implies that the response COULD contain a field email, and IF it contains email the value MUST BE of type string.

In my case, and probably yours as well, the server always returns email, but it CAN be of type string OR null (and is therefore 'optional').

jayniz commented 9 years ago

@robbinjanssen Exactly.

Just to make sure I understand it correctly: Even though the MSON looks like it would do that, when checked with dredd a response with an email: null will be a failure, and that is by design?

Edit:

The unexpected behaviour for me was, that this:

+ Response 200 (application/json)
  + Body
    + Attributes
      - name: Mila (string, required) - Given name
      - email: hello@max-and-mila.com (string, optional) - User's email

converts to this json schema:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "Given name"
    },
    "email": {
      "type": "string",
      "description": "and-mila.com (string, optional) - User's email"
    }
  },
  "required": [
    "name"
  ]
}

when I would have expected this (difference in the type of email):

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "Given name"
    },
    "email": {
      "type": ["string", "null"],
      "description": "and-mila.com (string, optional) - User's email"
    }
  },
  "required": [
    "name"
  ]
}

The latter runs fine in dredd with no complaints when the email isn't present. Not sure if mson is to blame here, though, or if dredd should just allow null values for fields that are not required?
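
The difference between the two schemas can be reproduced with a hand-rolled check. This is a sketch only: `matches_type` and `validate` are hypothetical helpers that understand just the "string" and "null" types and the `required` list, and they are neither a real JSON Schema validator nor what dredd runs internally.

```python
def matches_type(value, schema_type):
    """Check a JSON value against a JSON Schema `type` keyword.

    Only "string" and "null" are handled; this is not a full validator.
    """
    allowed = [schema_type] if isinstance(schema_type, str) else list(schema_type)
    checks = {
        "string": lambda v: isinstance(v, str),
        "null": lambda v: v is None,
    }
    return any(checks[name](value) for name in allowed)


def validate(doc, schema):
    """Return a list of error strings for a flat object schema."""
    errors = [f"missing required key: {key}"
              for key in schema.get("required", []) if key not in doc]
    for key, prop in schema["properties"].items():
        if key in doc and not matches_type(doc[key], prop["type"]):
            errors.append(f"wrong type for key: {key}")
    return errors


generated = {"properties": {"name": {"type": "string"},
                            "email": {"type": "string"}},
             "required": ["name"]}
expected = {"properties": {"name": {"type": "string"},
                           "email": {"type": ["string", "null"]}},
            "required": ["name"]}

response = {"name": "Mila", "email": None}
print(validate(response, generated))  # the null email is rejected
print(validate(response, expected))   # the null email is accepted
```

Note that a response omitting `email` altogether passes both schemas; only the explicit null separates them.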

TLDR:

- email: hello@max-and-mila.com (string, optional) - User's email

becomes

  "email": { "type": "string" }

and not

  "email": { "type": ["string", "null"] }

pksunkara commented 9 years ago

@jayniz Let's think about a situation where a user object has a description property which is optional when updating the user profile. You can see 3 cases below on how an ideal API server acts on the request body.

In this case, allowing null as a value for an optional property is completely wrong.

jayniz commented 9 years ago

@pksunkara I don't believe in such a thing as an ideal api server.

But still - I assume you're talking about a PUT on the user's properties? In that case, the whole properties object should get updated with whatever you send, so a null description will become a null description (or, if you wish, an empty string).

If, however, you want to model a special case where you PUT data to a resource, but really do something like a PATCH and just replace some properties of that resource, then yeah, the behaviour you described could be applicable. I'm not saying that either of the behaviours is ideal, or better.

Real APIs will have both behaviours, and so this should be expressible in MSON as it is in json schema, don't you think?

robbinjanssen commented 9 years ago

@pksunkara Correct, but you are talking about a POST/PUT request, where @jayniz is talking about a GET request. So there's a difference there.

I'll try to explain my case a little more. We're implementing JSON API spec, where JSON output is expected to be as:

{
  "type": "articles",
  "id": "1",
  "attributes": {
    "title": "Rails is Omakase"
  },
  "relationships": {
    "author": {
      "links": {
        "self": "/articles/1/relationships/author",
        "related": "/articles/1/author"
      },
      "data": { "type": "people", "id": "9" }
    }
  }
}

In API Blueprint you would define something like:

+ Response 200 (application/json)

    + Attributes
        + type: `articles` (string, required)
        + id: `1` (number, required)
        + attributes (object, required)
            + title: `Rails is Omakase` (string, required)
        + relationships (object, required)
            + author (object, required)
                + links (object, required)
                    + self: `/articles/1/relationships/author` (string, required)
                    + related: `/articles/1/author` (string, required)
                + data (object, required) - This can be `null`!
                    + type: `people` (string, optional)
                    + id: `9` (number, optional)

However, JSON API spec states that when the object has NO relationship, you need to define the key (author in the example above) with a data child-key with value null.

{
  "type": "articles",
  "id": "1",
  "attributes": {
    "title": "Rails is Omakase"
  },
  "relationships": {
    "author": {
      "links": {
        "self": "/articles/1/relationships/author",
        "related": "/articles/1/author"
      },
      "data": null
    }
  }
}

Again; API Blueprint:

+ Response 200 (application/json)

    + Attributes
        + type: `articles` (string, required)
        + id: `1` (number, required)
        + attributes (object, required)
            + title: `Rails is Omakase` (string, required)
        + relationships (object, required)
            + author (object, required)
                + links (object, required)
                    + self: `/articles/1/relationships/author` (string, required)
                    + related: `/articles/1/author` (string, required)
                + data (object, required) - Should be `null`!

But the data key is rendered as an empty object. data: {}

pksunkara commented 9 years ago

@robbinjanssen I understand your use case. But we cannot allow null by default. We need to be very careful here. Whatever we introduce, we need to make sure that it is very configurable.

zdne commented 9 years ago

Let's look at this simple object:

# Book (object)
- title (string, required) - required title of the book 
- subtitle (string, optional) - optional book subtitle

First, let's discuss the title property: given that I haven't come up with the title just yet, the following is valid (according to the MSON prescription):

{ "title": null }
{ "title": "" }
{ "title": "A" }

Because MSON simply says "the title property has to be present".

:white_check_mark:

The following is not valid JSON for the given MSON:

{}

Because the MSON criterion wasn't met (the property is missing).

:x:

Second, let's look at subtitle: all of the previous examples are valid, and so are the following:

{ "title": "A", "subtitle": null }
{ "title": "A", "subtitle": "" }
{ "title": "A", "subtitle": "value" }

:white_check_mark:
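
The reading above can be written down as a check. This is a sketch of the interpretation given in this comment (presence is what `required` demands, and null is tolerated as a value); `satisfies_book` is a hypothetical helper, not spec text.

```python
def satisfies_book(doc):
    """Check a JSON-like dict against the Book MSON as read in this comment.

    `title` must be present, `subtitle` may be present, and None (JSON
    null) is accepted as a value for either of them.
    """
    if "title" not in doc:
        return False  # required property is missing
    for key in ("title", "subtitle"):
        value = doc.get(key)
        if key in doc and value is not None and not isinstance(value, str):
            return False  # present, not null, and not a string
    return True


print(satisfies_book({"title": None}))
print(satisfies_book({}))
print(satisfies_book({"title": "A", "subtitle": None}))
```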


Now, given the MSON Book definition, there is not enough information on how to represent a situation when subtitle is not set in JSON form. Or C++ form, or JavaScript form.

Is the following correct to express that there is no subtitle set:

{ "title": "A", "subtitle": null }

what about this

{ "title": "A"}

or in C++

std::string subtitle;

or in Swift:

let subtitle = String()

or perhaps:

let subtitle: String?

My point is: we CANNOT tell. ALL of the above are correct from the MSON standpoint. But what should be generated from MSON, and what did the writer want? The problem is that too often the writer is tied to a particular representation. That is OK, but then this is an option for the particular serialization provider rather than for a generic, canonical description.

zdne commented 9 years ago

So to not leave it open. The solution is obviously to mark the preference of whether a property with no value is:

  1. present with a nil value equivalent, if possible
  2. not present in the hash at all

I see about four ways to achieve this:

1. Simplest, but will result in a new keyword for no values, null:

# Book (object)
- subtitle: null (string, optional)

2. null as a type, which I actually want to avoid as it loses the information on the real type

# Book (object)
- subtitle (null, optional)

3. nullable as a type modifier

# Book (object)
- subtitle (string, optional, nullable)

4. We will leave this as an option to representation producer so user could decide (e.g. "generate key without values")

Thoughts?
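
As a rough illustration of option 3, here is how a JSON Schema generator might treat the modifier. The `to_json_schema` helper and the `(name, type, flags)` triples are assumptions made for this sketch; `nullable` is only the keyword proposed above, not an existing MSON feature.

```python
def to_json_schema(properties):
    """Build a JSON Schema object from (name, type, flags) triples.

    `nullable` is the hypothetical modifier from option 3; it widens the
    JSON Schema type to [type, "null"] while keeping the real type visible.
    """
    schema = {"type": "object", "properties": {}, "required": []}
    for name, base_type, flags in properties:
        types = [base_type, "null"] if "nullable" in flags else base_type
        schema["properties"][name] = {"type": types}
        if "required" in flags:
            schema["required"].append(name)
    return schema


book = to_json_schema([
    ("title", "string", {"required"}),
    ("subtitle", "string", {"optional", "nullable"}),
])
print(book["properties"]["subtitle"])
```

Unlike option 2, the base type (`string`) survives in the output, which is the property this comment wants to keep.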

pksunkara commented 9 years ago

To record my opinion: I like nullable. It sounds awesome. :smile:

neonichu commented 9 years ago

I like option 1, simple and most obvious.

zdne commented 9 years ago

It should be noted that regardless of 1, 2 or 3 the following will be valid:

# Book (object)
- subtitle (string, required, nullable)

(using required with null)

pksunkara commented 9 years ago

Forgot to say this in my previous comment. Option 1 will give us problems, especially if a user wants to have a sample value inline but also wants to say that the property can be null. Again, I find nullable much easier to deal with.

zdne commented 9 years ago

Now this still does not solve the issue for:

# Book (object)
- authors (array, optional)

where you want to distinguish between

{ "authors": [] }

and

{ "authors": null }

Possibly nullable will be for the latter, but how to distinguish between a property that is not present at all and a property that is present with an empty value? That is

{} 

vs.

{ "authors": [] }

hobofan commented 9 years ago

I am not quite sure if this doesn't already imply something else, but what about:

# Book (object)
- subtitle (enum[string,null], required)

where null would be a type.

Apart from that I think 3 sounds good.

zdne commented 9 years ago

@hobofan what about the {} vs { "authors": [] } scenario? – https://github.com/apiaryio/mson/issues/26#issuecomment-132221842

pksunkara commented 9 years ago

Possibly nullable will be for the latter, but how to distinguish between a property that is not present at all and a property that is present with an empty value? That is

Golang's core JSON encoder had a similar problem until they introduced a tag called omitempty. Since we are thinking about a nullable tag, why not another? I don't know if this will be a slippery slope or not.
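
For readers who have not used that Go tag, its effect can be imitated in a few lines. This is a loose Python sketch, not Go's implementation: `dumps_omitempty` and its `omit_empty` parameter are invented for illustration, and "empty" is approximated with Python falsiness (None, "", 0, False, [] or {}).

```python
import json


def dumps_omitempty(obj, omit_empty=()):
    """Serialize a dict, dropping keys listed in `omit_empty` when their
    value is empty, loosely mirroring Go's `omitempty` struct tag."""
    kept = {key: value for key, value in obj.items()
            if not (key in omit_empty and not value)}
    return json.dumps(kept)


book = {"title": "A", "authors": []}

print(dumps_omitempty(book))                           # key kept, empty array
print(dumps_omitempty(book, omit_empty={"authors"}))   # key dropped
```

The same data yields either output, so the choice sits with the serializer flag rather than with the value itself.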

hobofan commented 9 years ago

@zdne : I think that this is another issue (although a similar one) that only applies to arrays. Arrays are the only type I think that can possibly be empty (if you allow strings to be empty you are opening another can of worms of "" vs " ").

Another point that speaks for a nullable type in general, beside the required/optional stuff is that it would allow arrays like {"array": [null, "string1", null]} (valid JSON) to be described.

jrep commented 9 years ago

@hobofan, aren't empty maps/hashes/objects/pick-your-namespace also possible? Consider a tagging system, and an endpoint intended to return "for all the tags applied to this page, the list of other pages bearing that tag." So a page marked with tag1 and tag2 might report

{
    "tag1": ["page1", "page4"],
    "tag2": []
}

(meaning this is the only page with tag2)

But another page might return

{}

(meaning "this page has no tags")

jrep commented 9 years ago

There's a philosophical question, perhaps: is MSON intended to express any legal JSON? Or to embody some "best practices" restrictions?

Personally, I'm in the position of post-hoc documenting a couple hundred existing APIs (or, rather, helping nearly as many developers post-hoc their existing APIs). There's a fair bit of "I wouldn't do it this way if I had it to do again, but this API's already in use, I can't change it casually." Does this backwards-compatible-with-the-suboptimal put me out of MSON's target?

robbinjanssen commented 9 years ago

@zdne @hobofan For me the nullable option would work best, I think with the enum type declaration you leave too much room for interpretation because you can add as many types as you want.

# Book (object)
- title (string, required)
- chapters (object, optional, nullable)

Valid JSON output:

{ "title": "Great book" }

{ 
  "title": "Great book",
  "chapters": null
}

{ 
  "title": "Great book",
  "chapters": {
    "foo": "bar",
    "bar": "foo"
  }
}

Sounds strange, but how about marking this JSON with the specification above as invalid. Meaning an object with a nullable definition cannot be an empty object ({}). It either requires an object with data or must be null?

{ 
  "title": "Great book",
  "chapters": {}
}

jrep commented 9 years ago

If nullable takes on the "no empties allowed" meaning, then it seems to me mandatory also to include @pksunkara 's omitempty. Elaborating @robbinjanssen 's spec a bit:

My combinatoric spider-sense is warning me that "three representations, each with two possible states" is eight possibilities, which simply can't be expressed in only two flags ...
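
A quick count, assuming each flag is an independent on/off switch: two flags give four combinations and three give eight. The flag names are the ones floated in this thread, not existing MSON keywords, and `flag_combinations` is a throwaway helper.

```python
from itertools import product


def flag_combinations(flags):
    """Return every on/off combination of the given flags as dicts."""
    return [dict(zip(flags, values))
            for values in product((False, True), repeat=len(flags))]


two = flag_combinations(("nullable", "omitempty"))
three = flag_combinations(("optional", "nullable", "omitempty"))

print(len(two), len(three))
```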

itsjamie commented 9 years ago

@jrep You don't have only two flags though, you have three, the combination of (nullable, omitempty) becomes another possible state. You've done the combinations above :+1:.

obihann commented 9 years ago

I'm going to cast my vote for the addition of nullable and omitempty, they would be a huge help. I may take the step and fork this to update the spec so then we have something to go forward on.

zdne commented 9 years ago

@jrep and in addition there is optional so is

? ;-)

I like your point here

Does this backwards-compatible-with-the-suboptimal put me out of MSON's target?

Personally, I would of course love to design new "clean" systems without the burdens of the past. However, it is clear that for MSON to be able to describe JSONs completely we will have to do something.

Note: My original hope for MSON was to turn it into a modeling language that can also describe some serialization formats (e.g. JSON). With that being said, I feel the combo optional - nullable - omitempty is inelegant.

That brings me back to the question – isn't that just a serialization option?

pksunkara commented 9 years ago

modeling language that can also describe some serialization formats (e.g. JSON)

That brings me back to the question – isn't that just a serialization option?

Well, if we want to turn MSON into a modelling language for JSON, shouldn't we add JSON specific options?

itsjamie commented 9 years ago

@zdne One aspect that we were taking advantage of with MSON (or now that we're here, it's more accurate to say... hoping to, I suppose) was the ability to define our data structure once, and get quite a few tools to help us validate our API.

By writing MSON, we got JSON Schema for free, which has the capability to define keys as two potential acceptable values.

I understand your point on the inelegance, though. More options explode the complexity very quickly, along with the chance for bugs in implementations. Perhaps it would be best for MSON to support 80% of the functionality for 20% of the effort. In tools where MSON is used, like Drafter, the ability to reference JSON Schema directly would then cover our use cases. This might be a case where the further abstraction should limit its surface area?

itsjamie commented 9 years ago

I didn't realize at the time that there was no way to represent a JSON null in MSON. That should definitely be implemented.

I'm on the fence about the other aspect that came up in this ticket, though.

Apologies for the confusion.

obihann commented 9 years ago

If we support the JSON type null we may not need the nullable flag, since we could do something like this:

+ (enum)
   + (null)
   + (string)

zdne commented 9 years ago

Well, if we want to turn MSON into a modelling language for JSON, shouldn't we add JSON specific options?

But we don't :) MSON should be agnostic to serialization media type – it is not a replacement for JSON, nor should it be.

zdne commented 9 years ago

@obihann thanks for the comment!

Per my previous comment – I am against null as a type because that hides the true type of the value. Furthermore, this still does not solve the situation when, in JSON serialization, you want to distinguish between an "empty" value and an "omitted" one:

{ "key": {} }

vs.

{}