CrossRef / rest-api-doc

Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/

missing objects instead of `null` #551

Closed maxheld83 closed 3 years ago

maxheld83 commented 3 years ago

I'm starting from the assumption that it's desirable for an API to show null (or empty arrays) for empty fields. For our use case, this would enable length stability in our API and production code, which we're currently lacking.

If I'm missing a central tenet of Crossref data, or just being boneheaded, stop me right here.

Such "length stability" also seems to be considered a best practice by some:

If a field has no value, it shall be null.

This one is quite trivial, but it has a few nuances. Non-existing numbers, strings, and booleans are usually represented as null. But string fields without value should also be represented as null, not "".

Note: Empty arrays shall not be null. If a field is some kind of list and it is represented by an array, an array shall be returned, even if empty. This makes front-end developers’ work a lot easier.

Example:

{
   "id": 16784,
   "name": "Lorem Ipsum",
   "age": null,
   "relatives": [], // no relatives
   "address": null
}

This appears not to be the case for the Crossref REST API.

For illustration:

http://api.crossref.org/works/10.1038/s41598-020-57429-5 includes:

    "issue": "1",
    "license": [
      {
        "URL": "https:\/\/creativecommons.org\/licenses\/by\/4.0",
        "start": {
          "date-parts": [
            [
              2020,
              1,
              31
            ]
          ],
          "date-time": "2020-01-31T00:00:00Z",
          "timestamp": 1580428800000
        },
        "delay-in-days": 0,
        "content-version": "tdm"
      },
      {
        "URL": "https:\/\/creativecommons.org\/licenses\/by\/4.0",
        "start": {
          "date-parts": [
            [
              2020,
              1,
              31
            ]
          ],
          "date-time": "2020-01-31T00:00:00Z",
          "timestamp": 1580428800000
        },
        "delay-in-days": 0,
        "content-version": "vor"
      }
    ],
    "funder": [
      {
        "DOI": "10.13039\/501100008566",
        "name": "Tomsk State University",
        "doi-asserted-by": "publisher",
        "award": [
          "Tomsk State University competitiveness improvement programme"
        ]
      }
    ],

http://api.crossref.org/works/10.1109/JLT.2019.2961931, in the same place, only includes:

    "issue": "13",
    "funder": [
      {
        "DOI": "10.13039\/501100001659",
        "name": "Deutsche Forschungsgemeinschaft",
        "doi-asserted-by": "publisher",
        "award": []
      },
      {
        "name": "Collaborative Research Center",
        "award": [
          "787"
        ]
      }
    ],

The latter is missing the license field. As per the above, there should be an empty array in its place, like so:

    "issue": "13",
    "license": [],
    "funder": [
      {
        "DOI": "10.13039\/501100001659",
        "name": "Deutsche Forschungsgemeinschaft",
        "doi-asserted-by": "publisher",
        "award": []
      },
      {
        "name": "Collaborative Research Center",
        "award": [
          "787"
        ]
      }
    ],

ppolischuk commented 3 years ago

Thanks for the write-up and for the background on why null or empty arrays would help in this case.

I talked to some of the team, and it seems that, as a general principle, we never use null and use an empty array for optional values. We did find some examples inconsistent with this principle, but that's the approach we're taking for now.

As mentioned in some other issues (like #549), we don't have any documentation as to why these decisions were originally made, and the folks who worked on the REST API at the time aren't around anymore. We may end up revisiting how we handle null vs empty arrays vs missing objects sometime down the line, but for now we are working on other aspects of the API, most importantly, the previously mentioned Elasticsearch migration.
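
Until then, clients that need length stability can normalize records on their own side. The following is a minimal sketch of that idea, not part of the Crossref API: `normalize_work`, `LIST_FIELDS`, and `SCALAR_FIELDS` are hypothetical names, and the field lists are an assumption based only on the two records quoted above, not an official schema.

```python
from copy import deepcopy

# Assumption: the optional fields this client cares about. This is taken from
# the two example records in this issue, not from an official Crossref schema.
LIST_FIELDS = ("license", "funder")   # missing -> []
SCALAR_FIELDS = ("issue",)            # missing -> None


def normalize_work(work: dict) -> dict:
    """Return a copy of a work record in which every expected key is present."""
    out = deepcopy(work)  # leave the caller's record untouched
    for key in LIST_FIELDS:
        out.setdefault(key, [])
    for key in SCALAR_FIELDS:
        out.setdefault(key, None)
    # Funder entries are uneven too: the second funder below has no DOI.
    for funder in out["funder"]:
        funder.setdefault("DOI", None)
        funder.setdefault("award", [])
    return out


# Excerpt of the second record quoted in this issue (no "license" key).
second = {
    "issue": "13",
    "funder": [
        {
            "DOI": "10.13039/501100001659",
            "name": "Deutsche Forschungsgemeinschaft",
            "doi-asserted-by": "publisher",
            "award": [],
        },
        {"name": "Collaborative Research Center", "award": ["787"]},
    ],
}

normalized = normalize_work(second)
print(normalized["license"])           # []
print(normalized["funder"][1]["DOI"])  # None
```

With this in place, downstream code can index `normalized["license"]` without checking for the key first, at the cost of maintaining the field lists by hand.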