Closed richardhallett closed 3 years ago
@mfenner This bug has come about due to the fix that allowed multiple geoLocationPolygon elements to be parsed. While the bug fix is techncially correct for the schema, it's now introduced a backwards incompatability in the JSON representation and therefore the API.
As a refresher the XML specification allows multiple
<geoLocation>
<geoLocationPolygon>
<polygonPoint>
<pointLongitude>41</pointLongitude>
<pointLatitude>71</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>45</pointLongitude>
<pointLatitude>75</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>55</pointLongitude>
<pointLatitude>85</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>41</pointLongitude>
<pointLatitude>71</pointLatitude>
</polygonPoint>
</geoLocationPolygon>
<geoLocationPolygon>
<polygonPoint>
<pointLongitude>65</pointLongitude>
<pointLatitude>80</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>55</pointLongitude>
<pointLatitude>75</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>45</pointLongitude>
<pointLatitude>73</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>65</pointLongitude>
<pointLatitude>80</pointLatitude>
</polygonPoint>
</geoLocationPolygon>
</geoLocation>
Which would produce
[{
"geoLocationPolygon"=> [
[ {"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}},
{"polygonPoint"=>{"pointLatitude"=>"75", "pointLongitude"=>"45"}},
{"polygonPoint"=>{"pointLatitude"=>"85", "pointLongitude"=>"55"}},
{"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}}],
[
{"polygonPoint"=>{"pointLatitude"=>"80", "pointLongitude"=>"65"}},
{"polygonPoint"=>{"pointLatitude"=>"75", "pointLongitude"=>"55"}},
{"polygonPoint"=>{"pointLatitude"=>"73", "pointLongitude"=>"45"}},
{"polygonPoint"=>{"pointLatitude"=>"80", "pointLongitude"=>"65"}}
]
]
}]
This is a nested structure and follows the XML correctly.
However if you have a singular geoLocationPolygon within an element, it also nests it. e.g.
<geoLocation>
<geoLocationPolygon>
<polygonPoint>
<pointLongitude>41</pointLongitude>
<pointLatitude>71</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>45</pointLongitude>
<pointLatitude>75</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>55</pointLongitude>
<pointLatitude>85</pointLatitude>
</polygonPoint>
<polygonPoint>
<pointLongitude>41</pointLongitude>
<pointLatitude>71</pointLatitude>
</polygonPoint>
</geoLocationPolygon>
</geoLocation>
produces
[{
"geoLocationPolygon"=> [
[
{"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}},
{"polygonPoint"=>{"pointLatitude"=>"75", "pointLongitude"=>"45"}},
{"polygonPoint"=>{"pointLatitude"=>"85", "pointLongitude"=>"55"}},
{"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}}
]
]
}]
But before it produced
[{
"geoLocationPolygon"=> [
{"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}},
{"polygonPoint"=>{"pointLatitude"=>"75", "pointLongitude"=>"45"}},
{"polygonPoint"=>{"pointLatitude"=>"85", "pointLongitude"=>"55"}},
{"polygonPoint"=>{"pointLatitude"=>"71", "pointLongitude"=>"41"}}
]
}]
So the question is what behaviour do we want to have? Should re reduce it to a non nested for backwards compatability? or just always say this is the case now with it being an array of arrays, however old dois will have a different format.
This is a common issue with arrays with one element in XML and how to represent them in JSON. I have seen variations of this in a few other places in our metadata, and I would address it to look like our previous implementation.
If that is not possible without breaking changes, put it in API version 3.
Also, check the DataCite JSON for other examples where an element can be 0-N, e.g. fundingReference or relatedIdentifier (or 1-N for creator or title).
I looked at fundingReference and creators and they always return an array regardless if there is only one element, which to me makes sense. The problem is the fact I think geoLocationPolygons should have done it this way but it didn't historically.
What I will look at doing is seeing if we can keep the old way for when there is one element and push it to an array when there is more than one, this will keep backwards compatability and still allow multiple if desired.
Ideally this should be cleaned up but I think it will be a reasonable compromise.
Sounds good. And please add to a list of "API version 3 issues", things we can address properly in a new API version.
This appears is generating a nested list now even when there is only one set of geolocationpolygns.
Example new DOI 10.25574/17213 - Nested Example old DOI: 10.25574/13609 - Not tested, singluar list.
Front conversations