placemark / togeojson

convert KML, TCX, and GPX to GeoJSON, without the fuss
https://placemark.github.io/togeojson/
BSD 2-Clause "Simplified" License

KML: out-of-spec coordinates #68

Open handymenny opened 2 years ago

handymenny commented 2 years ago

The correct way to separate tuples in the coordinates field is with a space, but I have seen some KML documents that use a comma instead:

<LineString>
   <coordinates>13.08925,37.517,0,13.0859,37.517,0</coordinates>
   <altitudeMode>relativeToGround</altitudeMode>
</LineString>

Google Earth parses them correctly, so is there any chance that this library will support this out-of-spec syntax?

tmcw commented 2 years ago

Haha, KML is such a trash fire.

So, I think we could parse this, but it's highly ambiguous. The altitude part of the tuple is optional, so if you have a list of 6 numbers separated by commas in this style, it could be 2 (longitude, latitude, altitude) positions or 3 (longitude, latitude) positions. And the only way to tell the difference is guessing, and it's unreliable to guess - the numbers can all be in the same general range.
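To make the ambiguity concrete, here is a small sketch (the `chunk` helper is hypothetical and not part of togeojson) that groups the six numbers from the example above both ways. Both groupings yield values that are plausible longitudes and latitudes, so the data alone does not say which one was intended.

```typescript
// Hypothetical helper: group a flat list of numbers into tuples of a fixed size.
function chunk(values: number[], size: number): number[][] {
  const out: number[][] = [];
  for (let i = 0; i < values.length; i += size) {
    out.push(values.slice(i, i + size));
  }
  return out;
}

// The comma-only coordinates string from the issue, flattened to six numbers.
const flat: number[] = "13.08925,37.517,0,13.0859,37.517,0"
  .split(",")
  .map(Number);

// Reading 1: two (longitude, latitude, altitude) positions.
const asTriples = chunk(flat, 3);

// Reading 2: three (longitude, latitude) positions.
const asPairs = chunk(flat, 2);

console.log(asTriples.length, asPairs.length);
```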

handymenny commented 2 years ago

Thanks, I did some testing and Google Earth seems to use the same logic I implemented here: https://github.com/HandyMenny/leaflet-kmz/commit/b45dd44a3cfe45c395d5398281ca9843f22554d6

That is, every 3 numbers form a new tuple, so altitude would no longer be optional (in that syntax).

tmcw commented 2 years ago

So I guess the approach would be to first detect this invalid syntax, and then take the approach that altitude is mandatory? I wouldn't want to make altitude mandatory for all datasets, which would make valid data parse incorrectly.
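The detection step could be sketched roughly as follows. This is only an illustration: `hasOutOfSpecTuples` is a hypothetical name, not an existing togeojson function, and it assumes that in-spec tuples are separated by whitespace, so any single token with more than 3 comma-separated components must be using the invalid syntax.

```typescript
// Hypothetical check, not part of togeojson: a whitespace-separated token with
// more than 3 comma-separated components cannot be a single valid KML tuple.
function hasOutOfSpecTuples(coordinates: string): boolean {
  return coordinates
    .trim()
    .split(/\s+/)
    .some((token) => token.split(",").length > 3);
}

// Comma-only syntax from this issue: flagged as out of spec.
console.log(hasOutOfSpecTuples("13.08925,37.517,0,13.0859,37.517,0"));

// Valid syntax with optional altitude omitted: not flagged.
console.log(hasOutOfSpecTuples("13.08925,37.517 13.0859,37.517"));
```

Only inputs that trip this check would be parsed with altitude treated as mandatory, so valid datasets keep their current behavior.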

handymenny commented 2 years ago

Yes, that would be a correct approach.

But I would just check whether a coordinate has more than 3 elements (6 or more might be a better threshold) and whether the number of elements is a multiple of 3 (Google Earth actually also handles cases where it is not, by reusing components of the previous coordinate). In that case the coordinate is split into two or more coordinates, and a flatMap() then takes care of recreating an array of consecutive coordinates.
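A minimal sketch of that idea, under stated assumptions: this is not the code from the linked commit, `parseCoordinates` and `Position` are hypothetical names, and the threshold used is "at least 6 elements and a multiple of 3". Tokens that do not meet it are left untouched, so Google Earth's fallback of reusing components from the previous coordinate is not reproduced. `flatMap` needs an ES2019 or newer target.

```typescript
type Position = number[];

// Hypothetical parser: splits on whitespace as the spec requires, then breaks
// up any token that looks like several (lon, lat, alt) tuples joined by commas.
function parseCoordinates(coordinates: string): Position[] {
  return coordinates
    .trim()
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .flatMap((token): Position[] => {
      const parts = token.split(",").map(Number);
      // In-spec tuple, or a length we cannot split cleanly: keep as one position.
      if (parts.length < 6 || parts.length % 3 !== 0) {
        return [parts];
      }
      // Out-of-spec syntax: every 3 numbers is one position, altitude mandatory.
      const positions: Position[] = [];
      for (let i = 0; i < parts.length; i += 3) {
        positions.push(parts.slice(i, i + 3));
      }
      return positions;
    });
}

console.log(parseCoordinates("13.08925,37.517,0,13.0859,37.517,0"));
```

Because the split only happens inside tokens that already violate the spec, valid input such as `13.08925,37.517 13.0859,37.517` still yields two 2D positions.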