jenetics / jpx

JPX - Java GPX library
Apache License 2.0

JPX r1.5.0 Parsing of extensions end element without trailing whitespace leads to parsing error #78

Closed StevenDanielson closed 5 years ago

StevenDanielson commented 5 years ago

In the XMLReader class, there is an issue with the general element-handling logic and how extensions elements are processed. There is no problem if whitespace or indentation separates the elements, but if no indentation is present, parsing fails with an error.

When a parent element encounters a new child element (the START_ELEMENT case), it spawns a new reader that processes the child element. Once that child reader is done reading the stream, the parent advances to the next parseable XML stream event.

When an extensions element is encountered, the entire extensions subtree is consumed, including its end element. The code that then advances to the next parseable XML stream event therefore misses an end tag. With whitespace after the extensions element this is harmless, but without it the parent gets stuck in its read loop, looking for an end element it never sees.
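The failure mode described above can be reproduced in isolation with plain StAX. The following is only a minimal sketch under stated assumptions, not JPX's actual XMLReader: the class `OvershootSketch`, the helper `readChild`, and the `overshoot` flag are hypothetical names that model a child reader consuming one event more than it should.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical model of the parent/child reader interplay; not JPX code.
final class OvershootSketch {

    // Stand-in for a child reader: the cursor is on the child's START_ELEMENT
    // on entry and on the child's matching END_ELEMENT on return.
    static void readChild(XMLStreamReader xml) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = xml.next();
            if (event == XMLStreamConstants.START_ELEMENT) depth++;
            else if (event == XMLStreamConstants.END_ELEMENT) depth--;
        }
    }

    // Returns true if the root element's read loop gets to see its own end tag.
    // With overshoot == true, one extra event is consumed after each child,
    // which models a child reader that reads one event too far.
    static boolean parentSeesOwnEnd(String doc, boolean overshoot) {
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(doc));
            // Advance to the root (parent) start tag.
            while (xml.next() != XMLStreamConstants.START_ELEMENT) { }
            String parent = xml.getLocalName();

            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    readChild(xml);
                    if (overshoot && xml.hasNext()) {
                        xml.next(); // swallows whitespace OR the parent's end tag
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    return parent.equals(xml.getLocalName());
                }
            }
            return false; // stream exhausted without seeing the parent's end tag
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String compact = "<wpt><extensions><foo>asdf</foo></extensions></wpt>";
        String spaced  = "<wpt><extensions><foo>asdf</foo></extensions> </wpt>";
        System.out.println(parentSeesOwnEnd(compact, true));  // false
        System.out.println(parentSeesOwnEnd(spaced, true));   // true
        System.out.println(parentSeesOwnEnd(compact, false)); // true
    }
}
```

In the spaced document the extra `next()` only discards a whitespace event, which is why indented files parse fine. In the compact document the same call discards the parent's end tag.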

Take this as an example, with and without a space after the extensions element:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><wpt lat="48.2081743" lon="16.3738189"><name>Wien</name><ele>171.0</ele><extensions><foo>asdf</foo><foo>asdf</foo></extensions></wpt></gpx>

Simple parsing flow with no whitespace after the extensions element:

START: gpx
START: wpt
START: name
CHARACTERS: null
END: name
START: ele
CHARACTERS: null
END: ele
START: extensions
END: gpx
END: gpx

Simple parsing flow with whitespace after the extensions element:

START: gpx
START: name
CHARACTERS: null
END: name
START: ele
CHARACTERS: null
END: ele
START: extensions
END: wpt
END: gpx
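To see why the whitespace matters, it can help to dump the raw StAX event stream for both variants. The sketch below uses only `javax.xml.stream` and is not how JPX produces the traces above. The class `StaxTrace` and its `trace` method are hypothetical names, and the sample documents are reduced to a bare `wpt` element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper that lists the element and text events of a document.
final class StaxTrace {

    static List<String> trace(String doc) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(doc));
            while (xml.hasNext()) {
                switch (xml.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("START: " + xml.getLocalName());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("END: " + xml.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.SPACE:
                        // Whitespace-only text between tags is its own event.
                        out.add(xml.isWhiteSpace() ? "WHITESPACE" : "CHARACTERS");
                        break;
                    default:
                        break; // ignore comments, END_DOCUMENT, etc.
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        trace("<wpt><extensions><foo>asdf</foo></extensions></wpt>")
            .forEach(System.out::println);
        System.out.println("---");
        trace("<wpt><extensions><foo>asdf</foo></extensions> </wpt>")
            .forEach(System.out::println);
    }
}
```

In the compact document, `END: wpt` directly follows `END: extensions`. In the spaced one, a whitespace event sits between them and absorbs any extra advance by the reader.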

The unit test GPXTest.readWriteRandomNonIndentedGPX() also has a typo: it always writes with indentation, so the non-indented case is never exercised. It should not call `GPX.writer(" ").write(gpx, bout);`

jenetics commented 5 years ago

Merged into r1.5.0.