During stream parsing, the parser only deletes the previous element from the node to free memory. However, it does not account for any unwanted nodes between the targets. For example, if there is a newline character between tags, those TextNodes never get cleaned up.

After each iteration, a new TextNode was added as a previous sibling of the next target, causing performance to degrade with the number of elements parsed.

In my case, after 10k elements, the parser had to first iterate over 10k TextNodes, which makes it impossible to stream-parse large documents.
Coverage increased (+0.07%) to 92.605% when pulling 60cd95807fff3dce11551b6b637c42d6016b511e on Seb-C:fixStreamLeakIssue into e73954f0f504eaf97f73ad62a4c52419e304b7bd on antchfx:master.