Open unhammer opened 1 year ago
@unhammer perhaps you could try this test with the latest master? See #63.
The issue remains :(
"fy fan". Ok this requires some deeper thinking.
@unhammer anyway, it's at least reassuring that the latest patch doesn't change the memory behavior of the library (kudos @mitchellwrosen).
Running longlines.xml.zip through xeno-dom exhausts heap memory. I just put the file into the list in SpeedBigFiles.hs as
[ benchFile ["xeno-dom"] "6MB" "longlines.xml.bz2"
and got heap exhaustion.
Strangely, even minor changes to the file (e.g.
sed 's/x/xx/g'
which increases the file size) let it through with about 800M maxresident (as reported by /usr/bin/time). Inserting a newline after each > also gives 800M maxresident, but the problem doesn't seem to be related to the long lines, as almost any change to the file helps.
(Yes, I should be using Xeno.SAX, but why does e.g. https://dumps.wikimedia.org/nowiki/20230520/nowiki-20230520-pages-articles-multistream-index.txt.bz2 at 11M go through fine with <400M maxresident and this one not? Even with the newlines removed, the wiki file works fine. This feels like a leak.)
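For anyone who wants to check this outside the benchmark suite, here is a minimal sketch of a standalone reproducer. It assumes the attachment has already been decompressed to a plain XML file, and the executable name xeno-dom-repro is hypothetical. It only uses Xeno.DOM.parse and Xeno.DOM.children and is not the code from SpeedBigFiles.hs.

```haskell
module Main (main) where

import qualified Data.ByteString as BS
import System.Environment (getArgs)
import System.Exit (die)
import qualified Xeno.DOM as DOM

-- Reads the whole file strictly and parses it with xeno's DOM parser.
-- The input path is taken from the command line; the file is assumed
-- to be already decompressed XML.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      bytes <- BS.readFile path
      case DOM.parse bytes of
        Left err -> die ("parse error: " ++ show err)
        -- Print something that depends on the parsed tree, so the
        -- parse result is actually demanded.
        Right root -> print (length (DOM.children root))
    _ -> die "usage: xeno-dom-repro FILE.xml"
```

Running the compiled program under /usr/bin/time on the original file and on the sed-modified copy should show whether the difference in maxresident also appears without the benchmark harness.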