Closed GoogleCodeExporter closed 9 years ago
This sounds more like a bug in the parser than a memory leak. The testcase
should be enough to reproduce and fix it, though - prod me if I don't get to it
within the next couple of days.
Original comment by boulton.rj@gmail.com
on 29 Jul 2008 at 5:21
It was a bug in the parser - the parser didn't know about empty tags, so was
trying to make a huge list of "br" tags be the parent of each "a" tag. This was
using up vast quantities of memory.
I've fixed the parser to understand the list of standard HTML tags which are
empty, and not to use up lots of memory when parsing them. Invalid HTML could
still cause a large waste of memory, so it would still be good to improve the
parser to avoid this happening. However, the immediate problem is fixed (with
htmltotext release 0.7.2), so marking this issue as such.
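To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the htmltotext parser: it uses the standard library's html.parser, and DepthTracker, max_depth and VOID_TAGS are hypothetical names made up for this example. The tracker keeps a stack of open elements; when it is told that no tags are empty, every "br" stays open and becomes an ancestor of each later "a" tag, so the stack grows with the size of the page.

```python
from html.parser import HTMLParser

# Elements that HTML defines as empty (void): they never take a closing tag.
VOID_TAGS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "param", "source", "track", "wbr",
})


class DepthTracker(HTMLParser):
    """Hypothetical helper: tracks open elements and the deepest nesting seen."""

    def __init__(self, void_tags=VOID_TAGS):
        super().__init__()
        self.void_tags = void_tags
        self.stack = []
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.void_tags:
            return  # an empty tag can never become a parent
        self.stack.append(tag)
        self.max_depth = max(self.max_depth, len(self.stack))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def max_depth(html, void_tags=VOID_TAGS):
    """Return the deepest open-element stack reached while parsing html."""
    tracker = DepthTracker(void_tags)
    tracker.feed(html)
    tracker.close()
    return tracker.max_depth


page = '<a href="#">link</a><br>' * 1000
print(max_depth(page))               # parser that knows "br" is empty
print(max_depth(page, frozenset()))  # parser that treats "br" as a container
```

With the void-tag list in place the nesting depth stays constant however long the page is; without it, the depth grows by one for every "br". Unclosed non-empty tags in invalid HTML would still grow the stack in this sketch, which matches the caveat above.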
Original comment by boulton.rj@gmail.com
on 29 Jul 2008 at 7:44
Original issue reported on code.google.com by
tom...@metahusky.net
on 29 Jul 2008 at 3:28