Some parsing in this lib is written or refactored naively. For example, I'm not sure it was a net improvement to create a text decoder that uses lookahead to be self-contained; at the very least, I need to do some performance inspection.
One possible performance quagmire is the HTML entity leniency I've introduced, that is, how I accept, say, a bare "<" as text instead of requiring it to be escaped as "&lt;". Off the top of my head, this might result in massive backtracking if a string were to start with something like "<a small mouse once said", since the parser would presumably try to read that as a tag before falling back to text.
There are various things I could do here. Frankly, I haven't noticed a problem so far because my use case for this library has been small HTML snippets. But it's something I need to look into.
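
As a rough sketch of one such option (in Python purely for illustration; `tokenize_lenient` and `DELIM` are hypothetical names, not part of this library), the decision at each "<" can be bounded: scan forward only as far as the next "<" or ">", and treat the "<" as literal text unless a ">" comes first. This assumes a heavily simplified notion of a tag, with no "<" or ">" inside attribute values, so it shows the idea rather than a real HTML tokenizer.

```python
import re

# Assumption: a tag never contains "<" or ">" inside attribute values.
DELIM = re.compile(r"[<>]")


def tokenize_lenient(source):
    """Split source into ("text", s) and ("tag", s) tokens.

    A "<" that does not begin a complete tag is kept as literal text.
    """
    tokens = []
    text_start = 0  # start of the pending run of text
    pos = 0         # where to look for the next "<"
    while True:
        lt = source.find("<", pos)
        if lt == -1:
            break
        # Bounded lookahead: stop at the first "<" or ">" after this "<".
        nxt = DELIM.search(source, lt + 1)
        starts_like_tag = (
            source[lt + 1:lt + 2].isalpha() or source.startswith("/", lt + 1)
        )
        if nxt is not None and nxt.group() == ">" and starts_like_tag:
            if lt > text_start:
                tokens.append(("text", source[text_start:lt]))
            tokens.append(("tag", source[lt:nxt.end()]))
            pos = text_start = nxt.end()
        else:
            # Stray "<": leave it in the pending text and move on.
            pos = lt + 1
    if text_start < len(source):
        tokens.append(("text", source[text_start:]))
    return tokens


print(tokenize_lenient("<a small mouse once said"))
print(tokenize_lenient("1 < 2 <b>ok</b>"))
```

Because each lookahead stops at the next delimiter, no stretch of input is scanned more than a couple of times, so a stray "<" costs a bounded amount of work instead of a failed tag parse over the rest of the string.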