Open ndmitchell opened 9 years ago
Probably because uncons on a ByteString is less efficient than uncons on a String.
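To make the uncons point concrete, here is a minimal sketch of a character-at-a-time loop written against both types. It is not tagsoup's parser; `countLtString` and `countLtBS` are hypothetical names used only for illustration. The likely cost difference is that pattern matching on a list cell allocates nothing new, while each `uncons` on a ByteString has to build a fresh slice (plus the tuple and `Just`) unless GHC manages to optimise it away.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as BS
import qualified Data.List as L

-- Hypothetical helper: count '<' characters by repeatedly
-- calling uncons on a String (a plain linked list of Char).
countLtString :: String -> Int
countLtString = go 0
  where
    go !n s = case L.uncons s of
      Nothing      -> n
      Just (c, cs) -> go (if c == '<' then n + 1 else n) cs

-- Hypothetical helper: the same loop over a strict ByteString.
-- Each step returns a new ByteString value describing the rest
-- of the buffer.
countLtBS :: BS.ByteString -> Int
countLtBS = go 0
  where
    go !n s = case BS.uncons s of
      Nothing      -> n
      Just (c, cs) -> go (if c == '<' then n + 1 else n) cs

main :: IO ()
main = do
  let html = "<html><body><p>hi</p></body></html>"
  print (countLtString html)
  print (countLtBS (BS.pack html))
```

Both functions return the same count for the same input; only the cost per step differs, which is what a benchmark over a large page would expose.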
Yes, but really it needs a significant rewrite. I should probably open a ticket with the plans going forward for tagsoup and the parser...
From https://code.google.com/p/ndmitchell/issues/detail?id=290
vasyl said: I used the attached code as a benchmark. The "page.html" file can be any page, for example the Hackage package list.
On my PC, the String version of tagsoup runs in 132 ms, and the ByteString version in 453 ms.
IMO this behavior is bad, because everyone expects ByteString to be faster. I think the best option is to disable ByteString support for now, because converting the ByteString to a String and parsing that is faster anyway (the last benchmark).
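The two approaches being compared can be sketched as follows. This is not the attached benchmark, which is not reproduced in this thread; `parseViaString` and `parseDirect` are hypothetical names, and the sketch assumes the `tagsoup` and `bytestring` packages are installed. Note that `Data.ByteString.Char8.unpack` treats each byte as one character, so it is only a faithful conversion for ASCII or Latin-1 input.

```haskell
import qualified Data.ByteString.Char8 as BS
import Text.HTML.TagSoup (Tag, parseTags)

-- Hypothetical helper: convert the ByteString to a String first,
-- then run the String parser.
parseViaString :: BS.ByteString -> [Tag String]
parseViaString = parseTags . BS.unpack

-- Hypothetical helper: parse the ByteString directly through the
-- StringLike instance that tagsoup provides for it.
parseDirect :: BS.ByteString -> [Tag BS.ByteString]
parseDirect = parseTags

main :: IO ()
main = do
  -- "page.html" is whatever page is being benchmarked.
  src <- BS.readFile "page.html"
  print (length (parseViaString src))
  print (length (parseDirect src))
```

Timing each function over the same file (for example with a benchmarking library) is what would show whether the conversion route really wins on a given machine and tagsoup version.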
@ndmitchell replied:
Hmm, there is a benchmark in tagsoup, and I found the String and ByteString versions to be the same speed. The reason I included ByteString is that it takes less memory, which does matter for some applications.
I'll see how your benchmarks differ, and combine them into mine. Tagsoup-0.8 was intended to be an interface release, with Tagsoup-0.9 providing speed. With any luck I'll have ByteString going substantially faster in the next release.