ua-parser / uap-python

Python implementation of ua-parser
Apache License 2.0

Default to re2 parser if available #184

Closed masklinn closed 8 months ago

masklinn commented 8 months ago

After benchmarking, the results are out, at least on the current sample file:

First, re2 is ridiculously faster than the basic parser, even when the basic parser gets tons of caching. re2 does benefit from caching, but it's so fast that the cache only has a real impact at very high hit rates (which means a very large cache). At low hit rates (small cache sizes) the cache visibly slows parsing down, which is not the case for the basic parser.
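To make the hit-rate argument concrete, here is a rough cost model, not something taken from the benchmark itself. It assumes every call pays a fixed cache overhead and only misses pay the parse cost; the function name and the cost units are made up for illustration.

```python
def break_even_hit_rate(cache_overhead: float, parse_cost: float) -> float:
    """Hit rate above which a cache starts paying for itself.

    Assumed model (a simplification, not measured data):
      uncached cost per call = parse_cost
      cached cost per call   = cache_overhead + (1 - hit_rate) * parse_cost
    The two are equal when hit_rate == cache_overhead / parse_cost.
    """
    if parse_cost <= 0:
        raise ValueError("parse_cost must be positive")
    # A required hit rate above 1.0 means the cache can never win.
    return min(1.0, cache_overhead / parse_cost)
```

Under this model, the cheaper the parser is relative to the cache bookkeeping, the higher the hit rate needed before caching helps, which matches the behaviour described for re2 versus the basic parser.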

Second, LRU is confirmed to be a better cache replacement policy than clearing (which... duh). The difference is not very noticeable at very low sizes, but at 100 entries LRU starts really pulling ahead, so it's definitely the better default at 200 entries (where, even with the overhead of the more layered approach, it's ahead of the legacy parser and its fixed 20-entry clearing cache).
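For readers unfamiliar with the two policies, a minimal sketch of the difference follows. These classes are hypothetical illustrations, not the library's actual cache implementations: a clearing cache throws everything away when it fills up, while an LRU only evicts the least recently used entry.

```python
from collections import OrderedDict


class ClearingCache:
    """Hypothetical sketch: drops every entry once the cache is full."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}

    def get(self, key, compute):
        """Return (value, was_hit)."""
        if key in self.data:
            return self.data[key], True
        if len(self.data) >= self.maxsize:
            self.data.clear()  # hot entries are lost along with cold ones
        value = self.data[key] = compute(key)
        return value, False


class LRUCache:
    """Hypothetical sketch: evicts only the least recently used entry."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key, compute):
        """Return (value, was_hit)."""
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key], True
        value = self.data[key] = compute(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # drop the oldest entry
        return value, False


def hit_rate(cache, keys):
    """Fraction of lookups served from the cache (str.upper stands in for parsing)."""
    hits = sum(cache.get(key, str.upper)[1] for key in keys)
    return hits / len(keys)
```

On a workload where one key keeps coming back between one-off keys, the LRU keeps the hot key around while the clearing cache repeatedly discards it.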

The locking doesn't seem to have much impact without contention, and even under contention the LRU still seems to behave way better than the clearing cache. So fall back to the locked LRU if re2 is not available.
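The selection logic described above could be sketched roughly as below. This is an assumption-laden illustration, not the project's code: `choose_default`, `LockedLRU`, and the returned labels are hypothetical names, and the check assumes the re2 bindings are importable as a module named `re2`.

```python
import importlib.util
import threading
from collections import OrderedDict


class LockedLRU:
    """Hypothetical LRU cache guarded by a lock (200 entries, as proposed above)."""

    def __init__(self, maxsize=200):
        self.maxsize = maxsize
        self.data = OrderedDict()
        self.lock = threading.Lock()

    def get(self, key, compute):
        with self.lock:
            if key in self.data:
                self.data.move_to_end(key)  # mark as most recently used
                return self.data[key]
        value = compute(key)  # parse outside the lock to limit contention
        with self.lock:
            self.data[key] = value
            self.data.move_to_end(key)
            if len(self.data) > self.maxsize:
                self.data.popitem(last=False)  # evict least recently used
        return value


def choose_default(have_re2=None):
    """Pick re2 when importable, otherwise the basic parser behind a locked LRU."""
    if have_re2 is None:
        # Assumption: the re2 bindings expose a top-level module named "re2".
        have_re2 = importlib.util.find_spec("re2") is not None
    return "re2" if have_re2 else "basic+locked-lru"
```

Passing `have_re2` explicitly is only there to make the choice easy to exercise without installing or removing the dependency.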