Open anarcat opened 2 years ago
oh and in case you're wondering why this matters to me, it's because i wrote this tool called undertime, which gives you different times in different zones as a one-shot commandline tool. most of its runtime is spent building those regexes it doesn't even use. :)
i'm now lazily loading dateparser itself, but the user can definitely "feel" when it hits that corner case.
We use a lot of data objects in our libraries that load from JSON, and moving them to lazy-loading instead of load-on-import has been helpful. It's not too hard, and it's been reliable for us.
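The pattern we use is roughly the following (names here are made up, just to illustrate the idea): the JSON is only read the first time the data is actually needed, not at import time.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def _load_settings():
    # Hypothetical example: parse the JSON file only on first use,
    # then reuse the cached result for every later call.
    with open("settings.json") as fh:
        return json.load(fh)


def get_setting(key):
    # Importing this module stays cheap; the file is read lazily here.
    return _load_settings()[key]
```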
hi!
first, thanks for this awesome project, it's really useful and powerful and i am grateful to not have to write this stuff myself. :)
i'm opening this issue because i feel there's an inherent performance cost to be paid whenever we even load the dateparser library:
compare with similar libraries:
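(something like this reproduces the comparison — just a rough sketch, the numbers obviously depend on the machine and on which libraries are installed:)

```python
import importlib
import time

# Rough, unscientific comparison of import cost; note that shared
# dependencies pulled in by an earlier import make later ones look cheaper.
for name in ("datetime", "dateutil.parser", "dateparser"):
    start = time.perf_counter()
    importlib.import_module(name)
    print(f"{name}: {time.perf_counter() - start:.3f}s")
```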
a quick profile seems to show it spends an inordinate amount of time compiling regular expressions:
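(a cProfile run over the bare import should show the same pattern — this is a sketch of one way to reproduce the profile, not necessarily the exact command i used:)

```python
import cProfile
import pstats

# Profile the import itself and sort by cumulative time; the regex
# compilation machinery shows up near the top of the listing.
cProfile.run("import dateparser", "import.prof")
stats = pstats.Stats("import.prof")
stats.sort_stats("cumulative").print_stats(20)
```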
basically, it seems we're spending a lot of time compiling regular expressions. individually, those don't matter much (percall=1ms), but we seem to be doing hundreds of them. i think it might be related to the `timezone_parser.py` file (`build_tz_offsets`?) but i stopped digging there. the exact source is a little beside the point: shouldn't just importing the module be safe, performance-wise? i know we load a default parser, but that's not what's eating us here, it's rather a bunch of globals in `timezone_parser.py`... it seems to me those could be lazily loaded, at least?
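to be concrete, the kind of thing i mean is something like this — just a sketch with made-up names, not dateparser's actual code, and i haven't checked it against the real `timezone_parser.py` internals:

```python
import re
from functools import lru_cache

# Made-up stand-in for the timezone name/offset table that the module
# currently walks as a side effect of being imported.
_TIMEZONE_INFO = {"UTC": 0, "CET": 3600}


@lru_cache(maxsize=None)
def _tz_offsets():
    # Compile the timezone regexes the first time they are needed,
    # instead of at import time.
    return [(re.compile(r"\b%s\b" % re.escape(name)), offset)
            for name, offset in _TIMEZONE_INFO.items()]


def find_tz_offset(date_string):
    # Importing stays cheap; the cost is only paid when a timezone is
    # actually looked up.
    for pattern, offset in _tz_offsets():
        if pattern.search(date_string):
            return offset
    return None
```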