Open esfalsa opened 1 year ago
My test runs:
System: M1 MacBook Pro (2020, 8 core CPU/8 core GPU/16 core Neural Engine) / 8GB RAM / 256GB SSD, macOS Ventura 13.1
System: 4 AMD EPYC 7551 cores / 24GB RAM / 40GB drive Ubuntu 20.04.6 LTS aarch64
I'd drag out my Windows laptop to run these tests as well, but I feel like that might be a little redundant. On higher-powered systems there's a 1.5-1.9 second speedup, which, while not amazing, is certainly better than no speedup at all. Where this PR shines, though, is on lower-end systems where the CPU is going to be the bottleneck -- it shaves up to four seconds off the run on a VM with 4 CPU cores. It definitely makes the process more tolerable.
This is worth code review.
This PR replaces openpyxl with XlsxWriter.
Performance
Most benchmarks I can find suggest XlsxWriter should have better performance than openpyxl, at least for writing data. (The benchmarks in openpyxl's own documentation suggest this, too.)
Here are some quick benchmarks from my local machine (2020 MacBook Air with an M1 processor). The difference for me is just a few seconds, so it's not quite at the level some benchmarks on the internet would suggest, but the timings include parsing the daily dump as well (though not downloading it).
openpyxl
XlsxWriter
XlsxWriter apparently uses less CPU as well, but I didn't find that advertised as a benefit in my internet searches, so it might just have been a fluke caused by whatever else was running on my computer at the time.
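For anyone who wants to reproduce the write-time comparison in isolation, here is a minimal sketch of a timing harness. It is not the code from this PR: the function names (`write_with_openpyxl`, `write_with_xlsxwriter`, `time_writer`) and the row layout are made up for illustration. Unlike the numbers above, it times only the spreadsheet write and does not include parsing the daily dump.

```python
import time
from typing import Callable, List, Sequence

Row = Sequence[object]


def write_with_openpyxl(rows: List[Row], path: str) -> None:
    """Write rows to an .xlsx file using openpyxl (hypothetical helper)."""
    from openpyxl import Workbook  # third-party; imported lazily

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(list(row))
    workbook.save(path)


def write_with_xlsxwriter(rows: List[Row], path: str) -> None:
    """Write the same rows using XlsxWriter (hypothetical helper)."""
    import xlsxwriter  # third-party; imported lazily

    workbook = xlsxwriter.Workbook(path)
    sheet = workbook.add_worksheet()
    for index, row in enumerate(rows):
        sheet.write_row(index, 0, row)
    workbook.close()  # XlsxWriter only writes the file on close()


def time_writer(
    writer: Callable[[List[Row], str], None],
    rows: List[Row],
    path: str,
    repeats: int = 3,
) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        writer(rows, path)
        best = min(best, time.perf_counter() - start)
    return best
```

With both libraries installed, something like `time_writer(write_with_xlsxwriter, rows, "out.xlsx")` and the same call with `write_with_openpyxl` should give comparable numbers. Taking the best of several runs helps reduce noise from whatever else is running on the machine.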
File Size
Incidentally, XlsxWriter seems to produce a smaller output file as well, at least for today's daily dump.
openpyxl
XlsxWriter
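The file size comparison can be checked with the standard library alone. This is a rough sketch that assumes both output files have already been written to disk; the helper names and the file names in the usage note are placeholders, not paths used by this project.

```python
import os
from typing import Dict


def size_report(paths: Dict[str, str]) -> Dict[str, int]:
    """Map each label to the size in bytes of the file at its path."""
    return {label: os.path.getsize(path) for label, path in paths.items()}


def percent_smaller(baseline_bytes: int, candidate_bytes: int) -> float:
    """Return how much smaller the candidate is than the baseline, in percent."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * (baseline_bytes - candidate_bytes) / baseline_bytes
```

For example, `size_report({"openpyxl": "a.xlsx", "XlsxWriter": "b.xlsx"})` returns the two sizes, which can then be passed to `percent_smaller`. A negative result would mean the candidate file is larger than the baseline.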