Open ammarahm-ed opened 9 months ago
We should do new benchmarks when considering something like this, FWIW.
Using a custom allocator is problematic for libquickjs.a due to:
risk of symbol conflicts in the embedding program unless the allocator is completely namespaced off
allocator mixup bugs (allocating with allocator A, releasing with B); maybe not a problem for quickjs but js_malloc and friends are exposed in quickjs.h
Embedders can replace the default allocator. That's something we could do in the qjs executable, although I'm in no rush to do that right now.
Tangentially, I've spotted several places where quickjs mallocs a struct that mallocs a struct that mallocs a struct. There's likely gains to be had by flattening such allocation trees.
Using a custom allocator is problematic for libquickjs.a due to:
* risk of symbol conflicts in the embedding program unless the allocator is completely namespaced off * allocator mixup bugs (allocating with allocator A, releasing with B); maybe not a problem for quickjs but js_malloc and friends _are_ exposed in quickjs.h
Embedders can replace the default allocator. That's something we could do in the qjs executable, although I'm in no rush to do that right now.
Tangentially, I've spotted several places where quickjs mallocs a struct that mallocs a struct that mallocs a struct. There's likely gains to be had by flattening such allocation trees.
mi_malloc
, mi_free
etc.js_malloc
even now, I expect them to free with js_free
and internally these apis will switch to mimalloc
so if someone does end up allocating memory with malloc and then trying to free with js_free, it's on them I would say haha.There are indeed some optimisations we could do with the default allocators too but i think the difference is just huge when using mimalloc.
@saghul So I did some benchmarks with current quickjs-ng
Benchmark | Master | Mimalloc | Percent Improvement |
---|---|---|---|
DeltaBlue | 605 | 702 | 15.99% |
Crypto | 863 | 906 | 4.98% |
RayTrace | 357 | 963 | 169.33% |
EarleyBoyer | 682 | 1579 | 131.70% |
RegExp | 149 | 220 | 47.65% |
Splay | 736 | 1804 | 145.76% |
SplayLatency | 3088 | 4543 | 47.16% |
NavierStokes | 1511 | 1590 | 5.24% |
PdfJS | 2016 | 2949 | 46.21% |
Mandreel | 604 | 656 | 8.61% |
MandreelLatency | 4803 | 5097 | 6.13% |
Gameboy | 5191 | 5644 | 8.72% |
CodeLoad | 8015 | 13343 | 66.20% |
Box2D | 1905 | 2632 | 38.03% |
Typescript | 7088 | 9413 | 32.75% |
Score (version 9) | 1430 | 2060 | 43.98% |
That's a very significant difference. Another allocator worth trying out is jemalloc.
Perhaps as an optional dependency? Or an allocator system that lets you add a define to switch which malloc you're using?
Already the case:
Embedders can replace the default allocator.
Even though you can replace the default allocator yourself which i have done already, making it default positions quickjs at a much better place in terms of performance for everyone and in various benchmarks. That's why i opened up this issue.
While having good benchmark results it would be disingenuous to do that if it turns out it's tricky to use that allocator in all situations.
That said, once I put the docs together, we can have a section where we mention things like this.
@ammarahm-ed I want to benchmark something semi-related to this (using calloc vs malloc + memset to be precise). How did you get the results table from your first post?
Using mimalloc instead of malloc improves performance significantly.
Pros: Improved performance with a small change
Cons: It will add an external dependency and slightly increase size of QuickJS binaries.
It has been previously added here: https://github.com/openwebf/quickjs/pull/30
Here's the benchmark I ran after replacing:
It could also be opt-in by the way.