ebitengine / purego

Apache License 2.0

WIP: PoC of using a bit of generics and less reflection #172

Open Zyko0 opened 12 months ago

Zyko0 commented 12 months ago

Not meant as a merge candidate, just a PoC

Context: ebitengine#purego (discord server)

This is a proof of concept of a generic alternative to RegisterFunc: a family of functions ranging from RegisterFunc0_0 (0 inputs, 0 outputs) to RegisterFunc9_1 (9 inputs, 1 output). The goal is to remove a good amount of reflection where it isn't needed, and also to remove the need for reflect.MakeFunc, which returns a heavy function with significant overhead at runtime.
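
To make the idea concrete, here is a minimal sketch contrasting the two approaches. It is not the code from this PR: `rawCall`, `registerReflect`, `registerFunc1_1`, and the `word` constraint are hypothetical names, and `rawCall` is a pure-Go stand-in for a low-level call such as a syscall into a C library, so that the example runs without cgo or any shared library. It only handles a single unsigned integer argument and result.

```go
package main

import (
	"fmt"
	"reflect"
)

// table is a hypothetical stand-in for native function addresses:
// "address" 1 maps to a function that doubles its argument.
var table = map[uintptr]func(uintptr) uintptr{
	1: func(a uintptr) uintptr { return a * 2 },
}

// rawCall is an assumption for illustration only; a real binding would
// perform a native call with the given function pointer and argument.
func rawCall(fn uintptr, a uintptr) uintptr { return table[fn](a) }

// registerReflect mimics a reflection-based registration: the wrapper is
// built with reflect.MakeFunc, so every call boxes its arguments and
// results into reflect.Value slices.
func registerReflect(fptr any, fn uintptr) {
	v := reflect.ValueOf(fptr).Elem()
	t := v.Type()
	wrapper := reflect.MakeFunc(t, func(args []reflect.Value) []reflect.Value {
		r := rawCall(fn, uintptr(args[0].Uint()))
		out := reflect.New(t.Out(0)).Elem()
		out.SetUint(uint64(r))
		return []reflect.Value{out}
	})
	v.Set(wrapper)
}

// word restricts the sketch to unsigned integer types that convert to uintptr.
type word interface {
	~uintptr | ~uint | ~uint32 | ~uint64
}

// registerFunc1_1 is the generic counterpart for 1 input and 1 output:
// the types are known at compile time, so the wrapper is a plain closure
// and no reflection happens at call time.
func registerFunc1_1[I, O word](fptr *func(I) O, fn uintptr) {
	*fptr = func(in I) O { return O(rawCall(fn, uintptr(in))) }
}

func main() {
	var viaReflect func(uintptr) uintptr
	registerReflect(&viaReflect, 1)

	var viaGeneric func(uintptr) uintptr
	registerFunc1_1(&viaGeneric, 1)

	fmt.Println(viaReflect(21), viaGeneric(21)) // prints "42 42"
}
```

Both wrappers return the same result; the difference is that the generic version needs one function per arity (hence the 0_0 to 9_1 naming), while the reflection version handles any signature at the cost of per-call overhead.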

There are tests and benchmarks for a few libc functions in func_test.go.

Difference for strlen:

Syscall9:         11762629 - 100.5 ns/op - 32 B/op  - 1 allocs/op
RegisterFunc:      2411634 - 490.4 ns/op - 120 B/op - 6 allocs/op
RegisterFunc1_1:   7690965 - 157.0 ns/op - 176 B/op - 2 allocs/op

Notes:

Benchmark results on my machine (Windows)
goos: windows
goarch: amd64
pkg: github.com/ebitengine/purego
cpu: AMD Ryzen 7 3800X 8-Core Processor
Benchmark_NewCallBack/RegisterFunc(original)-16              1304842           918.7 ns/op       328 B/op         12 allocs/op
Benchmark_NewCallBack/RegisterFunc9_1(new)-16                3216594           373.6 ns/op       144 B/op          1 allocs/op
Benchmark_qsort/RegisterFunc(original)-16                     599888          2022 ns/op         264 B/op          6 allocs/op
Benchmark_qsort/RegisterFunc1_0(new)-16                       666532          1788 ns/op         296 B/op          4 allocs/op
Benchmark_strlen/RegisterFunc(original)-16                   2451050           491.5 ns/op       120 B/op          6 allocs/op
Benchmark_strlen/RegisterFunc1_1(new)-16                     7641933           158.3 ns/op       176 B/op          2 allocs/op
Benchmark_strlen/SyscallN-16                                 8331891           143.2 ns/op       112 B/op          2 allocs/op
Benchmark_cos/RegisterFunc(original)-16                      3273643           367.5 ns/op        64 B/op          4 allocs/op
Benchmark_cos/RegisterFunc1_1(new)-16                        9373315           126.0 ns/op       144 B/op          1 allocs/op
Benchmark_cos/Go-16                                         137118896            8.760 ns/op           0 B/op          0 allocs/op
Benchmark_isupper/RegisterFunc(original)-16                  3233931           371.6 ns/op        56 B/op          4 allocs/op
Benchmark_isupper/RegisterFunc1_1(new)-16                   10167706           120.4 ns/op       144 B/op          1 allocs/op

MatejMagat305 commented 6 months ago

Is this a plan, or only an example and a hypothetical idea?

TotallyGamerJet commented 6 months ago

Is this a plan, or only an example and a hypothetical idea?

It is just a suggestion to improve performance. There are currently no plans to integrate the ideas in this PR. However, that could change in the future.