tinygo-org / tinygo

Go compiler for small places. Microcontrollers, WebAssembly (WASM/WASI), and command-line tools. Based on LLVM.
https://tinygo.org

Slower than JavaScript when using WebAssembly #4084

Closed (guiferpa closed 9 months ago)

guiferpa commented 9 months ago

Context

I'm currently studying WebAssembly (wasm), and to achieve this goal I'm building a minimalist image editor.

Source code repository address: https://github.com/guiferpa/tinygo-wasm

Problem

While working on this, I added a performance measurement for each function that applies the filter to the image, one implemented in wasm and one in JS. From those measurements, wasm turned out to be slower than JS.

Consideration

I'm using the unsafe package to write the wasm function, but I hope that's not the problem.

Measurement

[Image: Screenshot 2024-01-12 at 14 17 26]

soypat commented 9 months ago

Went ahead and tried my hand at optimizing the Go filter on amd64 and managed a ~25% increase in speed, maybe it transfers to WASM?

changed code

The principle applied here is avoiding slow divides. Dividing by a constant that is not a power of two is usually one of the slower operations on modern CPUs. We can avoid 2 of the 3 divides per loop iteration by accumulating the three bytes into one uint16 and then dividing the sum once.

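import "unsafe"

// applyBlackAndWhiteFilter replaces each 3-byte group in the size bytes
// starting at p with the average of that group.
func applyBlackAndWhiteFilter(p *uint, size int) {
    up := uintptr(unsafe.Pointer(p))

    // Stop before a trailing partial group so no byte past size is touched.
    for i := 0; i+2 < size; i += 3 {
        ap0 := (*byte)(unsafe.Pointer(up + uintptr(i+0)))
        ap1 := (*byte)(unsafe.Pointer(up + uintptr(i+1)))
        ap2 := (*byte)(unsafe.Pointer(up + uintptr(i+2)))

        // Sum in uint16 (at most 3*255 = 765) so a single divide is enough.
        filter := byte((uint16(*ap0) + uint16(*ap1) + uint16(*ap2)) / 3)

        *ap0 = filter
        *ap1 = filter
        *ap2 = filter
    }
}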
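
To show the single-divide idea in isolation, here is a minimal sketch comparing it with a three-divide baseline. The original filter is not shown in this thread, so `grayThreeDivides` is only an assumption about what it may have looked like, and both function names are hypothetical.

```go
package main

import "fmt"

// grayThreeDivides is an assumed baseline: it divides each channel
// by 3 and then sums, costing three divides per pixel.
func grayThreeDivides(r, g, b byte) byte {
	return r/3 + g/3 + b/3
}

// graySingleDivide widens to uint16 so the sum (at most 3*255 = 765)
// cannot overflow, then divides once.
func graySingleDivide(r, g, b byte) byte {
	return byte((uint16(r) + uint16(g) + uint16(b)) / 3)
}

func main() {
	fmt.Println(grayThreeDivides(255, 255, 255), graySingleDivide(255, 255, 255)) // 255 255
	fmt.Println(grayThreeDivides(2, 2, 2), graySingleDivide(2, 2, 2))             // 0 2
}
```

Note that the two versions are not bit-identical: each per-channel divide truncates on its own, so the three-divide result can come out up to 2 levels lower than the single-divide result.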

Results

goos: linux
goarch: amd64
pkg: github.com/soypat/seqs/local
cpu: 12th Gen Intel(R) Core(TM) i5-12400F
BenchmarkBWFilterCanon-12         213970              5056 ns/op
BenchmarkBWFilter-12              289238              3805 ns/op
PASS
ok      github.com/soypat/seqs/local    2.284s
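
For reference, the two ns/op figures can be turned into a speedup and a time saving as sketched below. The helper names `speedup` and `timeReduction` are hypothetical; only the two timings come from the benchmark output above.

```go
package main

import "fmt"

// speedup returns how many times faster newNs is than oldNs.
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

// timeReduction returns the fraction of time saved relative to oldNs.
func timeReduction(oldNs, newNs float64) float64 {
	return (oldNs - newNs) / oldNs
}

func main() {
	const canon, optimized = 5056.0, 3805.0 // ns/op from the benchmark output

	fmt.Printf("speedup: %.2fx\n", speedup(canon, optimized))               // speedup: 1.33x
	fmt.Printf("time saved: %.1f%%\n", 100*timeReduction(canon, optimized)) // time saved: 24.7%
}
```

So the roughly 25% figure is the time saved per call; expressed as throughput, the optimized filter runs about 1.33 times as fast.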
guiferpa commented 9 months ago

The problem was me all along 😅. I was measuring the elapsed time the wrong way. Many thanks for your time and help.