Closed tobowers closed 3 years ago
/cc @neelance
It works fine in Chrome and Firefox, so I think this needs a fix on Safari's end. Unfortunately Safari is not investing much in proper support for WebAssembly.
It works fine in Chrome and Firefox, so I think this needs a fix on Safari's end. Unfortunately Safari is not investing much in proper support for WebAssembly.
This seems like a reasonable theory, but it's also a call stack thing where the function is calling itself... so it's also possible that chrome/node are just covering up a bug too. I don't know the internals of the wasm compilation very well, but if I can be of help trying to track this down I will.
The most likely explanation is that we're hitting some implementation-specific limit and Safari has the lowest limit. Unfortunately the WebAssembly spec does not specify concrete limits:
Concrete limits are usually not fixed but may be dependent on specifics, interdependent, vary over time, or depend on other implementation- or embedder-specific situations or events.
Source: http://webassembly.github.io/spec/core/appendix/implementation.html?highlight=limits#execution
Like I said, Safari is in general not an engaged player in the WebAssembly ecosystem, which is why I am inclined to say that the incompatibility should be fixed by Safari, not Go.
I agree. Exactly what I said in slack. It is a combination of the regex and Safari implementation. The regex itself generates recursive function calls, so there isn't much the compiler can do. And Safari has this weird limit on call stack size.
FWIW: I tested the max call stack in WASM on chrome and safari
Safari is about 11,300 and chrome is 39,400
Over 11k function calls for a regex compile seems like a lot?
Using the following go code:
package main
import (
"fmt"
)
func recurse(i int) {
if (i % 100) == 0 {
fmt.Println(i)
}
recurse(i + 1)
}
func main() {
fmt.Println("hello from go")
recurse(0)
}
Yeah, ideally the regexp package wouldn't require 11k stack frames.
Specifically, at a7042249ab,
...
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a95e0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013e4f0 sp=0xc00013e288 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9650, 0x546000002a3)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013e758 sp=0xc00013e4f0 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a96c0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013e9c0 sp=0xc00013e758 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9730, 0x544000002a2)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013ec28 sp=0xc00013e9c0 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a97a0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013ee90 sp=0xc00013ec28 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9810, 0x542000002a1)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013f0f8 sp=0xc00013ee90 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9880, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013f360 sp=0xc00013f0f8 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a98f0, 0x540000002a0)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013f5c8 sp=0xc00013f360 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9960, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013f830 sp=0xc00013f5c8 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a99d0, 0x53e0000029f)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013fa98 sp=0xc00013f830 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9a40, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc00013fd00 sp=0xc00013fa98 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9ab0, 0x53c0000029e)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc00013ff68 sp=0xc00013fd00 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9b20, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc0001401d0 sp=0xc00013ff68 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9b90, 0x53a0000029d)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc000140438 sp=0xc0001401d0 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9c00, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc0001406a0 sp=0xc000140438 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9c70, 0x5380000029c)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc000140908 sp=0xc0001406a0 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9ce0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc000140b70 sp=0xc000140908 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9d50, 0x5360000029b)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc000140dd8 sp=0xc000140b70 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9dc0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc000141040 sp=0xc000140dd8 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9e30, 0x5340000029a)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc0001412a8 sp=0xc000141040 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9ea0, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc000141510 sp=0xc0001412a8 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9f10, 0x53200000299)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc000141778 sp=0xc000141510 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004a9f80, 0x2)
/home/bradfitz/go/src/regexp/syntax/compile.go:156 +0xb29 fp=0xc0001419e0 sp=0xc000141778 pc=0x45e3e9
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004aa000, 0x53000000298)
/home/bradfitz/go/src/regexp/syntax/compile.go:146 +0xdaf fp=0xc000141c48 sp=0xc0001419e0 pc=0x45e66f
regexp/syntax.(*compiler).compile(0xc00016dda8, 0xc0004aa070, 0x2)
...
Doesn't that imply this is more of a regex issue than a wasm issue ?
I agree, there should be a way to not require 11k stack frames. Seems like I was a bit too quick with saying that there is nothing to change on Go's side...
Is there anyone in particular we should ping who is most familiar with the regexp/syntax package?
paging @rsc as per owners.
Kinda both.
regexp/syntax could be more economical in its use of stack frames. (that'd be up to @rsc to fix/approve changes)
But maybe wasm could handle it better. It works fine on other architectures. Do we know specifically which limit Safari is unhappy about? Is it really call depth, or we hitting memory limits?
But maybe wasm could handle it better. It works fine on other architectures. Do we know specifically which limit Safari is unhappy about? Is it really call depth, or we hitting memory limits?
I mean it says: "maximum call stack exceeded" which I think is only recursion.
I used to get an out of memory error in mobile, but go 1.13 seems to have fixed that.
The biggest offender here are the recursive calls to compile
under the OpConcat
and OpAlternate
labels. I'm currently trying to see the impact of converting the function to use a stack in a loop (with minimal changes), but compile
's current behavior allows it to revisit a node, so a standard DFS won't do.
@smasher164, thanks!
FWIW - I found that it choked on any regex with a count in it... like: {1,256}
etc
@smasher164 what'd you find out there? This issue is blocking an app I'm working on from loading in Safari
@joeykrug I'm afraid I haven't had time to look at this recently. I did find that converting regexp's (RE2's) compile procedure from a recursive one to an iterative one would have to be more invasive than just using a stack-based DFS. Additional state needs to be tracked to control which subexpressions are processed, especially for the loops inside OpConcat
and OpAlternate
.
I may get back to this in the near term, but if someone else wants to work on it, feel free. Additionally, it may be worthwhile to first check whether leaf functions like inst
and patch
are being inlined, and if that sufficiently reduces the number of stack frames.
This issue appears to be resolved in the recent Safari Technology Preview (release 100).
I am wondering what was the solution from their side ? Because clearly there is something that can be improved in the compiler.
Judging by the patch notes in Bug 206436
This patch adds a simple stack slot allocator to Air O0 to make code
use smaller stack frames. The huge stack frames from the old stack
allocator were leading to stack overflows in some programs. Before,
each Tmp got its own stack slot. The new allocator works similar to O0's
register allocator. This stack allocator linearizes the program and uses live
range end as an opportunity to place the stack slot on a free list of
available stack slots. This patch also fixes an issue in our linearization code
where the head of a block and the tail of another block would share the
same linearization index. This didn't matter for register allocation, but
does matter for the stack allocator. So "live at head", and "live at tail"
now get their own linearization index.
It seems that webkit now reduces the size of a stack frame to allow for a greater stack depth. This addresses the stack depth limit described in Bug 201028
Max Call Stack Depth | Max Stack Size |OS | Browser | Device
11,000+ | 5242848 |iOS 12.3.1 | Safari | iPad Pro 10.5”
900+ | 5242848 |iOS 13.2 | Safari | iPad Pro 12.9”
900+ | 5242848 |iOS 13.2 | Chrome | iPad Pro 12.9”
17,800+ | 5242848 |macOS 10.15.1| Chrome | MacBook
16,700+ | 5242848 |macOS 10.15.1| Safari | MacBook
The issue seems to be resolved since Safari 13.2. What remains is a possible performance improvement of regexp/syntax. @bradfitz Do we want to relabel this issue to keep track of it or just close it?
@neelance I think this issue should be closed, since it was a WASM implementation issue. Resolving this on the Go side would require significant refactoring of the regexp package.
I'm going ahead and closing it, based on the lack of response. Feel free to reopen if necessary.
What version of Go are you using (
go version
)?go 1.13 in docker (compiled to wasm)
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Simple reproduction here: https://xena.greedo.xeserv.us/files/regex_wasm/ Simply including
var labelRegex = regexp.MustCompile(
^\s([[:ascii:]]{1,256}?)=("[[:ascii:]]{0,256}?")\s,)
in the main.go causes the go not to run in safari.Note, this does work in chrome and node (and I believe FireFox) and this is only an issue in the safari browser (both desktop and mobile).
What did you expect to see?
A compiled regex
What did you see instead?
Maximum call stack exceeded.