go-qml / qml

QML support for the Go language

wrong goroutine in newstack #43

Open tummychow opened 10 years ago

tummychow commented 10 years ago

Platform is an Arch Linux 32-bit virtual machine (OpenGL 3D hardware acceleration is on, and glxinfo confirms it is working). I ran the following commands to set up dependencies and test the installation:

sudo pacman -S qt5 pkg-config
go get -u github.com/niemeyer/qml
cd $GOPATH/src/github.com/niemeyer/qml/examples/particle
GOTRACEBACK=2 go run main.go &> log

Basically, install dependencies and run the particle example. Go 1.2 is installed. (Note to others: you may need to correct a typo in /usr/share/X11/locale/en_US.UTF8/Compose from "actute" to "acute"; otherwise you get a keysym error.) go get succeeds without incident, so it looks like all the headers are in place.

Then go run results in this log: https://gist.github.com/tummychow/9046195 Haven't seen anything like this in previous issues. Suggestions?

niemeyer commented 10 years ago

Can you provide a small example (or any example, if a small one isn't feasible) reproducing this? I'd be happy to have a look at it.

tummychow commented 10 years ago

Well when I get issues with examples, I'm inclined to assume it's a problem with my setup. I ran the other examples to see what happened, here's the damage report:

Guessing from this and the crash output, there's something going wrong with CreateWindow, so I started paring down the particle example. I was able to reproduce the issue even with this: https://gist.github.com/tummychow/9057559

I'm not really sure where to go from here as the issue may well be specific to my machine. Is there anything you want me to try that might shed more light on the problem? If the issue is with dependencies, I can easily try to install them in a different way.

niemeyer commented 10 years ago

Sorry, I missed the fact this was happening even with stock examples in my first read. Indeed, that's curious and unexpected.

Where did you get your Go 1.2 installation from? Would you mind trying the upstream binaries to see if that makes a difference?

tummychow commented 10 years ago

It came from the official Arch package. I'll rewind my VM, install from source, and see what happens.

tummychow commented 10 years ago

No dice: installing from source following these instructions and then testing the example again gave the same crash. Other than installing Go from source (to /usr/lib/go, which is what I think the official package does), the commands I used were the same as before. Might as well try the precompiled binaries too; will report back shortly.

edit: no luck there either.

niemeyer commented 10 years ago

Thanks for these tests. I'll investigate this further and let you know.

tummychow commented 10 years ago

I was bored and decided to do some more experimentation, got it working on a different virtual machine. Here are the details of the working installation:

Arch Linux 64-bit (perhaps the architecture made a difference? Personally I doubt it, but I have no evidence).

From a raw Arch install (basically the bare minimum to have a working machine), the following packages/package groups were added:

And in this environment, running the particle example worked. I have not tested yet, but I think it's a safe guess that the other examples work too.

Obviously the problem is with my main virtual machine, I'll see if I can find out why and report back.

tummychow commented 10 years ago

Hah. Well I was wrong.

I transitioned my entire virtual machine (the one that had exhibited the original error) to 64-bit architecture (VBoxManage modifyvm ARCH --longmode on) and then reinstalled all the packages from 64-bit repositories using these instructions. Then I installed qt5 again and ran go get. Running the examples this time worked without a hitch. I checked a few other examples and they worked too. Spinning gophers, particle effects, they all worked.

I'm not really clear on why this code is 64-bit dependent, but I hope that discovery helps your investigation. Let me know if there are any other tests you want me to try. For now, I'm just glad it works.

tummychow commented 10 years ago

One more thing: I made another fresh arch install on a 32-bit VM and ran the same commands as for the clean 64-bit VM, and was able to reproduce the wrong goroutine error on the examples, as with my original report. It appears that the error occurs on 32-bit Arch Linux VMs, but not 64-bit.

niemeyer commented 10 years ago

This sounds like an issue in the Go runtime, perhaps related to the interaction with cgo calling back into Go code. I'll see if I can find some time to reproduce the issue locally.

nieware commented 10 years ago

I am getting the "wrong goroutine in newstack" error message in my 32-bit Kubuntu VirtualBox VM too, and in my case it has something to do with OpenGL not working properly. The easiest solution I found for this was to enable software rendering (enter "export LIBGL_ALWAYS_SOFTWARE=1" to enable it for the current console session). It is slower than hardware-accelerated OpenGL and does not look as pretty, but at least the gophers are spinning ;)
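For anyone who prefers not to export the variable in every shell, the same workaround can be wrapped in a small launcher. This is a sketch only: the helper name withSoftwareGL and the launcher itself are hypothetical, and it assumes the target program is a separate binary (for example a built go-qml example) passed on the command line.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// withSoftwareGL (hypothetical helper) returns a copy of env with
// LIBGL_ALWAYS_SOFTWARE=1 appended. os/exec uses the last value when
// a key appears more than once, so appending overrides earlier entries.
func withSoftwareGL(env []string) []string {
	out := make([]string, 0, len(env)+1)
	out = append(out, env...)
	return append(out, "LIBGL_ALWAYS_SOFTWARE=1")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: softgl <program> [args...]")
		return
	}
	// Run the given program as a child with software rendering forced.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withSoftwareGL(os.Environ())
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The launcher does not link against go-qml; it only sets the environment for the child process.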

Edit: I just saw that you already ruled out OpenGL as a cause. Is it just glxinfo that says it's working or have you run a "real" OpenGL application successfully?

niemeyer commented 10 years ago

There is likely an interaction with OpenGL playing a role, but just because the Qt renderer runs on its own thread. In either case, the Go runtime should not blow up because two different threads are calling back into it, as I understand it.

niemeyer commented 10 years ago

@nieware @tummychow Can you try to reproduce the issue with Go from tip (1.3 dev)?

tummychow commented 10 years ago

Sure, I'll set up a 32-bit vm to test this. For what it's worth, I recall testing with both software rendering (virtualbox video drivers off) and hardware accelerated rendering (vbox video drivers on), but I'll try it out once I have a clean 32-bit environment to be certain.

tummychow commented 10 years ago

Alright so I picked up tip with gvm and compared it to 1.2.1. This is all on a 32-bit Arch Linux VM which I just opened up half an hour ago. I built particle with both versions, and both versions failed with the "wrong goroutine in newstack" error (whether with software rendering or with hardware rendering). So tip also has this issue.

I'll keep the VM around if there's anything else you'd like me to try.

tummychow commented 10 years ago

Oh and @nieware I was just using glxinfo to check if the virtualbox 3d video hardware accelerated driver was loaded properly. At the time that I first opened this issue, I was using a 32-bit Arch machine with the hardware accelerated video turned off, so when I first encountered the problem, I tried turning it on and then I ran glxinfo to make sure it was enabled properly.

To be quite honest I don't know if any of the programs I use regularly on this machine are "really" openGL. I do recall that I ran qmlscene on one of the example QML files and it worked (back on the 32-bit machine), but the example Go program of the same name did not. I'm assuming that the Qt qmlscene binary uses openGL for rendering, so I guess that counts as a "real" openGL application /shrug

niemeyer commented 10 years ago

Are you able to upload that machine image somewhere?

tummychow commented 10 years ago

It's 2.3GB so I'm not sure if there's anywhere convenient to put it. I can give you steps to reproduce the machine, given a clean arch linux installation (base base-devel grub virtualbox-guest-utils), or if you have somewhere I can upload it, I'll send it there.

EDIT: actually let me see if I can reproduce the issue with LXDE, that might make the image a more manageable size

EDIT: base base-devel groups are already 784MB in size, I doubt it can be shrunk any further than that and still be able to compile Go tip from source

EDIT: let's see if I can upload it to dropbox

niemeyer commented 10 years ago

If you use the following AWS credentials, you can upload any file to the bucket qml-issue-43 using your preferred S3 client:

redacted

Please let me know the SHA1 of the file you send there, to make sure I'll be looking at the same content you have.

tummychow commented 10 years ago

$ sha1sum querulous.vdi
c639d5eb50d164c6a7851c0b0c11fe14cb3e2cb5 *querulous.vdi

Upload to s3 should be done in 2-3 hours, assuming I did it right (terrible home internet with its terrible upload speeds, and I keep getting access-denied errors when I try to list buckets. I'm new to S3 so I guess this key doesn't have permissions for that. Uploading seems to work, but let me know if I made a stupid mistake).

It's a VDI disk image, standard format for virtualbox. The image should boot cleanly if you create a new VM with operating system "Arch Linux (32-bit)" and select the image as a storage device (3d acceleration is off by default, which should still reproduce the error).

Once the virtual console comes up, log in as sjung and startxfce4. Then you can open the standard xfce terminal and use gvm to pull up the installed versions of go:

$ gvm use tip # or 1.2.1
$ cd $GOPATH/src/gopkg.in/qml.v0/examples/particle # set by gvm
$ go build -o particle main.go
$ ./particle
# wrong goroutine error should occur

edit: welp my s3 client told me "access denied" right when the upload finished... I can't seem to list the contents of the bucket so I have no clue if it worked or not. Let me know if the upload failed.

niemeyer commented 10 years ago

Thank you very much. I'll have a look at it.

zmb3 commented 10 years ago

I'm running into the same issue with Go 1.2.2 installed from the binary distributions on golang.org. 32 bit Linux Mint Debian. Software rendering works (as mentioned).

imheresamir commented 9 years ago

I have the same issue on the ARM platform, Go 1.3.3. @niemeyer did you manage to figure out why this is happening? There is no problem on x64. Please see this traceback: https://gist.github.com/imheresamir/7b10a9b50ff080a2f907

niemeyer commented 9 years ago

I still haven't managed to reproduce this issue with my own setup, and haven't yet managed to go through the provided image, which encapsulates a setup that should reproduce it (sorry). That said, quite a few things have been changed and improved in that area in Go 1.4, including better error reporting in some cases. Can you please try the latest 1.4 release candidate and let me know how it goes?

imheresamir commented 9 years ago

I compiled and built go1.4 for arm6h (raspberrypi), and under this new release the traceback doesn't show anymore on crash; rather, the program just segfaults and dies.

What I'm trying to do is very simple, please see https://gist.github.com/imheresamir/3be081b17e0366fdab73

It crashes when attempting a syscall (in this case originating from an os/exec call). This seems to happen when the goroutine's stack needs to be expanded with morestack() (I think).

If the goroutine (and therefore the syscall) is removed, the qml app works as expected. If go-qml is no longer invoked, then the os/exec call works as expected.
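For reference, the control case (an os/exec call made from a goroutine, with no go-qml involved) can be sketched roughly as below. This is not the code from the gist; the names result and runAsync are hypothetical, and it assumes an echo binary is on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

// result carries the combined output and error of one command run.
type result struct {
	out []byte
	err error
}

// runAsync (hypothetical helper) runs the command in its own goroutine
// and delivers the outcome on a buffered channel, so the goroutine can
// finish even if nobody reads the result.
func runAsync(name string, args ...string) <-chan result {
	ch := make(chan result, 1)
	go func() {
		out, err := exec.Command(name, args...).CombinedOutput()
		ch <- result{out: out, err: err}
	}()
	return ch
}

func main() {
	r := <-runAsync("echo", "hello from a goroutine")
	if r.err != nil {
		fmt.Println("exec failed:", r.err)
		return
	}
	fmt.Printf("%s", r.out)
}
```

If a program like this runs cleanly on the affected machine while the same goroutine crashes once a go-qml window is also created, that would be consistent with the observation above that the two only fail in combination.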

I recall from https://groups.google.com/forum/#!topic/golang-dev/7my0GY5yXUU that Russ mentioned something about go-qml modifying the go stack in a certain way.

I am trying to decipher the cdata files in go-qml. I'm not sure where to begin but I'm determined to dig out a solution.

I am going to see if debugging with gdb can offer up any clues.

Perhaps you could assist me so we can increase the robustness of this cool library, especially on ARM and other platforms!