openai / universe

Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
https://universe.openai.com
MIT License

unable to accomplish "Run your first agent" #147

Closed plato360 closed 6 years ago

plato360 commented 7 years ago

(First, please check https://github.com/openai/universe/wiki/Solutions-to-common-problems for solutions to many common problems)

Expected behavior

expected to run sample code to verify install

Actual behavior

I receive fatal error: unexpected signal during runtime execution [signal SIGSEGV: segmentation violation code=0x1 addr=0x110 pc=0x7f92780118f4] just after connecting to the VNC server. The Docker container seems to start successfully and stays running after the Python script crashes. I can start the container on its own, but I am unable to link the script to it. I can access the VNC viewer via a browser and see the list of flashgames; however, when I click on one I get a page with text saying that this is being played by an AI, but the game itself doesn't show.

I was originally using Go 1.6 when I hit this problem. An earlier issue that looked similar suggested upgrading to 1.7, and another issue on the Go GitHub suggested a fix was in 1.8. (I re-installed universe and the VNC driver each time I upgraded.)

Since I wasn't able to view the actual flashgames in the VNC viewer, I followed the steps in #57 and was able to connect to github.com while downloading to LFS. Running:

docker run --entrypoint bash -ti --rm --privileged --ipc host --cap-add SYS_ADMIN quay.io/openai/universe.flashgames:0.20.15 -c "/usr/local/bin/sudoable-env-setup git-lfs flashgames.DuskDrive-v0; iptables --list"

my output looked identical to the expected output, except that it lacks all the UDP rules.

I've been at this for 2 days now and don't know what else to try.

Versions

Please include the result of running

$ uname -a ; python --version; pip show universe gym tensorflow numpy go-vncdriver Pillow

Linux plato-desktop 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Python 3.5.2

Name: universe
Version: 0.21.2
Summary: Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
Home-page: https://github.com/openai/universe
Author: OpenAI
Author-email: universe@openai.com
License: UNKNOWN
Location: /home/plato/Documents/universe
Requires: autobahn, docker-py, docker-pycreds, fastzbarlight, go-vncdriver, gym, Pillow, PyYAML, six, twisted, ujson

Name: gym
Version: 0.7.3
Summary: The OpenAI Gym: A toolkit for developing and comparing your reinforcement learning agents.
Home-page: https://github.com/openai/gym
Author: OpenAI
Author-email: gym@openai.com
License: UNKNOWN
Location: /home/plato/.local/lib/python3.5/site-packages
Requires: pyglet, numpy, six, requests

Name: tensorflow
Version: 0.12.1
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /home/plato/.local/lib/python3.5/site-packages
Requires: six, wheel, protobuf, numpy

Name: numpy
Version: 1.12.0
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Location: /home/plato/.local/lib/python3.5/site-packages
Requires:

Name: go-vncdriver
Version: 0.4.19
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /home/plato/Documents/go-vncdriver
Requires: numpy

Name: Pillow
Version: 4.0.0
Summary: Python Imaging Library (Fork)
Home-page: http://python-pillow.org
Author: Alex Clark (Fork Author)
Author-email: aclark@aclark.net
License: Standard PIL License
Location: /home/plato/.local/lib/python3.5/site-packages
Requires: olefile

go version
go version go1.8rc3 linux/amd64

go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/plato/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build385760275=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

terminal_output.txt universe-30888.txt

tlbtlbtlb commented 7 years ago

I haven't seen this before. It may be that the go-glfw driver is getting an OpenGL configuration from the X server that it doesn't understand. From the same environment that you run universe in, can you run glxgears and does it show some spinning gears? What does glxinfo report?
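For anyone following along, here is a minimal sketch of how the glxinfo check could be scripted from the same Python environment that runs universe. It assumes glxinfo is on the PATH (on Ubuntu it ships in the mesa-utils package); the helper names parse_glxinfo and report_opengl are made up for this example and are not part of universe or go-vncdriver.

```python
import shutil
import subprocess


def parse_glxinfo(text):
    """Pull the 'OpenGL ... string:' lines out of glxinfo output."""
    wanted = ("OpenGL vendor string", "OpenGL renderer string", "OpenGL version string")
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            info[key.strip()] = value.strip()
    return info


def report_opengl():
    """Run glxinfo if it is installed and return the parsed fields."""
    if shutil.which("glxinfo") is None:
        return {}  # glxinfo is not installed
    result = subprocess.run(["glxinfo"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_glxinfo(result.stdout)
```

Calling report_opengl() returns a dict with the vendor, renderer and version strings when glxinfo can reach the X server, and an empty dict otherwise. The vendor and renderer lines are the ones that show which GPU driver the OpenGL stack is actually using.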

Relevant part of terminal_output included below for searchability:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x110 pc=0x7f9e5c1f18f4]

runtime stack:
runtime.throw(0x7f9e99f94ba8, 0x2a)
    /usr/local/go/src/runtime/panic.go:596 +0x97
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:274 +0x2df

goroutine 17 [syscall, locked to thread]:
runtime.cgocall(0x7f9e99f7bde9, 0xc420040cd8, 0xc42000e080)
    /usr/local/go/src/runtime/cgocall.go:131 +0xe8 fp=0xc420040ca8 sp=0xc420040c68
github.com/openai/go-vncdriver/vendor/github.com/go-gl/glfw/v3.2/glfw._Cfunc_glfwInit(0xc400000000)
    github.com/openai/go-vncdriver/vendor/github.com/go-gl/glfw/v3.2/glfw/_obj/_cgo_gotypes.go:1138 +0x4b fp=0xc420040cd8 sp=0xc420040ca8
github.com/openai/go-vncdriver/vendor/github.com/go-gl/glfw/v3.2/glfw.Init(0xc420040d60, 0x7f9e99d94480)
    /home/plato/Documents/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vendor/github.com/go-gl/glfw/v3.2/glfw/glfw.go:32 +0x24 fp=0xc420040d18 sp=0xc420040cd8
github.com/openai/go-vncdriver/vncgl.SetupRendering.func1()
    /home/plato/Documents/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncgl/vncgl.go:39 +0x28 fp=0xc420040d70 sp=0xc420040d18
sync.(*Once).Do(0x7f9e9a3e8d18, 0x7f9e9a2f07d8)
    /usr/local/go/src/sync/once.go:44 +0xc0 fp=0xc420040da8 sp=0xc420040d70
github.com/openai/go-vncdriver/vncgl.SetupRendering()
    /home/plato/Documents/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncgl/vncgl.go:46 +0x3b fp=0xc420040dc8 sp=0xc420040da8
main.(*sessionInfo).initRenderer(0xc4200d0100, 0xc4203404e8, 0x1, 0xc4200e4048, 0x7f9e00000001)
    /home/plato/Documents/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gl.go:16 +0x55 fp=0xc420040e10 sp=0xc420040dc8
main.GoVNCDriver_VNCSession_render(0x7f9eb3ec42c0, 0x7f9eb3db69e8, 0x0, 0x0)
    /home/plato/Documents/go-vncdriver/.build/src/github.com/openai/go-vncdriver/main.go:407 +0x162 fp=0xc420040e80 sp=0xc420040e10
main._cgoexpwrap_f4a5df30d895_GoVNCDriver_VNCSession_render(0x7f9eb3ec42c0, 0x7f9eb3db69e8, 0x0, 0x0)
    github.com/openai/go-vncdriver/_obj/_cgo_gotypes.go:626 +0x74 fp=0xc420040eb0 sp=0xc420040e80
runtime.call32(0x0, 0x7ffc3bac69c8, 0x7ffc3bac6a90, 0x20)
    /usr/local/go/src/runtime/asm_amd64.s:514 +0x4a fp=0xc420040ee0 sp=0xc420040eb0
runtime.cgocallbackg1(0x0)
    /usr/local/go/src/runtime/cgocall.go:301 +0x1a1 fp=0xc420040f58 sp=0xc420040ee0
runtime.cgocallbackg(0x0)
    /usr/local/go/src/runtime/cgocall.go:184 +0x86 fp=0xc420040fc0 sp=0xc420040f58
runtime.cgocallback_gofunc(0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/runtime/asm_amd64.s:767 +0x71 fp=0xc420040fe0 sp=0xc420040fc0
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420040fe8 sp=0xc420040fe0
plato360 commented 7 years ago

glxgears works. I did manage to find the problem, though. Looking at glxinfo reminded me that I had to change drivers to support my AMD RX 480 GPU. It requires amdgpu-pro instead of the standard amdgpu. With amdgpu-pro uninstalled I was able to (mostly) run the demo program. I guess I will have to either wait until it's supported or find a different system to play with universe on.

I said "mostly" because once the VNC started up with DuskDrive the mouse pointer just bounced back and forth from the play button to the top left corner of the game window, it never actually started playing. Eventually it would crash after about 10 seconds seemingly due to a buildup of requests. I haven't looked into this any more or checked if this may be a repeat of another issue. I'll probably try to set up on another machine and will post again if I have the issue there as well.

plato360 commented 7 years ago

On further consideration, perhaps I closed the issue too soon. Would you have any ideas for getting it to work with amdgpu-pro? I'm not too familiar with the inner workings of drivers but some ideas I've had are:

I also have access to a headless Ubuntu server but I have a feeling if I try on that it will complain about the lack of a display (desktop isn't installed)
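One thing that might be worth trying, sketched below: in the stack trace earlier in this thread the crash is reached through GoVNCDriver_VNCSession_render, which calls glfw.Init, so an agent loop that never calls env.render() may avoid opening a local OpenGL window at all. This is an untested assumption, not a confirmed fix for amdgpu-pro or for headless machines. The loop follows the sample from the universe README; build_actions, run_agent and the render and max_steps parameters are names invented for this example.

```python
def build_actions(observation_n):
    """One 'hold ArrowUp' action list per remote, as in the README sample."""
    return [[("KeyEvent", "ArrowUp", True)] for _ in observation_n]


def run_agent(render=False, max_steps=None):
    # Imported lazily so the helper above can be used without universe installed.
    import gym
    import universe  # noqa: F401  (registers the universe environments)

    env = gym.make("flashgames.DuskDrive-v0")
    env.configure(remotes=1)  # starts a local Docker container
    observation_n = env.reset()
    step = 0
    while max_steps is None or step < max_steps:
        observation_n, reward_n, done_n, info = env.step(build_actions(observation_n))
        if render:
            env.render()  # the call that leads to glfw.Init in the trace
        step += 1
```

With run_agent(render=False) the game can still be watched through the in-browser VNC viewer served by the container, since that view does not depend on the local renderer.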

mananSingh commented 7 years ago

Fatal error while running the first program.

While running the first agent (from the "Getting Started" doc), the tigervnc window opens and crashes.

go-vnc driver issue? "invalid request while reading client request line"? Tried go versions 1.6 as well as 1.7, but in vain.

universe-wz3cu7-0 | [2017-02-12 10:05:39,606] [INFO:universe.envs.vnc_env] [0] Connecting to environment: vnc://127.0.0.1:5900 password=openai. If desired, you can manually connect a VNC viewer, such as TurboVNC. Most environments provide a convenient in-browser VNC client: http://None/viewer/?password=openai
universe-wz3cu7-0 | [2017-02-12 10:05:39,606] [INFO:universe.extra.universe.envs.vnc_env] [0] Connecting to environment details: vnc_address=127.0.0.1:5900 vnc_password=openai rewarder_address=None rewarder_password=openai
universe-wz3cu7-0 | [2017-02-12 10:05:39,607] [INFO:root] [EnvStatus] Changing env_state: None (env_id=None) -> resetting (env_id=None) (episode_id: 0->1, fps=60)
universe-wz3cu7-0 | [2017-02-12 10:05:39,607] [INFO:root] [MainThread] Env state: env_id=None episode_id=1
universe-wz3cu7-0 | 2017/02/12 10:05:39 I0212 10:05:39.607689 56 gymvnc.go:417] [0:127.0.0.1:5900] opening connection to VNC server
universe-wz3cu7-0 | [tigervnc] 
universe-wz3cu7-0 | [tigervnc] Sun Feb 12 10:05:39 2017
universe-wz3cu7-0 | [tigervnc]  Connections: accepted: 127.0.0.1::49891
universe-wz3cu7-0 | [tigervnc]  SConnection: Client needs protocol version 3.8
universe-wz3cu7-0 | [tigervnc]  SConnection: Client requests security type VncAuth(2)
universe-wz3cu7-0 | [tigervnc]  VNCSConnST:  Server default pixel format depth 24 (32bpp) little-endian rgb888
universe-wz3cu7-0 | [tigervnc]  VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian bgr888
universe-wz3cu7-0 | 2017/02/12 10:05:39 I0212 10:05:39.672782 56 gymvnc.go:550] [0:127.0.0.1:5900] connection established
universe-wz3cu7-0 | [Sun Feb 12 10:05:39 UTC 2017] [/usr/local/bin/sudoable-env-setup] Disabling outbound network traffic for none
universe-wz3cu7-0 | [2017-02-12 10:05:39,803] [INFO:gym_flashgames.launcher] [MainThread] Launching new Chrome process (attempt 0/10)
universe-wz3cu7-0 | [2017-02-12 10:05:39,803] [INFO:root] Replacing selenium_wrapper_server since we currently do it at every episode boundary
universe-wz3cu7-0 | [2017-02-12 10:05:40,313] [selenium_wrapper_server] Calling webdriver.Chrome()
universe-wz3cu7-0 | [2017-02-12 10:05:41,712] [INFO:universe.rewarder.remote] Client connecting: peer=tcp4:127.0.0.1:39500 observer=False
universe-wz3cu7-0 | [2017-02-12 10:05:41,713] [INFO:universe.rewarder.remote] WebSocket connection established
universe-wz3cu7-0 | [nginx] 2017/02/12 10:05:44 [info] 64#64: *9 client sent invalid request while reading client request line, client: 127.0.0.1, server: , request: "CONNECT www.google.com:443 HTTP/1.1"
universe-wz3cu7-0 | [nginx] 2017/02/12 10:05:44 [info] 64#64: *10 client sent invalid request while reading client request line, client: 127.0.0.1, server: , request: "CONNECT www.google.com:443 HTTP/1.1"

**fatal error: unexpected signal during runtime execution**
[signal 0xb code=0x1 addr=0x2c pc=0x7f6cd311cd1c]

runtime stack:
runtime.throw(0x7f6cd406d740, 0x2a)
    /usr/lib/go/src/runtime/panic.go:530 +0x92
runtime.sigpanic()
    /usr/lib/go/src/runtime/sigpanic_unix.go:12 +0x5e
runtime.reimburseSweepCredit(0x7f6c84000cf0)
    /usr/lib/go/src/runtime/mgcsweep.go:399 +0x52

goroutine 19 [syscall, locked to thread]:
runtime.cgocall(0x7f6cd38e23d9, 0xc82004b440, 0xc800000000)
    /usr/lib/go/src/runtime/cgocall.go:123 +0x121 fp=0xc82004b418 sp=0xc82004b3e8
github.com/openai/go-vncdriver/vendor/github.com/pixiv/go-libjpeg/jpeg._Cfunc_new_decompress(0x0)
    ??:0 +0x49 fp=0xc82004b440 sp=0xc82004b418
github.com/openai/go-vncdriver/vendor/github.com/pixiv/go-libjpeg/jpeg.DecodeIntoRGB(0x7f6cbc6198e0, 0xc8200bdc20, 0xc82004b618, 0x0, 0x0, 0x0)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vendor/github.com/pixiv/go-libjpeg/jpeg/decompress.go:256 +0x93 fp=0xc82004b560 sp=0xc82004b440
github.com/openai/go-vncdriver/vncclient.(*TightEncoding).Read(0xc82058a000, 0xc8200e2000, 0xc820084000, 0x7f6ccc01a168, 0xc8200c4018, 0x0, 0x0, 0x0, 0x0)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncclient/encoding.go:659 +0xb89 fp=0xc82004b860 sp=0xc82004b560
github.com/openai/go-vncdriver/vncclient.(*FramebufferUpdateMessage).Read(0xc820570000, 0xc8200e2000, 0x7f6ccc01a168, 0xc8200c4018, 0x0, 0x0, 0x0, 0x0)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncclient/server_messages.go:99 +0xa73 fp=0xc82004bca8 sp=0xc82004b860
github.com/openai/go-vncdriver/vncclient.(*ClientConn).mainLoop(0xc8200e2000)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncclient/client.go:579 +0x968 fp=0xc82004bf98 sp=0xc82004bca8
runtime.goexit()
    /usr/lib/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82004bfa0 sp=0xc82004bf98
created by github.com/openai/go-vncdriver/vncclient.Client
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/vncclient/client.go:115 +0x191

goroutine 5 [select]:
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).connect(0xc820084140, 0xc820060480, 0x0, 0x0)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:566 +0x1d24
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).start.func1(0xc820084140, 0xc820060480)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:149 +0x2d
created by github.com/openai/go-vncdriver/gymvnc.(*VNCSession).start
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:157 +0x8a

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/lib/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 18 [syscall, locked to thread]:
runtime.goexit()
    /usr/lib/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 6 [chan receive]:
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).start.func2(0xc820084140)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:160 +0x51
created by github.com/openai/go-vncdriver/gymvnc.(*VNCSession).start
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:165 +0xac

goroutine 20 [select]:
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).connect.func2(0xc8200c2060, 0xc820084140)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:472 +0x288
created by github.com/openai/go-vncdriver/gymvnc.(*VNCSession).connect
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:477 +0xac3

goroutine 10 [select]:
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).maintainFrameBuffer(0xc820084140, 0xc820060480, 0x0, 0x0)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:359 +0x75d
github.com/openai/go-vncdriver/gymvnc.(*VNCSession).connect.func3(0xc820084140, 0xc820060480)
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:555 +0x2d
created by github.com/openai/go-vncdriver/gymvnc.(*VNCSession).connect
    /home/manan/data-science-machine-learning-code/openai/go-vncdriver/.build/src/github.com/openai/go-vncdriver/gymvnc/gymvnc.go:563 +0x172f


tlbtlbtlb commented 7 years ago

This is normal:

universe-wz3cu7-0 | [nginx] 2017/02/12 10:05:44 [info] 64#64: *9 client sent invalid request while reading client request line, client: 127.0.0.1, server: , request: "CONNECT www.google.com:443 HTTP/1.1"

I've seen crashes (but only rarely, every 10^2 hours of running) in Go's garbage collector with both 1.6 and 1.7.4 when calling CGO functions. Upgrading to 1.8 seemed to fix it. Or possibly, there is something wrong with the go-libjpeg library. Running it under a debugger is the best way to track it down, by running gdb python ...
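The two crashes reported in this thread die in different C calls: the first in glfwInit (rendering setup), the second in new_decompress from go-libjpeg, with runtime.reimburseSweepCredit on the runtime stack. A small sketch for telling such traces apart is below. It relies on the _Cfunc_ prefix that cgo gives its generated wrappers, which is visible in both traces; the function name first_cgo_call is made up for this example.

```python
import re

# cgo wrappers generated by the Go toolchain are named _Cfunc_<c function>.
CGO_FRAME = re.compile(r"\._Cfunc_(\w+)\(")


def first_cgo_call(trace):
    """Return the first C function named in a Go stack trace, or None."""
    match = CGO_FRAME.search(trace)
    return match.group(1) if match else None
```

Feeding it the saved terminal output (for example the contents of an attached log file) gives a quick hint as to whether a report looks like the OpenGL driver problem or the libjpeg / garbage collector one before reaching for gdb.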