OpenFactorioServerManager / factorio-server-manager

A tool to help manage Factorio multiplayer servers including mods and save games.
MIT License

[Ubuntu 20.10] FSM crashes when accessing mods page #261

Closed gaultx closed 3 years ago

gaultx commented 3 years ago

Running on Ubuntu 20.10. Navigating to other pages is fine, but as soon as I click on the mods page, the app crashes with this output on the console.

root@vultr:~# cd /opt/factorio-server-manager/
root@vultr:/opt/factorio-server-manager# ./factorio-server-manager --dir /opt/factorio --port 8080
/opt/factorio-server-manager/mod_packs
2021/02/18 23:52:49 Loaded Factorio settings from /opt/factorio/config/server-settings.json
2021/02/18 23:52:49 Starting server on: 0.0.0.0:8080
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x15 pc=0x7fbce46e80c4]

runtime stack:
runtime.throw(0xbe52ae, 0x2a)
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/signal_unix.go:701 +0x46a

goroutine 20 [syscall]:
runtime.cgocall(0x92f840, 0xc00003edc0, 0xc0000118b0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/cgocall.go:133 +0x5b fp=0xc00003ed90 sp=0xc00003ed58 pc=0x40b09b
net._C2func_getaddrinfo(0xc000211e00, 0x0, 0xc000273920, 0xc0000118b0, 0x0, 0x0, 0x0)
        _cgo_gotypes.go:92 +0x55 fp=0xc00003edc0 sp=0xc00003ed90 pc=0x5daed5
net.cgoLookupIPCNAME.func1(0xc000211e00, 0x12, 0x12, 0xc000273920, 0xc0000118b0, 0x0, 0xc00003eea0, 0x5de2b2)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/cgo_unix.go:161 +0xd2 fp=0xc00003ee08 sp=0xc00003edc0 pc=0x5e0932
net.cgoLookupIPCNAME(0xbceb5c, 0x3, 0xc000211d00, 0x11, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/cgo_unix.go:161 +0x183 fp=0xc00003ef18 sp=0xc00003ee08 pc=0x5dc363
net.cgoIPLookup(0xc00027cba0, 0xbceb5c, 0x3, 0xc000211d00, 0x11)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/cgo_unix.go:218 +0x67 fp=0xc00003efb8 sp=0xc00003ef18 pc=0x5dca97
runtime.goexit()
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc00003efc0 sp=0xc00003efb8 pc=0x46ab61
created by net.cgoLookupIP
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/cgo_unix.go:228 +0xc7

goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7fbce47b5f38, 0x72, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0x7, 0x72, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*FD).Accept(0xc0000d9980, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/poll/fd_unix.go:377 +0xee
net.(*netFD).accept(0xc0000d9980, 0x20b4d51306f4ad22, 0x0, 0x20b4d51306f4ad22)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc0001eff60, 0x602efdd5, 0xc0000e1a10, 0x4cd296)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc0001eff60, 0xc0000e1a60, 0x18, 0xc000000300, 0x6ec9dc)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/tcpsock.go:261 +0x64
net/http.(*Server).Serve(0xc0000cc2a0, 0xcb2be0, 0xc0001eff60, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2930 +0x25d
net/http.(*Server).ListenAndServe(0xc0000cc2a0, 0xc0000cc2a0, 0x7)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2859 +0xb7
net/http.ListenAndServe(...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:3115
main.main()
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/main.go:34 +0x2ab

goroutine 6 [select]:
github.com/OpenFactorioServerManager/factorio-server-manager/api/websocket.(*wsHub).run(0xc000192230)
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/api/websocket/wshub.go:117 +0x18c
created by github.com/OpenFactorioServerManager/factorio-server-manager/api/websocket.init.0
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/api/websocket/wshub.go:102 +0x1af

goroutine 9 [select]:
database/sql.(*DB).connectionOpener(0xc0000fe480, 0xcb4d20, 0xc000145c80)
        /opt/hostedtoolcache/go/1.14.14/x64/src/database/sql/sql.go:1071 +0xe8
created by database/sql.OpenDB
        /opt/hostedtoolcache/go/1.14.14/x64/src/database/sql/sql.go:742 +0x12a

goroutine 16 [IO wait]:
internal/poll.runtime_pollWait(0x7fbce47b5c98, 0x72, 0xffffffffffffffff)
        /opt/hostedtoolcache/go/1.14.14/x64/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc000268718, 0x72, 0x0, 0x1, 0xffffffffffffffff)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000268700, 0xc000237cc1, 0x1, 0x1, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/poll/fd_unix.go:169 +0x19b
net.(*netFD).Read(0xc000268700, 0xc000237cc1, 0x1, 0x1, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc0000114e8, 0xc000237cc1, 0x1, 0x1, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/net.go:184 +0x8e
net/http.(*connReader).backgroundRead(0xc000237cb0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:689 +0x58
created by net/http.(*connReader).startBackgroundRead
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:685 +0xd0

goroutine 12 [select]:
net/http.(*Transport).getConn(0x128d0e0, 0xc00027e6c0, 0x0, 0xbe78b6, 0x5, 0xc000211d00, 0x15, 0x0, 0x0, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:1350 +0x585
net/http.(*Transport).roundTrip(0x128d0e0, 0xc00018f200, 0xc0000cbb60, 0xc00027b458, 0x414968)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:569 +0x76b
net/http.(*Transport).RoundTrip(0x128d0e0, 0xc00018f200, 0x128d0e0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/roundtrip.go:17 +0x35
net/http.send(0xc00018f200, 0xca7f60, 0x128d0e0, 0x0, 0x0, 0x0, 0xc000011880, 0x203000, 0x1, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/client.go:252 +0x43e
net/http.(*Client).send(0x12d5000, 0xc00018f200, 0x0, 0x0, 0x0, 0xc000011880, 0x0, 0x1, 0xc00018f200)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/client.go:176 +0xfa
net/http.(*Client).do(0x12d5000, 0xc00018f200, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/client.go:699 +0x44a
net/http.(*Client).Do(...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/client.go:567
github.com/OpenFactorioServerManager/factorio-server-manager/factorio.ModPortalList(0x0, 0x0, 0x0, 0x0, 0x0)
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/factorio/mod_portal.go:38 +0xd7
github.com/OpenFactorioServerManager/factorio-server-manager/api.ModPortalListModsHandler(0xcb2ea0, 0xc0000cc700, 0xc00018f100)
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/api/mod_portal_handler.go:23 +0x15b
net/http.HandlerFunc.ServeHTTP(0xbf68f0, 0xcb2ea0, 0xc0000cc700, 0xc00018f100)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2041 +0x44
github.com/OpenFactorioServerManager/factorio-server-manager/api.AuthMiddleware.func1(0xcb2ea0, 0xc0000cc700, 0xc00018f100)
        /home/runner/work/factorio-server-manager/factorio-server-manager/src/api/auth.go:229 +0x286
net/http.HandlerFunc.ServeHTTP(0xc000263260, 0xcb2ea0, 0xc0000cc700, 0xc00018f100)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2041 +0x44
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0000fe540, 0xcb2ea0, 0xc0000cc700, 0xc00018e900)
        /home/runner/go/pkg/mod/github.com/gorilla/mux@v1.7.3/mux.go:212 +0xe2
net/http.serverHandler.ServeHTTP(0xc0000cc2a0, 0xcb2ea0, 0xc0000cc700, 0xc00018e900)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2836 +0xa3
net/http.(*conn).serve(0xc0002437c0, 0xcb4d20, 0xc0001cbc00)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:1924 +0x86c
created by net/http.(*Server).Serve
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/server.go:2962 +0x35c

goroutine 18 [select]:
net.(*Resolver).lookupIPAddr(0x12d4860, 0xcb4da0, 0xc00027c960, 0xbceb5c, 0x3, 0xc000211d00, 0x11, 0x1bb, 0x0, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/lookup.go:274 +0x664
net.(*Resolver).internetAddrList(0x12d4860, 0xcb4da0, 0xc00027c960, 0xbceb5c, 0x3, 0xc000211d00, 0x15, 0x0, 0x0, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/ipsock.go:280 +0x4da
net.(*Resolver).resolveAddrList(0x12d4860, 0xcb4da0, 0xc00027c960, 0xbcede7, 0x4, 0xbceb5c, 0x3, 0xc000211d00, 0x15, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/dial.go:222 +0x49e
net.(*Dialer).DialContext(0xc000066240, 0xcb4d60, 0xc000022098, 0xbceb5c, 0x3, 0xc000211d00, 0x15, 0x0, 0x0, 0x0, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/dial.go:404 +0x22a
net/http.(*Transport).dial(0x128d0e0, 0xcb4d60, 0xc000022098, 0xbceb5c, 0x3, 0xc000211d00, 0x15, 0x0, 0xc000237e30, 0xca8f60, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:1144 +0x1f5
net/http.(*Transport).dialConn(0x128d0e0, 0xcb4d60, 0xc000022098, 0x0, 0xbe78b6, 0x5, 0xc000211d00, 0x15, 0x0, 0xc0001c1320, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:1578 +0x19ee
net/http.(*Transport).dialConnFor(0x128d0e0, 0xc00009f6b0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:1424 +0xc6
created by net/http.(*Transport).queueForDial
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/http/transport.go:1393 +0x3fe

goroutine 19 [select]:
net.cgoLookupIP(0xcb4d20, 0xc00027e740, 0xbceb5c, 0x3, 0xc000211d00, 0x11, 0x7fbce6123600, 0x20300000000000, 0x7fbce630ffff, 0x400, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/cgo_unix.go:229 +0x195
net.(*Resolver).lookupIP(0x12d4860, 0xcb4d20, 0xc00027e740, 0xbceb5c, 0x3, 0xc000211d00, 0x11, 0xc00003ee28, 0x4d73a2, 0xc00027e350, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/lookup_unix.go:96 +0x187
net.glob..func1(0xcb4d20, 0xc00027e740, 0xc000235720, 0xbceb5c, 0x3, 0xc000211d00, 0x11, 0xc000022098, 0x0, 0xbe78b6, ...)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/hook.go:23 +0x72
net.(*Resolver).lookupIPAddr.func1(0x0, 0x0, 0x0, 0x0)
        /opt/hostedtoolcache/go/1.14.14/x64/src/net/lookup.go:268 +0xb9
internal/singleflight.(*Group).doCall(0x12d4870, 0xc00026c370, 0xc000211d20, 0x15, 0xc00027e780)
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/singleflight/singleflight.go:95 +0x2e
created by internal/singleflight.(*Group).DoChan
        /opt/hostedtoolcache/go/1.14.14/x64/src/internal/singleflight/singleflight.go:88 +0x2bc

knoxfighter commented 3 years ago

@gaultx Is that reproducible?

If yes, we have a problem, and it will take a while to find and fix it. The error [signal SIGSEGV: segmentation violation code=0x1 addr=0x15 pc=0x7fbce46e80c4] should not be possible in Go. It can only be generated in the cgo code used by sqlite3.

gaultx commented 3 years ago

It may have to do with this mod: https://mods.factorio.com/mod/Todo-List

The only thing I can think of is that the display name of the mod in the UI has a unicode/emoji in it. Try uploading the mod and see what happens.

EDIT: Nope, failing with other mods too. But still consider using that mod as a test case.

knoxfighter commented 3 years ago

Reproducible. This error does not occur on Ubuntu 18.04 LTS.

knoxfighter commented 3 years ago

Update: Using an executable compiled on Ubuntu 20.10 fixes this. My theory: the glibc version installed on Ubuntu 20.10 (2.32) is not compatible with the one the release was compiled against (Ubuntu 18.04, glibc 2.27). Further testing has to be done.

Omaha2002 commented 3 years ago

Can confirm on 20.04 with the latest docker version of OFSM. On another machine with 20.04 (identical setup) I don't have this crash. I installed and configured this machine a few months ago and it doesn't have the sqlite.db yet, only the auth.leveldb/ dir.

For me, it happens when I fill in the credentials on the mods page. I get a:

image

which made me think my credentials were wrong but they do work on the other server.

When I fill in the token from the profile page on the Factorio site in the server settings, go to the mods page and do a refresh, the page crashes: 404 page not found.

Base factorio server when started is working fine.

knoxfighter commented 3 years ago

Seems like our executable is using a function that got removed from glibc in version 2.32: https://abi-laboratory.pro/index.php?view=compat_report&l=glibc&v1=2.31&v2=2.32&obj=04aa0&kind=abi#Removed

pthread_sigmask ( int how, sigset_t const* newmask, sigset_t* oldmask ) @@ GLIBC_2.2.5

objdump -T /home/knox/Desktop/Programme/GO/factorio-server-manager/manager/factorio_server_manager_1_1 | grep pthread
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_create
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_sigmask

I have no idea how to handle such a case; I never knew that things get removed from glibc :( So if anybody knows how to work with this, tell me :)
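
For anyone who wants to repeat the objdump check above without binutils, here is a minimal Go sketch using the standard library's debug/elf package. The type versionedImport and the functions filterByName and readImports are hypothetical names made up for this example, not part of factorio-server-manager. Unlike objdump -T, ImportedSymbols only reports undefined (imported) symbols, so a defined symbol such as _cgo_try_pthread_create will not show up.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// versionedImport is a hypothetical helper type: one imported dynamic symbol
// plus the version tag the binary requests for it (for example GLIBC_2.2.5).
type versionedImport struct {
	Name    string
	Version string
}

// filterByName keeps the imports whose name contains substr,
// mirroring the "| grep pthread" step of the shell pipeline.
func filterByName(imports []versionedImport, substr string) []versionedImport {
	var out []versionedImport
	for _, imp := range imports {
		if strings.Contains(imp.Name, substr) {
			out = append(out, imp)
		}
	}
	return out
}

// readImports lists the symbols an ELF binary expects shared libraries to provide.
// It returns an error for binaries without a dynamic symbol table.
func readImports(path string) ([]versionedImport, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	syms, err := f.ImportedSymbols()
	if err != nil {
		return nil, err
	}
	imports := make([]versionedImport, 0, len(syms))
	for _, s := range syms {
		imports = append(imports, versionedImport{Name: s.Name, Version: s.Version})
	}
	return imports, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: listimports <elf-binary>")
		return
	}
	imports, err := readImports(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	for _, imp := range filterByName(imports, "pthread") {
		fmt.Printf("%-12s %s\n", imp.Version, imp.Name)
	}
}
```

Run it with the path of the binary to inspect as the only argument.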

knoxfighter commented 3 years ago

The version newly compiled on Ubuntu 20.10 uses pthread_sigmask@GLIBC_2.32:

objdump -T run_manager | grep pthread
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_create
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_detach
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3.2 pthread_cond_broadcast
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3.2 pthread_cond_wait
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutexattr_destroy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutex_destroy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutexattr_init
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_attr_init
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_attr_getstacksize
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutex_unlock
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutexattr_settype
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.32  pthread_sigmask
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_join
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_attr_destroy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutex_init
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutex_lock
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 pthread_mutex_trylock
0000000000947a00 g    DF .text  00000000000000b2  Base        _cgo_try_pthread_create

knoxfighter commented 3 years ago

In theory it should be automatically handled by the linker: https://developers.redhat.com/blog/2019/08/01/how-the-gnu-c-library-handles-backward-compatibility/

readelf --dyn-syms -W /lib/x86_64-linux-gnu/libc-2.32.so | grep pthread_sig
  1448: 0000000000091ad0   236 FUNC    GLOBAL DEFAULT   16 pthread_sigmask@@GLIBC_2.32
  1453: 0000000000091ad0   236 FUNC    GLOBAL DEFAULT   16 pthread_sigmask@GLIBC_2.2.5

I suspect the problem is that we force static linking of the cgo code. The linker already warns about it:

/tmp/go-link-017964132/000027.o: In function `unixDlOpen':
/home/runner/go/pkg/mod/github.com/mattn/go-sqlite3@v1.14.5/sqlite3-binding.c:39981: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/tmp/go-link-017964132/000004.o: In function `_cgo_26061493d47f_C2func_getaddrinfo':
/tmp/go-build/cgo-gcc-prolog:58: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
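
The crashing goroutine in the original trace is inside net._C2func_getaddrinfo, the same glibc call the second linker warning mentions. One way to keep a Go program away from that call is to ask for Go's built-in DNS resolver. The sketch below is only an illustration of that option, not what this project ended up doing; newPureGoResolver and newHTTPClient are hypothetical names, and the timeouts are arbitrary. Note that a custom Transport like this one does not carry over the settings of http.DefaultTransport (proxy from environment, idle connection limits and so on).

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newPureGoResolver returns a resolver that prefers Go's built-in DNS client
// over the cgo path that calls glibc's getaddrinfo.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

// newHTTPClient builds an HTTP client whose outgoing connections
// resolve host names through the given resolver.
func newHTTPClient(r *net.Resolver) *http.Client {
	dialer := &net.Dialer{
		Timeout:  10 * time.Second, // arbitrary value for this sketch
		Resolver: r,
	}
	return &http.Client{
		Timeout:   30 * time.Second, // arbitrary value for this sketch
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

func main() {
	resolver := newPureGoResolver()
	client := newHTTPClient(resolver)
	fmt.Println("prefer Go resolver:", resolver.PreferGo)
	fmt.Println("client timeout:", client.Timeout)
}
```

The same preference can be set for a whole process without code changes by running it with the environment variable GODEBUG=netdns=go. Neither approach touches the dlopen warning from go-sqlite3.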

Omaha2002 commented 3 years ago

So if I understand correctly, installing on 18.04 would work, but 20.04 and 20.10 have the wrong glibc version?

knoxfighter commented 3 years ago

20.04 can still work, but it is not guaranteed to. Some 20.04 installations already have the new glibc, some have the old one. If you encounter this problem, you can compile it yourself. I uploaded a build created on Ubuntu 20.10 to this comment, so you can use that one for now. (I will not create a docker build with it, sorry.) factorio-server-manager-linux.zip

mroote commented 3 years ago

If we make a build with Ubuntu 20.04 and the new version of glibc, will it be compatible with older versions as well? If so, we should update the CI jobs to use Ubuntu 20.04 instead of 18.04.

Another option might be to use the Dockerfile-build container to generate the build artifacts. I'd have to test whether the binaries generated in that container work on Ubuntu, however.

We can also try removing the static linking from the build commands. It may no longer be needed but I'd have to do some further testing there as well.

knoxfighter commented 3 years ago

> If we make a build with Ubuntu 20.04 and the new version of glibc will it be compatible with older versions as well?

No, glibc is designed to be backwards compatible, not forwards compatible. So Ubuntu 20.10 should be able to run the files generated on 18.04.
Unfortunately, I have not had time to look into that issue. I think compiling fully dynamically could fix this, but that needs a lot of testing on multiple Linux versions.
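
The backwards-compatibility rule above can be written down as a tiny version comparison. This is a simplified model for dynamically linked binaries only (it does not describe the statically linked build that crashes in this issue), and parseVersion, compareVersions and canRun are hypothetical helper names for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "2.32" or "GLIBC_2.2.5" into its numeric components.
func parseVersion(s string) ([]int, error) {
	s = strings.TrimPrefix(s, "GLIBC_")
	parts := strings.Split(s, ".")
	nums := make([]int, len(parts))
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return nil, fmt.Errorf("invalid version %q: %w", s, err)
		}
		nums[i] = n
	}
	return nums, nil
}

// compareVersions returns -1, 0 or 1 if a is older than, equal to or newer than b.
// Missing components count as zero, so "2.3" equals "2.3.0".
func compareVersions(a, b []int) int {
	for i := 0; i < len(a) || i < len(b); i++ {
		var x, y int
		if i < len(a) {
			x = a[i]
		}
		if i < len(b) {
			y = b[i]
		}
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// canRun models the rule: a binary that needs glibc `required` loads on a
// system with glibc `installed` only if installed is the same or newer.
func canRun(required, installed string) (bool, error) {
	req, err := parseVersion(required)
	if err != nil {
		return false, err
	}
	inst, err := parseVersion(installed)
	if err != nil {
		return false, err
	}
	return compareVersions(req, inst) <= 0, nil
}

func main() {
	// Built on Ubuntu 18.04 (glibc 2.27), run on Ubuntu 20.10 (glibc 2.32).
	oldOnNew, _ := canRun("2.27", "2.32")
	// Built on Ubuntu 20.10 (glibc 2.32), run on Ubuntu 18.04 (glibc 2.27).
	newOnOld, _ := canRun("2.32", "2.27")
	fmt.Println("18.04 build on 20.10:", oldOnNew)
	fmt.Println("20.10 build on 18.04:", newOnOld)
}
```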

mroote commented 3 years ago

Yeah, removing the static linking seems like the way to go if everything works properly across the OSes.

knoxfighter commented 3 years ago

Seems like this is fixed by linking dynamically. See #263 for more information and please test it :)
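
To check whether a given build really is dynamically linked, one option is to look at the shared libraries it requests from the loader (its DT_NEEDED entries). The following is a rough sketch with Go's debug/elf package; needsLibrary and neededLibraries are hypothetical names, and treating "no libc.so entry" as "statically linked" is a heuristic assumed for this example, not a guarantee.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// needsLibrary reports whether any requested library name starts with prefix,
// for example "libc.so" for the C library.
func needsLibrary(needed []string, prefix string) bool {
	for _, lib := range needed {
		if strings.HasPrefix(lib, prefix) {
			return true
		}
	}
	return false
}

// neededLibraries returns the shared libraries an ELF binary asks the
// dynamic loader for. The list is empty for a fully static binary.
func neededLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: checklink <elf-binary>")
		return
	}
	needed, err := neededLibraries(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	if needsLibrary(needed, "libc.so") {
		fmt.Println("dynamically linked against a C library:", needed)
	} else {
		fmt.Println("no libc.so dependency found; the binary is probably statically linked")
	}
}
```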

Omaha2002 commented 3 years ago

Also the docker version?

knoxfighter commented 3 years ago

As said in the PR: Test it with the docker image ofsm/ofsm:fix-glibc

Omaha2002 commented 3 years ago

Installed with ofsm/ofsm:fix-glibc can confirm it's working on 20.04:

image

image

Downloading Krastorio 2 after successful login:

image

image

mroote commented 3 years ago

Thanks for testing @Omaha2002

knoxfighter commented 3 years ago

I merged in the PR. Now develop will also work.

Omaha2002 commented 3 years ago

@knoxfighter mods working but I can't start the server anymore because I can't create the first save file. Did a clean docker install with develop branch.

Can't create a savefile, it says:

image

logs:

docker-compose logs

image

Omaha2002 commented 3 years ago

I installed the latest docker version, ofsm/ofsm:latest, with the commits for fix-glibc and Ubuntu docker. It is possible to create a "first" save file and the server starts. The mods page still has the same problem: I can't log in. It seems as if the fix-glibc commit didn't make it?

ofsm/ofsm:use-ubuntu-docker did work, both save file and mods downloading.

knoxfighter commented 3 years ago

Are you sure you used the latest image? If you had already downloaded it, you need to manually run docker pull ofsm/ofsm:latest.

Omaha2002 commented 3 years ago

Well, I was sure I did, but I did a reinstall anyway and now it works. docker ps showed ofsm:latest, but apparently it wasn't running the latest image.

Sorry for the confusion, all is working now, will test with a game this weekend.

Thanks for all the time and effort, much appreciated!

mroote commented 3 years ago

Thanks for helping out with this one @Omaha2002!