Closed def- closed 3 years ago
@trml I'm afraid this might be caused by https://github.com/ddnet/ddnet/pull/4236
On the other hand, we also had other crashes recently before merging that, but I can't tell whether they were the same.
It seems like it might be related to switch. Will try to reproduce it.
Since it happened on the server, that hopefully excludes most of the changes in the PR, since they were in the client. But I'm also not immediately sure which of the server changes could have affected it (my PR or other recent ones), since I believe most were in the snapping.
It seems to have happened after a player left the game. I got a log file now:
[2021-10-24 09:40:49][chat-command]: 6 used /pause
[2021-10-24 09:40:53][chat-command]: 6 used /pause
[2021-10-24 09:40:54][chat-command]: 6 used /pause
[2021-10-24 09:40:56][game]: kill killer='4:<E8><B1><86><E8><85><90><E4><BD><AC>' victim='4:<E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:40:56][game]: kill killer='4:<E8><B1><86><E8><85><90><E4><BD><AC>' victim='4:<E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:40:56][game]: kill killer='5:[D] <E8><B1><86><E8><85><90><E4><BD><AC>' victim='5:[D] <E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:41:08][game]: kill killer='4:<E8><B1><86><E8><85><90><E4><BD><AC>' victim='4:<E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:41:08][game]: kill killer='4:<E8><B1><86><E8><85><90><E4><BD><AC>' victim='4:<E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:41:08][game]: kill killer='5:[D] <E8><B1><86><E8><85><90><E4><BD><AC>' victim='5:[D] <E8><B1><86><E8><85><90><E4><BD><AC>' weapon=-2 special=0
[2021-10-24 09:41:19][chat]: 2:-2:<E9><98><BF><E5><B7><B4><E9><98><BF><E5><B7><B4>: <E6><92><A4><E9><80><80>
[2021-10-24 09:41:21][chat-command]: 6 used /pause
[2021-10-24 09:41:21][chat-command]: 6 used /pause
[2021-10-24 09:41:22][server]: client dropped. cid=3 addr=<{X}> reason=''
[2021-10-24 09:41:22][game]: kill killer='3:[D] <E9><98><BF><E5><B7><B4><E9><98><BF>' victim='3:[D] <E9><98><BF><E5><B7><B4><E9><98><BF>' weapon=-3 special=0
[2021-10-24 09:41:22][chat]: *** '[D] <E9><98><BF><E5><B7><B4><E9><98><BF>' has left the game
[2021-10-24 09:41:22][game]: leave player='3:[D] <E9><98><BF><E5><B7><B4><E9><98><BF>'
These crashes are apparently happening quite often now.
My guess is that the CCharacter got deleted, but we are still running HandleTiles in it somehow.
I'll try rolling the official servers back to the code state just before PR #4236 was merged (that's the only change) to see if it helps.
Even without PR #4236 this still happens, so that's probably not it.
I'm running GER1 with ASan; maybe it can catch something like that.
=================================================================
==3322017==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000030 (pc 0x00000056b2b1 bp 0x7ffc4cb0e7d0 sp 0x7ffc4cb0e1e0 T0)
==3322017==The signal is caused by a READ memory access.
==3322017==Hint: address points to the zero page.
#0 0x56b2b1 in CGameContext::Collision() /home/teeworlds/src/master/src/game/server/gamecontext.h:137:36
#1 0x56b2b1 in CCharacter::HandleTiles(int) /home/teeworlds/src/master/src/game/server/entities/character.cpp:1719:19
#2 0x55f42d in CCharacter::DDRacePostCoreTick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:2152:4
#3 0x55d497 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:803:2
#4 0x5e1acf in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#5 0x59ea8a in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#6 0x5295a2 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#7 0x535960 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#8 0x7f386c0fad09 in __libc_start_main csu/../csu/libc-start.c:308:16
#9 0x44ca69 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x44ca69)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/teeworlds/src/master/src/game/server/gamecontext.h:137:36 in CGameContext::Collision()
==3322017==ABORTING
I'm not sure how this is happening. Why is CGameContext corrupted in #0, but still fine in #5? Did we shut it down in between? All I found for a shutdown was:
if(Error)
{
dbg_msg("teehistorian", "error writing to file, err=%d", Error);
Server()->SetErrorShutdown("teehistorian io error");
}
But that should also not delete the CGameContext immediately.
How would we end up with an address 0x30?
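One possible reading of the 0x30, offered as an assumption rather than anything confirmed in this thread: ASan's hint says the address points to the zero page, which is what you typically see when a member is read through a null (or zeroed) object pointer. The faulting address is then just the offset of that member. A minimal sketch with a made-up layout (`FakeContext` and its fields are hypothetical and do not mirror CGameContext):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout for illustration only; NOT the real CGameContext.
struct FakeContext
{
	std::uint64_t m_aPadding[6]; // 6 * 8 bytes = 48 bytes = 0x30
	void *m_pSomething; // first member after the padding
};

// Address that a read of m_pSomething would touch if the object pointer had
// the numeric value BaseAddress. Pure arithmetic, nothing is dereferenced.
constexpr std::uintptr_t FaultAddress(std::uintptr_t BaseAddress)
{
	return BaseAddress + offsetof(FakeContext, m_pSomething);
}

// A null base pointer plus a member at offset 0x30 gives address 0x30.
static_assert(FaultAddress(0) == 0x30, "null base + member offset 0x30");
```

If this reading applies, the question becomes which pointer on the way to Collision() was null, not how CGameContext itself got corrupted.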
I guess we have to review #4079 (+ #4200) for errors, or revert it and see whether that's the cause; they changed the most in those classes.
Do you still have the last stable commit? Before these crashes happened?
@Zwelf any ideas?
@Jupeyy no, unfortunately I didn't check the core files for a while since we didn't have crashes for many months before.
Hm, no idea really. I looked into the code again and couldn't find a possible way this crash could happen. Another possibly suspicious commit I saw while looking is #4217, with the removal of volatile before the ReentryGuard variable, but I am not sure why it needed to be volatile.
@edg-l: could it be that the ReentryGuard variables need to be atomic as well?
The pattern in the logs before the crash is often quite similar:
[2021-10-27 10:21:06][server]: client dropped. cid=9 addr=<{X}> reason=''
[2021-10-27 10:21:06][game]: kill killer='9:Maybe.' victim='9:Maybe.' weapon=-3 special=0
[2021-10-27 10:21:06][chat]: *** 'Maybe.' has left the game
[2021-10-27 10:21:06][game]: leave player='9:Maybe.'
[2021-10-27 10:21:06][game]: kill killer='10:[D] Maybe.' victim='10:[D] Maybe.' weapon=-1 special=0
[2021-10-27 10:21:06][game]: kill killer='10:[D] Maybe.' victim='10:[D] Maybe.' weapon=-2 special=0
and
[2021-10-27 07:27:28][server]: client dropped. cid=0 addr=<{X}> reason=''
[2021-10-27 07:27:28][game]: kill killer='0:엉덩이' victim='0:엉덩이' weapon=-3 special=0
[2021-10-27 07:27:28][chat]: *** '엉덩이' has left the game
[2021-10-27 07:27:28][game]: leave player='0:엉덩이'
[2021-10-27 07:27:28][game]: kill killer='1:ASS' victim='1:ASS' weapon=-1 special=0
[2021-10-27 07:27:28][game]: kill killer='1:ASS' victim='1:ASS' weapon=-2 special=0
So maybe it's killing in two different ways while another player is leaving?
ubsan:
/home/teeworlds/src/master/src/game/server/entities/character.cpp:2143:6: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50ec92 in CCharacter::DDRacePostCoreTick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:2143:6
#1 0x50aeb3 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:803:2
#2 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#3 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#4 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#5 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#6 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#7 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:2143:6 in
/home/teeworlds/src/master/src/game/server/entities/character.cpp:805:5: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50b5c9 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:805:5
#1 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#2 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#3 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#4 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#5 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#6 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:805:5 in
/home/teeworlds/src/master/src/game/server/entities/character.cpp:814:16: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50b5e9 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:814:16
#1 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#2 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#3 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#4 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#5 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#6 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:814:16 in
/home/teeworlds/src/master/src/game/server/entities/character.cpp:814:2: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50b5fb in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:814:2
#1 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#2 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#3 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#4 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#5 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#6 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:814:2 in
/home/teeworlds/src/master/src/game/server/entities/character.cpp:816:14: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50b60d in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:816:14
#1 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#2 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#3 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#4 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#5 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#6 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:816:14 in
/home/teeworlds/src/master/src/game/server/entities/character.cpp:816:2: runtime error: member access within address 0x0000011cf5e0 which does not point to an object of type 'CCharacter'
0x0000011cf5e0: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
#0 0x50b61f in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:816:2
#1 0x5d8453 in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#2 0x577701 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#3 0x4acf38 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#4 0x4be872 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#5 0x7f1d942e5d09 in __libc_start_main csu/../csu/libc-start.c:308:16
#6 0x4529f9 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x4529f9)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/teeworlds/src/master/src/game/server/entities/character.cpp:816:2 in
=================================================================
==1181945==ERROR: AddressSanitizer: heap-use-after-free on address 0x621000062d48 at pc 0x00000055f7ad bp 0x7ffddeadd7b0 sp 0x7ffddeadd7a8
READ of size 1 at 0x621000062d48 thread T0
#0 0x55f7ac in CCharacter::DDRacePostCoreTick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:2143:6
#1 0x55d467 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:803:2
#2 0x5e1b0f in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#3 0x59eaba in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#4 0x5295a2 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#5 0x535960 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#6 0x7ffab5d2cd09 in __libc_start_main csu/../csu/libc-start.c:308:16
#7 0x44ca69 in _start (/home/teeworlds/servers/DDRace64-Server_sql+0x44ca69)
0x621000062d48 is located 72 bytes inside of 4576-byte region [0x621000062d00,0x621000063ee0)
freed by thread T0 here:
#0 0x4c685d in free (/home/teeworlds/servers/DDRace64-Server_sql+0x4c685d)
#1 0x577e10 in CCharacter::operator delete(void*) /home/teeworlds/src/master/src/game/server/entities/character.cpp:19:1
#2 0x577e10 in CCharacter::~CCharacter() /home/teeworlds/src/master/src/game/server/entities/character.h:25:7
#3 0x5eddfb in CPlayer::KillCharacter(int) /home/teeworlds/src/master/src/game/server/player.cpp:587:3
#4 0x62da5a in CGameTeams::KillTeam(int, int) /home/teeworlds/src/master/src/game/server/teams.cpp:452:34
#5 0x62da5a in CGameTeams::OnCharacterDeath(int, int) /home/teeworlds/src/master/src/game/server/teams.cpp:1009:4
#6 0x5625b1 in CCharacter::Die(int, int) /home/teeworlds/src/master/src/game/server/entities/character.cpp:984:11
#7 0x5674ee in CCharacter::HandleSkippableTiles(int) /home/teeworlds/src/master/src/game/server/player.h
#8 0x55ec79 in CCharacter::DDRacePostCoreTick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:2142:2
#9 0x55d467 in CCharacter::Tick() /home/teeworlds/src/master/src/game/server/entities/character.cpp:803:2
#10 0x5e1b0f in CGameWorld::Tick() /home/teeworlds/src/master/src/game/server/gameworld.cpp:260:11
#11 0x59eaba in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:810:10
#12 0x5295a2 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#13 0x535960 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#14 0x7ffab5d2cd09 in __libc_start_main csu/../csu/libc-start.c:308:16
previously allocated by thread T0 here:
#0 0x4c6add in malloc (/home/teeworlds/servers/DDRace64-Server_sql+0x4c6add)
#1 0x54f7ff in CCharacter::operator new(unsigned long, int) /home/teeworlds/src/master/src/game/server/entities/character.cpp:19:1
#2 0x5e953b in CPlayer::TryRespawn() /home/teeworlds/src/master/src/game/server/player.cpp:692:17
#3 0x5e646c in CPlayer::Tick() /home/teeworlds/src/master/src/game/server/player.cpp:252:4
#4 0x59ebe9 in CGameContext::OnTick() /home/teeworlds/src/master/src/game/server/gamecontext.cpp:822:20
#5 0x5295a2 in CServer::Run() /home/teeworlds/src/master/src/engine/server/server.cpp:2625:19
#6 0x535960 in main /home/teeworlds/src/master/src/engine/server/server.cpp:3636:21
#7 0x7ffab5d2cd09 in __libc_start_main csu/../csu/libc-start.c:308:16
SUMMARY: AddressSanitizer: heap-use-after-free /home/teeworlds/src/master/src/game/server/entities/character.cpp:2143:6 in CCharacter::DDRacePostCoreTick()
Shadow bytes around the buggy address:
0x0c4280004550: fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa
0x0c4280004560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4280004570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4280004580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4280004590: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c42800045a0: fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd
0x0c42800045b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c42800045c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c42800045d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c42800045e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c42800045f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==1181945==ABORTING
Lagar reports that it's a specific kill spike on the map Short; he can reproduce it.
Wow, this bug must have existed pretty much forever.
I guess usually m_Alive is still correctly set to 0 in the freed memory, so we bail out quickly enough. But sometimes you're unlucky: some new object is there and the value is != 0, and then you start accessing random data as pointers.
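To make that guess concrete, here is a small model of it. It deliberately avoids real freed memory (which would be undefined behavior) and uses a plain struct that stays in place; `Slot`, `Destroy`, `Reuse` and `StaleTickProceeds` are hypothetical names for this sketch, not DDNet code.

```cpp
#include <cassert>

// Hypothetical stand-in for the storage a character lived in.
struct Slot
{
	bool m_Alive;
	int m_Team;
};

// Models the character dying: the alive flag is cleared, and the bytes stay
// as they are until something else reuses the storage.
void Destroy(Slot &S)
{
	S.m_Alive = false;
}

// Models the allocator handing the same storage to a new object.
void Reuse(Slot &S, int Team)
{
	assert(!S.m_Alive); // in this model only dead slots get reused
	S.m_Alive = true;
	S.m_Team = Team;
}

// The rest of a tick, run through a stale reference after the death.
// Returns true when it gets past the alive check and would go on to
// interpret whatever is in the slot as its own members.
bool StaleTickProceeds(const Slot &S)
{
	return S.m_Alive;
}
```

In this model the stale tick is harmless right after `Destroy` and only proceeds once `Reuse` has run, which would match crashes that show up only some of the time.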
https://github.com/ddnet/ddnet/commit/d6c344853a4904c27bea27187a50b086551b9050
This probably needs an if(i != ClientID), so at least the same character isn't deleted.
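A rough sketch of what such a check could look like, assuming the team kill loops over client slots and the dying character's ID is passed in. `World`, `NUM_SLOTS` and `KillTeamExcept` are hypothetical names for illustration; this is not the code from the linked commit.

```cpp
#include <array>
#include <cassert>

constexpr int NUM_SLOTS = 4; // small made-up value for the sketch

// Hypothetical world state: who has a character, and which team they are in.
struct World
{
	std::array<bool, NUM_SLOTS> m_aHasCharacter{};
	std::array<int, NUM_SLOTS> m_aTeam{};
};

// Destroys the characters of everyone in Team except ExceptID, whose
// character is still in the middle of its own tick and must stay valid.
// Returns how many characters were destroyed.
int KillTeamExcept(World &W, int Team, int ExceptID)
{
	assert(ExceptID >= 0 && ExceptID < NUM_SLOTS);
	int Killed = 0;
	for(int i = 0; i < NUM_SLOTS; i++)
	{
		if(i == ExceptID)
			continue; // same idea as wrapping the body in if(i != ClientID)
		if(W.m_aHasCharacter[i] && W.m_aTeam[i] == Team)
		{
			W.m_aHasCharacter[i] = false;
			Killed++;
		}
	}
	return Killed;
}
```

The skipped character would then be cleaned up by its own death path after its tick has finished touching its members.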
@heinrich5991 wanna apply the fix?
Oh wow, there is no way in hell I'd have caught this by code inspection. I've gone over this stuff like 50 times now.
Aha, I knew it had to be a kill within Tick. I just missed the fact that a tee could be killed by another tee, I was looking for a self kill