kyechou / leagueoflegends

League of Legends install and launch wrapper for Linux
https://www.leagueoflegends.com
GNU General Public License v3.0
290 stars 23 forks source link

Cannot get into a game #83

Closed Gruetzig closed 1 year ago

Gruetzig commented 1 year ago

System information

If your issue is related to graphics display, please attach the output of vulkaninfo --summary. I do not know if it is, but I included vulkaninfo --summary in my pastebin.

Describe the issue

When starting the game, not the launcher, it immediately crashes with an error I attached below, tried both ranked(bruh) and practice tool.

Reinstalled twice with no luck, worked just fine yesterday, never used the lutris version.

Steps to reproduce

https://pastebin.com/BS7yBiru pastebin with vulkaninfo --summary and leagueoflegends -v start

  1. Run command leagueoflegends -v start
  2. Go into a game
  3. See critical error

Screenshots

image

mkgiga commented 1 year ago

i just realized something and i am fairly certain now, since people are crashing -after- some time with an access violation in ram, it is most likely that some function in a dll is attempting to write to a buffer past the maximum buffer length in a 32 bit array of some sort. i remember vaguely some people having a message that showed some value slightly past the maximum 32bit integer limit

kyechou commented 1 year ago

Updated log: https://pastebin.com/65kwU7sn Explanation:

I've narrowed down the cause to this:

kyechou commented 1 year ago

Unfortunately I cannot continue working on this issue for a few weeks. For debugging and testing purposes, I've pushed my changes to this branch. To print out the page fault debug messages like the ones in the log file, after running patches/protonprep-LoL.sh, you need to manually apply the debug.patch.

mkgiga commented 1 year ago

Additional helpful information discovered: The time of crash is dependent on CPU clock frequency.

Supporting evidence:

Further speculation:

The data race theory some of you guys had has to be true at this point. I also noticed that the address 0x1700608de in error messages in kyechou's logs point to a binary value of just barely higher than 32 bits (it would require at least 33 bits to actually store), so I am 100% convinced at this point that some 32-bit data buffer is being written to as if it's a 64-bit data structure, which, I mean, makes sense considering this crash is happening on a patch where 32-bit systems got deprecated.

If anyone here is familiar with the WINAPI and wine debugging process + willing to hold my hand a little, I'd love to sign away most of my free time to help fix this problem

ghost commented 1 year ago

Hi I am crawling into wine pull requests about this bug and i found this pull request makes a lot of sense getting access violation. Can you try with this patch? Thanks.

kyechou commented 1 year ago

@mk-giga Thanks for the observation! But there are a few things that don't seem right.

The data race theory some of you guys had has to be true at this point.

If the root cause is related to a data race, I doubt the crash timing will be this deterministic. Typically the timing should be fairly unpredictable (especially on the same machine). IMHO the predictable timing behavior that you observed further indicates that ~the faster CPU got to the problematic instruction faster where it triggered the page faults twice as fast and also overran the stack twice as fast.~ (edit: I misread the numbers, but the argument's still the same. The faster CPU somehow got more instruction cycles done before the stack got overflowed.) So at the moment I don't think the current issue involves much about race conditions.

I also noticed that the address 0x1700608de in error messages in kyechou's logs point to a binary value of just barely higher than 32 bits (it would require at least 33 bits to actually store)

If you look closely in the log, there are a lot of other addresses where the exception occurred that also require 33 bits (i.e., 64-bit locations). But they didn't seem to cause any trouble and simply triggered a one-time page fault, which is very normal for a program to do. So if the problem was a 32-bit data buffer being written, it doesn't make sense that a ton of other similar page faults (at different code locations) that worked just fine.

Another thing is, if you take a look at the wine source code where the additional debugging output code was added. In that function there's this line rec.ExceptionAddress = (void *)RIP_sig(ucontext);, which means the addresses in the log were the rip values at the time of (right before) the page fault exceptions. So it's definitely pointing to a code segment (i.e., text section) rather than a data buffer.

If anyone here is familiar with the WINAPI and wine debugging process + willing to hold my hand a little, I'd love to sign away most of my free time to help fix this problem

I'd suggest taking a look at eggroll's wine-ge-custom repository, getting familiar with applying the patches and compiling the wine code, see how to reproduce the problems, and then you can look deeper into what's happening in the wine code, try tweaking a few things and see what happens.

kyechou commented 1 year ago

Hi I am crawling into wine pull requests about this bug and i found this pull request makes a lot of sense getting access violation. Can you try with this patch? Thanks.

@EndChapter Thanks! That's a good find! It does seem likely to be related. I'll test it in the background when I get a chance, but no ETA guaranteed.

kyechou commented 1 year ago

@EndChapter Unfortunately there's the same error after applying the patch. I guess the next step would be to somehow attach winedbg to the address where the strange page faults happened and see what the instruction actually tried to do. Namely, figure out what/where the instruction at the exception address tried to read (or load) since it's a read fault.

fvalasiad commented 1 year ago

So, basically, there is a stack overflow error, in rito's code, it happens both in linux & windows, but for some reason in linux using wine it happens instantly, while windows users experience it after playing for a certain amount of time.

Adding debug flags to wine's runtime, effectively adding overhead to the program, makes the exception appear later in the course of the game, depending on your overall system(CPU, etc).

A race condition sounds like a good guess, since as it seems the error is highly dependent on the overall system used. But you said that you aren't sure about it, since the time it takes to throw the error is rather predictable, but i beg to differ.

The first time i entered a practice tool, it took 6 minutes to crash

the second, it took 2 minutes

third one crashed instantly

fourth one lasted 10 minutes.

Conclusion

If we've concluded that it's a fault on rito's part, what are we even trying to find here?

kyechou commented 1 year ago

If we've concluded that it's a fault on rito's part, what are we even trying to find here?

Because we don't know for sure if it's riot's fault or not. It's just speculation.

kyechou commented 1 year ago

It's good if we can confirm that of course.

noah1510 commented 1 year ago

@fvalasiad If the problem is a data race, it is not a data race between CPU threads. If that were the case limiting wine to one thread would fix it however doing that only make the client perform terrible (yes even worse than normal). The fact that it runs with debug options enabled, means that in addition to riots problems, there is a problem with wine.

fvalasiad commented 1 year ago

@noah1510 Wdym limiting wine to one thread, if the problem lies in rito's code's threads anything we do in wine doesn't change anything. Do you mean that we configure wine to run all of the applications threads in a single thread? So that no two threads run in parallel?

In that case the race condition won't disappear by any means, it will just depend entirely on the scheduler's decisions.

Adding overhead to the program by enabling debug flags makes everything run slower, and as a result, the statistical chance that a race condition happens just gets smaller, thus the unpredictability.

Can you elaborate why you think that wine is at fault(potentially) too?

noah1510 commented 1 year ago

If making the code slower is what fixes it, just reducing the core count and the cpu speed would be enough to fix it. However setting lol to use only one core in lutris still causes the game to instantly crash while setting the WINEDEBUG options seems to get it working. If slowing the code down would help, then reducing the CPU speed would make the code more stable, but the results by @mk-giga suggest the opposite.

Also reducing the core count also slows the execution down. I forces the scheduler to not execute code in parallel. Data race conditions usually happen because two parts of the code try to access some memory at the same time. This is not possible if they can not execute stuff at the same time because there is only one thread.

Granted these are all just theories of mine. I have no idea about both the linux and windows APIs, I only have quite a lot of experience programming multi threaded code in C++ and running that across operating system. From that I can tell you that using less threads will really limit the amount of data races and drastically reduce the chance of crashes.

fvalasiad commented 1 year ago

@noah1510 race conditions aren't only occurring when threads run in parallel, they can occur anyways depending on the task switches of the scheduler.

You are correct though that limiting the parallelism, should also limit the chance that a race condition happens. So I understand your suspicion.

I wouldn't trust that to pinpoint if wine is at fault tho, since task switching is rather complicated. Adding an extra printf; fflush(stdout) might as well make a race condition disappear(Given that it's extremely slow). Wine does just that when it prints debug logs to stderr(i assume).

Again, these are just predictions.

kyechou commented 1 year ago

Yes I agree that there are a lot things we don't know for sure yet. But we do know for a fact that, for those who have crashes, the crash (sooner or later) is caused by repeated page faults at a specific code location. This behavior is deterministic and reproducible. If we understand what the instruction is trying to do, we would know if it's fixable in Wine or if there's nothing we can do to circumvent that. I suggest if anyone has time, we should try to disassemble that part of code for a better understanding.

fvalasiad commented 1 year ago

@kyechou Are you suggesting we decompile the code to x86 assembly and try to debug it? ahahahha, isn't that against rito's terms of usage?

kyechou commented 1 year ago

@fvalasiad well.. i mean, what if we just attach the debugger while it's running? I may need to review their TOS though.

fvalasiad commented 1 year ago

@kyechou Tbh reading x86 assembly code that is the result of a compilation process(thus, even less purpose for it to be readable) is bad enough already, I'd rather not be reading machine code on top of that. Have we got any news regarding the windows side of things? Have they resolved the crashes there?

kyechou commented 1 year ago

Have we got any news regarding the windows side of things? Have they resolved the crashes there?

No idea...

[note to self] Winedbg disas: https://pastebin.com/tfxxzBgw Wine log: https://pastebin.com/2zHkFT6X

ghost commented 1 year ago

@kyechou do you mind joining the linux gaming dev discord channel? we have dxvk and wine developers there would be interesting for you to share your findings so maybe we can figure this out faster on upstream

feel free to share your findings in the # wine channel https://discord.gg/linuxgamingdev

as a warning, when discussing stuff about wine be sure to follow the wine clean room guidelines https://wiki.winehq.org/Clean_Room_Guidelines

for others, please dont join if your purpose is to get user support, is a proton/wine/dxvk/vkd3d-proton development server and is not meant for that.

moonshadow565 commented 1 year ago

Here is some info on 0008-ntdll-nopguard-call_vectored_handlers.patch: Originally i assumed the game hooks the function midway based on return address but after inspecting generated wine code pre and post patch i observed following:

  1. Current code generates 2 call sites for ret = func( &except_ptrs ) in call_vectored_handlers
  2. One of the call sites is only invoked on first run (i assume to initialize statics inside TRACE macro)
  3. The game then proceeds to save return address after func invocation
  4. On following invocations func returns from different call site
  5. The game probably then detects this as modification of code and crashes intentionally?

I verified this behavior by simulating different call sites from 2 identical functions at random. This is also why forcing tracing on seh made it past this check.

A cleaner patch that bunch of nops would probably be to force single call site by moving it into different function something like what is done with EXC_CallHandler in call_stack_handlers.

mewkl commented 1 year ago

Hello, I was one of those people that could not play anymore after minions spawned in practice tool no matter if I was using "trace+seh", or "fixme-all,trace+seh,trace+vulkan,trace+window" or neither, even with the "wine-lol-staging-8.5.1-x86_64-64bitfix" runner I downloaded from reddit.

After thinking a bit about race conditions I removed the WINEDEBUG environment variable and just set "restrict number of cores to" to "1" in lutris. (still interesting because it says core and not (logical) thread) Game went on a lot longer, but still crashed after a while, so I tried again this time with the large WINEDEBUG env var and I'm pretty sure I can play a full game now, from practice tool testing. My pc has 40 threads btw. Maybe some1 can share this knowledge on the reddit post? I have no reddit account so I'm just posting this here.

Update: If someone gets a glibc error like me on the lutris logs when switching to the newer runner that was compiled by somebody else online, you need to upgrade your distro to a newer one. (so basically if the downloaded binary doesn't seem to be working right click on the game @ lutris and check if that is your problem.

Update 2: By the way the game felt decently playable (I locked just it on 144fps since I play on 120hz). I have the client set to always close when a game runs though, and "use legacy dx9 mode on" not sure if that last one does anything on linux though. The client is ultra laggy but the game itself is surprisingly decent. I guess league itself isn't really very optimized for multiple threads? It's just a 2d game computationally I guess? My pc is a real toaster when it comes to core clocks and IPC, older intel server cpu, so if this works well for others as well you will lag less than me at least (=, if you don't have a real toaster.

Update3: I just crashed in a normal game after 20 minutes. Playable again after reconnecting. Update4: Apparently only for 4 minutes (=... Update5: Complete restart from lutris made me able to play the rest of the game. Update6: I will disable Esync and Fsync since I want to play single threaded anyway, and test if that helps me even more, since I don't trust lutris because it says it limited to a single core instead of a single thread. Hmm, I guess if I disable hyper-threading on my motherboard and it really does only use 1 core, then there can be no race conditions. I guess I'll try that first.

moonshadow565 commented 1 year ago

Here is some other tips for https://github.com/kyechou/wine-ge-custom/blob/388e786fcac9d79705c110996d41d1fd0ef3153a/patches/protonprep-LoL.sh#L36

Aside from that a pro tip is that you can test using a replay .rofl file or op.gg .bat file instead of going thru pains of launching client all the time. Replay files are saved under Documents/League\ of\ Legends folders. You can launch a replay or spectate stream like this

bin/wine cmd.exe /C 'start "" lol.exe EUW1-xxxxx.rofl'

Note: op.gg .bat files use start command to launch spectator streams. There is some difference between how start handles launching a process and how bare bones wine command launches a process. I found out start to be necessary to launch the game properly for both rofl files and spectator stream (failing another check otherwise i assume).

mewkl commented 1 year ago

Hello guys, I just played an entire game without crashing with WINEDEBUG unset and Output debugging info disabled on the lutris runner options. All I did was disable hyperthreading on my bios. I am still using wine-lol-staging-8.5.1-x86_64-64bitfix and "Restrict number of cores = 1" with lutris.
The only downside is a slooow client/champ select. Game itself runs well on my ancient machine.

tridoxx commented 1 year ago

check this maybe help for me work, and more user onf the discord proton-ge https://github.com/tridoxx/lol-on-linux-2023/blob/main/README.md

kyechou commented 1 year ago

Thanks @moonshadow565! I think what you suggested seem to fix the game exe crashing problem. I'll fix the patches and test more thoroughly.

TheFeelTrain commented 1 year ago

I think Riot may have pushed a hotfix, mine is suddenly working without me changing anything it might not be anything you did @kyechou

EDIT: There was definitely an update, the version of League of Legends.exe that was crashing had an m5sum of 2ec1ac75186cef6b106685a7883467de and now it is 2587fa200f18d94fcaff73bf5e702709

fvalasiad commented 1 year ago

I will test as well, gimme 2 mins

fvalasiad commented 1 year ago

I've now opened a practice tool and letting it sit afk hitting baron, will update in 20 minutes if it crashed or not. it would previously crash at less than 10 minutes.

fvalasiad commented 1 year ago

Game without WINEDEBUG crashes instantly, now testing with trace+seh

game with trace+seh crashes instantly, now testing with fixme-all,trace+seh,trace+vulkan,trace+window

game with fixme-all,trace+seh,trace+vulkan,trace+window running for 10 minutes at this point, will keep updating up until 30 minutes

TheFeelTrain commented 1 year ago

I should have mentioned I have it working using the wine-lol-staging package from the AUR. Going back and trying wine-lol-7.0-5 on the same prefix the riot client doesn't even open.

kyechou commented 1 year ago

@TheFeelTrain Yes I can confirm that it's working after I reverted back some patches.

tridoxx commented 1 year ago

Game without WINEDEBUG crashes instantly, now testing with trace+seh

game with trace+seh crashes instantly, now testing with fixme-all,trace+seh,trace+vulkan,trace+window

use this wine version https://github.com/polkaulfield/lol-for-linux-installer/releases/download/wine-lol-staging/wine-lol-staging-8.5-1.tar.xz

unzip it where are runners installed, for debian based is /home//.local/share/lutris/runners/wine

fvalasiad commented 1 year ago

Game without WINEDEBUG crashes instantly, now testing with trace+seh game with trace+seh crashes instantly, now testing with fixme-all,trace+seh,trace+vulkan,trace+window

use this wine version https://github.com/polkaulfield/lol-for-linux-installer/releases/download/wine-lol-staging/wine-lol-staging-8.5-1.tar.xz

unzip it where are runners installed, for debian based is /home//.local/share/lutris/runners/wine

Hmmm is this wine staging with GE'S patches applied?

kyechou commented 1 year ago

It is. I think it uses a binary package built from the AUR package.

PalanixYT commented 1 year ago

It seems as though Riot Games has finally published a new patch? I'm on leagueoflegends-git and using the precompiled binary by kyechou and I was able to play a match perfectly, without any crashes

kyechou commented 1 year ago

Thanks Riot for this one. I'll test a few older 32-bit patches (to make sure that no remaining 32-bit components still need them before removing them), and then update the public-facing stuff.

fvalasiad commented 1 year ago

I am at 17 minutes using WINEDEBUG but i am definitely trying the one you mentioned guys too, it looks good, guess league is back on linux

PalanixYT commented 1 year ago

Yup, I've tried multiple Practice Tools rounds and just finished a 45 minute game without any issues

hurutparittya commented 1 year ago

Can confirm, it seems to be working. Pog.

fvalasiad commented 1 year ago

Someone bother to notify the subreddit, so that mods post the solution, share the news with everyone else!

fvalasiad commented 1 year ago

Yeah GE's wine with WINEDEBUG now runs perfectly fine, i will now disable WINEDEBUG and try the other binary.

Not even riot client boots with the other binary, interesting

ghost commented 1 year ago

That means only this bug remains upstream: https://bugs.winehq.org/show_bug.cgi?id=54793

Please if anyone is interested at fixing this one on upstream wine you are more than welcome

mateus-filho commented 1 year ago

we have a solution guys? i lost

polkaulfield commented 1 year ago

@kyechou

It is. I think it uses a binary package built from the AUR package.

It's just the binaries made with moonshadow's PKGBUILD All credits to him. I just extracted the Arch package and compressed the binaries. https://aur.archlinux.org/packages/wine-lol-staging

I uploaded it to work with https://github.com/polkaulfield/lol-for-linux-installer

It's just https://github.com/kassindornelles/lol-for-linux-installer

With the runner changed. Kudos to moonshadow and kassidornelles :)

polkaulfield commented 1 year ago

Also with moonshadow's build it seems that the seh debug flag is not necessary, thats why i removed it from the script.

kyechou commented 1 year ago

Closing the issue now for being fixed by Riot and @moonshadow565.

Please be patient for a release on the fix. It takes time to compile/test to weed out some unneeded patches and there are some proton performance-related patches that were originally applied in the old GE 7.0.5 that may be worthwhile to rebase to 8.5. I'll send a PR to GE and M-Reimer to update the wine builds once it's done. In the meantime, please feel free to scroll up and try out other builds/installers.

For those interested in working on upstreaming the right fixes to mainstream wine, please check out https://bugs.winehq.org/show_bug.cgi?id=54793

ghost commented 1 year ago

@kyechou

It is. I think it uses a binary package built from the AUR package.

It's just the binaries made with moonshadow's PKGBUILD All credits to him. I just extracted the Arch package and compressed the binaries. https://aur.archlinux.org/packages/wine-lol-staging

I uploaded it to work with https://github.com/polkaulfield/lol-for-linux-installer

It's just https://github.com/kassindornelles/lol-for-linux-installer

With the runner changed. Kudos to moonshadow and kassidornelles :)

do you plan in maintaining your fork of my installer? i'm not sure if i want to come back to it anytime soon so would be nice to have someone to pass the torch

kyechou commented 1 year ago

@polkaulfield