anyc / steam-overlay

Gentoo overlay for Valve's Steam client and Steam-based games
GNU General Public License v2.0

Rogue Company EOS EAC does not work #313

Open ryao opened 2 years ago

ryao commented 2 years ago

I have been lobbying the developers of Rogue Company to turn on Proton EAC support for months, and a few weeks ago, they finally did, but it was broken on Gentoo.

If you patch your kernel with the following, recompile bubblewrap without USE=-suid, and run it using the Steam flatpak, it works:

https://salsa.debian.org/kernel-team/linux/-/blob/debian/5.16.14-1/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
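For anyone who wants to check whether their system matches this setup, here is a minimal sketch. It assumes the sysctl is exposed as `kernel.unprivileged_userns_clone` (the name the Debian patch uses) and that `bwrap` is on `PATH`. The helper names are made up for illustration.

```shell
#!/bin/sh
# Report the state of the sysctl added by the Debian patch.
# $1: path to the sysctl file (defaults to the real /proc location).
userns_state() (
    f=${1:-/proc/sys/kernel/unprivileged_userns_clone}
    if [ ! -e "$f" ]; then
        echo "sysctl absent"
    elif [ "$(cat "$f")" = "0" ]; then
        echo "unprivileged user namespaces disabled"
    else
        echo "unprivileged user namespaces enabled"
    fi
)

# Report whether a binary has the setuid bit set.
# $1: path to the binary, e.g. "$(command -v bwrap)".
suid_state() (
    if [ -u "$1" ]; then echo "setuid"; else echo "not setuid"; fi
)

# Usage on a real system:
#   userns_state
#   suid_state "$(command -v bwrap)"
```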

I was unhappy with this, so I started debugging. So far, I have not identified the root cause, but I have learned that setting up an Ubuntu 20.04 chroot on Gentoo allows it to work (while an Ubuntu 22.04 chroot does not allow it to work).

Here are some very rough notes on how to do the Ubuntu chroot:

# As a regular user:

xhost +

# Then as root:

# If you do not have debootstrap installed:
emerge debootstrap

# Make a mountpoint for the "chroot":
zfs create -o mountpoint=/mnt/ubuntu rpool/ROOT/ubuntu

# Install the ubuntu base system:
debootstrap --include=vim,wget,gpg focal /mnt/ubuntu http://archive.ubuntu.com/ubuntu/

# Fix the sources list:
wget -O /mnt/ubuntu/etc/apt/sources.list https://gist.githubusercontent.com/ishad0w/2187a4eaab9273387645ac11905aca68/raw/ae20b6e9c8e987081d6c15b9085b549505ea85e8/sources.list

sed -i -e 's/jammy/focal/g' /mnt/ubuntu/etc/apt/sources.list

# Fix resolv.conf so DNS works:
cp {,/mnt/ubuntu}/etc/resolv.conf

# Allow the chroot to access pulse audio and x11 unix sockets so audio and graphics work:
mkdir -p /mnt/ubuntu/tmp/.X11-unix
ln -s /mnt/{host,ubuntu}/tmp/.X11-unix/X0
mkdir -p /mnt/ubuntu/run/user/1000/
ln -s /mnt/{host,ubuntu}/run/user/1000/pulse

# Using a mount namespace instead of a chroot for the chroot:
mkdir /mnt/ubuntu/mnt/host

sudo -i unshare -m
pivot_root /mnt/ubuntu /mnt/ubuntu/mnt/host
mount --bind {/mnt/host,}/proc
mount --bind {/mnt/host,}/sys
mount --bind {/mnt/host,}/dev/pts
mount --bind {/mnt/host,}/dev/shm
exec bash -i
cd

# Make user account
# Note that group numbers vary between Gentoo and Ubuntu, so the video group in Gentoo is really the sudo group in Ubuntu. This hack *might* not be necessary due to how I set up /dev above, but I have not tested to see if it is unnecessary.
useradd -m richard
usermod -a -G audio,video,input,sudo richard

# debootstrap did not add i386 support, which steam needs
dpkg --add-architecture i386

# Add steam repository
wget -O- http://repo.steampowered.com/steam/archive/stable/steam.gpg | sudo gpg --dearmor | sudo tee /usr/share/keyrings/steam.gpg

echo 'deb [arch=amd64 signed-by=/usr/share/keyrings/steam.gpg] http://repo.steampowered.com/steam/ stable steam' | sudo tee /etc/apt/sources.list.d/steam.list

# Add CUDA repository as documented in https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb

# Tell apt to sync its repositories
apt update

# Install needed stuff. Some of the listed items are redundant. The command needs to be cleaned up.
apt install libgl1-mesa-dri:amd64 libgl1-mesa-dri:i386 libgl1-mesa-glx:amd64 libgl1-mesa-glx:i386 steam-launcher nvidia-driver-515 libgl1:i386 libvulkan1 libvulkan1:i386 vulkan-tools libnvidia-gl-515:i386 libc6:amd64 libc6:i386 libegl1:amd64 libegl1:i386 libgbm1:amd64 libgbm1:i386 libgl1-mesa-dri:amd64 libgl1-mesa-dri:i386 libgl1:amd64 libgl1:i386 steam-libs-i386:i386 xdg-desktop-portal xdg-desktop-portal-gtk

# That update is going to clobber /etc/apt/sources.list.d/steam.list, and it will remove PGP key verification.
# Feel free to add it back.
# 
# signed-by=/usr/share/keyrings/steam.gpg

# Upgrade anything missed for good measure.
apt upgrade

# Running nvidia-smi as root is important so that the user can use the GPU
nvidia-smi

# Same for vulkaninfo
vulkaninfo

# Now we drop to a user account
su - richard
cp /mnt/host/home/richard/.pulse-cookie /home/richard/
export PULSE_SERVER="unix:/run/user/1000/pulse/native"
export DISPLAY=:0
steam

# Now enable steam play for all titles and restart steam. Afterward, you can either add your steam library as an alternative library path if you already have rogue company there or download rogue company inside the chroot.

A few caveats:

In any case, I have not narrowed down what the bug is, but this allows me to start upgrading things in the Ubuntu chroot until I find what causes the breakage with Gentoo. It is very strange that only this EOS EAC game breaks on Gentoo while all others with proton support work without a problem.

Anyway, I am at a point where I feel like I have something that I can share with others who might or might not have more time to work on this than I do.

chewi commented 2 years ago

I take it it's not just #309 then?

Curve commented 2 years ago

This issue occurs on Arch Linux too. The game working on Ubuntu 20.04 leads me to believe it's caused by some recently updated package.

Curve commented 2 years ago

> I take it it's not just #309 then?

Kind of, but the glibc mentioned in that issue should be in the repos now, and using glibc 2.35 does not fix the problem.

ryao commented 2 years ago

> I take it it's not just #309 then?

That was the first thing I checked. The fix was already on my system at that point, so it is not that.

ryao commented 2 years ago

I started testing libc6 packages from newer Ubuntu versions in Ubuntu 20.04 until I found one that broke Rogue Company, here are the results:

This narrows it down to some change made in glibc between 2.33 and 2.34 in Ubuntu. Presumably this is an upstream change since Gentoo is also affected, but there could also be some distribution-level change that both distributions made at the same time. In any case, this is progress. :)
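To see which side of that boundary a given system is on, something like the following could be used. This is only a sketch: the helper names are hypothetical, `sort -V` is assumed to be available (GNU coreutils or busybox), and the "working"/"broken" labels just restate the finding above.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() (
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
)

# Classify a glibc version against the 2.33/2.34 boundary found above.
classify_glibc() (
    if version_ge "$1" 2.34; then
        echo "$1: in the range reported broken"
    else
        echo "$1: in the range reported working"
    fi
)

# Usage on a glibc system (getconf prints e.g. "glibc 2.35"):
#   classify_glibc "$(getconf GNU_LIBC_VERSION | cut -d' ' -f2)"
```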

ryao commented 2 years ago

Either I figure out how to git bisect glibc, or I can take a wild guess:

https://github.com/libhugetlbfs/libhugetlbfs/pull/63

glibc 2.34 removed the malloc hooks, which broke libhugetlbfs. What are the chances that Rogue Company's EAC relies on those?

I have a snapshot of my system from back when it ran glibc 2.33, so I can use that to make a new chroot and then start using glibc-9999 to bisect and test changes. I am debating whether it would be more fun to do this or try to back out the glibc change.

chewi commented 2 years ago

Git bisect is pretty easy. You can use it on the clone created by glibc-9999, i.e. git3-src/glibc.git. On each commit it suggests, you can pass this to Portage like so:

EGIT_OVERRIDE_COMMIT_GLIBC=8ab8afb33677f51a8b4b1dab04147c9f44bc4bd5

It means you're rebuilding it from scratch each time, but that's probably safest with glibc, and it doesn't take long to build these days. Applying this to Proton without replacing your regular version is a bit trickier, but there are various options.
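One bisect step could be sketched as follows. The helper only prints the command rather than running it. It assumes the git-r3 override variable is spelled `EGIT_OVERRIDE_COMMIT_GLIBC` (the build log of a live ebuild lists the exact override names) and that the live package is addressed as `=sys-libs/glibc-9999`. The function name is hypothetical.

```shell
#!/bin/sh
# Print the emerge command for the commit git bisect currently proposes.
# $1: commit hash to build (e.g. the output of `git rev-parse HEAD`
#     inside a non-bare clone that is being bisected).
# Assumption: the git-r3 eclass reads EGIT_OVERRIDE_COMMIT_GLIBC.
bisect_build_cmd() (
    commit=$1
    case $commit in
        ''|*[!0-9a-f]*) echo "not a commit hash: $commit" >&2; exit 1 ;;
    esac
    echo "EGIT_OVERRIDE_COMMIT_GLIBC=$commit emerge --oneshot =sys-libs/glibc-9999"
)

# Typical loop, run by hand:
#   git bisect start <bad> <good>              # in a non-bare glibc clone
#   bisect_build_cmd "$(git rev-parse HEAD)"   # run the printed command as root
#   ...test the game, then: git bisect good    (or: git bisect bad)
```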

Curve commented 2 years ago

> Git bisect is pretty easy. You can use it on the clone created by glibc-9999, i.e. git3-src/glibc.git. On each commit it suggests, you can pass this to Portage like so:
>
> EGIT_OVERRIDE_COMMIT_GLIBC=8ab8afb33677f51a8b4b1dab04147c9f44bc4bd5
>
> It means you're rebuilding it from scratch each time, but that's probably safest with glibc, and it doesn't take long to build these days. Applying this to Proton without replacing your regular version is a bit trickier, but there are various options.

Just adding to this: while rebuilding each time may be tedious, using something like ccache may reduce the build time.

ryao commented 2 years ago

@GloriousEggroll volunteered to do the bisect. He sent me this on Discord:

https://src.fedoraproject.org/rpms/glibc/commits/f35?page=2

3af4a41 (7) working
2f71737 (12) working
6758377 (13) working
222c141 (14) working
fd5c07b (15) broken
0755882 (17) broken
b4f030a (36) broken

222c141 (14) working:
Add a conditional dependency for glibc-gconv-extra.i686 in x86_64
https://src.fedoraproject.org/rpms/glibc/c/222c141c852334d88daa9fa65c487466ec544f95?branch=f35

fd5c07b (15) broken:
Install shared objects under their ABI names, avoiding symlinks (#1652867)
https://src.fedoraproject.org/rpms/glibc/c/fd5c07ba69e580a16d1754e425530176a56b7982?branch=f35

glibc: Avoid the need for manually running ldconfig after downgrade
https://bugzilla.redhat.com/show_bug.cgi?id=1652867

> There is no glibc source code change between the two, so this is likely an issue with the symlink change.

He also said "ah looks like the no symlink patches were applied in that broken commit before they were upstreamed" and gave me this list of patches to revert:

nptl_db: Install libthread_db under a regular implementation name 86f0179bc003ffc34ffaa8d528a7a90153ac06c6
Makerules: Remove lib-version, $(subdir-version) b89d5de2508215ef3131db7bed76ac50b3f4c205
elf: Generalize name-based DSO recognition in ldconfig 6bf789d69e6be48419094ca98f064e00297a27d5
Install shared objects under their ABI names 8208be389bce84be0e1c35a3daa0c3467418f921

After reverting them and rebasing, I produced this patch:

https://dpaste.com/B6DNVVRV7
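The revert step can be sketched roughly like this. The helper only prints the git commands so they can be reviewed first. It is not necessarily how the patch above was produced: the output file name `glibc-revert.patch` is hypothetical, and the order in the usage note simply reverses the list above, so check `git log` for the real commit order before relying on it.

```shell
#!/bin/sh
# Print (not run) the git commands for reverting a list of commits
# and exporting the combined result as a single patch.
# Assumption: reverts are applied in the order given on the command
# line, so pass the newest commit first.
print_revert_cmds() (
    for c in "$@"; do
        echo "git revert --no-commit $c"
    done
    echo "git diff --cached > glibc-revert.patch"
)

# Usage, inside a glibc checkout (hashes from the list above):
#   print_revert_cmds \
#       8208be389bce84be0e1c35a3daa0c3467418f921 \
#       6bf789d69e6be48419094ca98f064e00297a27d5 \
#       b89d5de2508215ef3131db7bed76ac50b3f4c205 \
#       86f0179bc003ffc34ffaa8d528a7a90153ac06c6
```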

However, emerge will fail to install a patched glibc because this change was made to the ebuild to support the changes that are being reverted:

-       local newldso=$(find . -maxdepth 1 -name 'ld-*so' -type f -print -quit)
+       local newldso=$(find . -maxdepth 1 -name 'ld*so.?' -type f -print -quit)

Unfortunately, after merging glibc with the revert into my filesystem, Rogue Company's EAC is still unhappy. Perhaps there is yet another change in 2.35 that also makes it unhappy. I am not sure when I will find time to bisect myself, but this is possibly progress. People with glibc 2.34 could test this version of the patch:

http://dpaste.com/9PY7X8XAG

Note that:

  1. This change is untested against glibc 2.34. Have a way to roll back the change if something goes wrong (e.g. via ZFS snapshots through an initramfs environment).
  2. It also needs that ebuild change.

I use glibc 2.35 since I decided to upgrade to ~amd64 on this particular machine when trying to troubleshoot other issues. :/
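Since the breaking Fedora commit is about installing shared objects under their ABI names instead of via symlinks, it can help to see which layout a system has. Here is a small sketch. The helper name is hypothetical and the library path in the usage note is an assumption that varies by distribution.

```shell
#!/bin/sh
# Describe how a shared object is installed: as a symlink to a
# versioned file (older layout) or as a regular file under its
# ABI name (layout after the change discussed above).
so_layout() (
    if [ -L "$1" ]; then
        echo "symlink -> $(readlink "$1")"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
)

# Usage (the path is an assumption; it varies by distribution):
#   so_layout /lib64/libc.so.6
```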

Curve commented 2 years ago

Great research!

chewi commented 2 years ago

That was very nice of GloriousEggroll. I didn't know he worked for Red Hat. Are you saying that Fedora 35 is broken too?

PixsaOJ commented 2 years ago

Can anyone give instructions about using glibc 2.33 on Arch? Maybe we could provide the right version by ENV variables or something.

I imagine downloading older versions of glibc and linking to that.

ryao commented 2 years ago

> That was very nice of GloriousEggroll. I didn't know he worked for Red Hat. Are you saying that Fedora 35 is broken too?

Fedora 35 uses glibc 2.34, so it is broken there too. In hindsight, he should have started doing a bisect with Fedora 34, since the bisect has the risk of breaking compatibility with binaries built against glibc 2.34.

That said, it was nice of him. I had been talking about this on his Discord server, which caught his attention. I had not expected him to spend time on it until he volunteered.

> Can anyone give instructions about using glibc 2.33 on Arch? Maybe we could provide the right version by ENV variables or something.
>
> I imagine downloading older versions of glibc and linking to that.

This issue tracker is for Gentoo users.

However, I can say that downgrading glibc on Arch is not possible without breaking the system. You should use the flatpak version of Steam for the time being.

PixsaOJ commented 2 years ago

@ryao I am not saying downgrading. I'm saying providing binaries which could be downloaded elsewhere other than system. This solution would work on any Linux distro.

chewi commented 2 years ago

To clarify what @ryao meant, you can't use an older glibc with other libraries built against a newer glibc, e.g. your graphics drivers. You'd need an entire older environment.
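To illustrate the point, the glibc symbol versions a library requires can be read from its dynamic symbol table. The sketch below filters the text output of a tool such as `objdump -T`. The helper name and the library path in the usage note are only examples, and `grep -o` and `sort -V` are assumed to be available.

```shell
#!/bin/sh
# Read text on stdin (for example the output of `objdump -T <file>`)
# and print the highest GLIBC_x.y symbol version it mentions.
max_glibc_symver() (
    grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -uV | tail -n 1
)

# Usage (the library path is only an example):
#   objdump -T /usr/lib64/libGL.so.1 | max_glibc_symver
```

If the printed version is newer than the glibc you intend to run against, that library will not load with it.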

ryao commented 2 years ago

@PixsaOJ That is the same as downgrading the system glibc, which the binaries will not like. You need to do a chroot with binaries that are built against an older glibc like I documented in the first comment.

@GloriousEggroll did some more bisecting and found the problem. 7a5db2e82fbb6c3a6e3fdae02b7166c5d0e8c7a8 also needed to be reverted.

He has an RPM with the reverts here:

https://download.copr.fedorainfracloud.org/results/gloriouseggroll/glibc-testing/fedora-34-x86_64/04499975-glibc/glibc-2.35-11.fc34.src.rpm

I have my own revert against glibc 2.35 here:

http://dpaste.com/FC5GVYB2K

The brave that wish to do the reverts themselves should revert:

I have tested this and it works (provided that you modify the glibc ebuild as explained above). However, I have not confirmed that all 5 patches must be reverted to make EAC happy (although this seems quite likely). I imagine that this is a bug for the toolchain project, although I will not file that right away. I want to do some more testing and clean up before filing a bug for them.

ryao commented 2 years ago

https://cdn.discordapp.com/attachments/906398525250760756/982779851314458685/unknown.png

There is an intermittent failure affecting me with those patches reverted. If I click exit and launch Rogue Company again, it launches without a problem, so this seems to be happening every other launch. It did not happen with an Ubuntu userland and I have not heard from GloriousEggroll about this affecting him on Fedora, so it might be unique to Gentoo ~amd64. Anyway, more R&D is needed. :/

ryao commented 2 years ago

Also, I posted on Valve's issue tracker about the current findings. In theory, this is enough for them to raise an issue with Epic's EAC development team so that this could hopefully be fixed. Presumably, it already has been fixed since no other game is affected by this and Rogue Company is being given an old unpatched version of EAC Proton support for some reason.

ryao commented 2 years ago

For those using glibc 2.34, here is an (untested) patch against glibc 2.34:

http://dpaste.com/62XQ2E6CB

In theory, you just need to:

1) Drop that into /etc/portage/patches/sys-libs/glibc-2.34-r13

2) Modify the ebuild to revert this:

@@ -1497,7 +1518,7 @@

        # first let's find the actual dynamic linker here
        # symlinks may point to the wrong abi
-       local newldso=$(find . -maxdepth 1 -name 'ld-*so' -type f -print -quit)
+       local newldso=$(find . -maxdepth 1 -name 'ld*so.?' -type f -print -quit)

        einfo Last-minute run tests with ${newldso} in /$(get_libdir) ...

3) Update the ebuild digest via ebuild $(equery which sys-libs/glibc-2.34-r13) digest.

4) Rebuild glibc.

Note that the 2.34 version of that patch has not been tested yet. The 2.35 patch that I did test was generated by rebasing that against 2.35, so in theory it should be safe to use, but until someone has tested it, I suggest making sure that you have a way to roll back your system to a pre-patch state (e.g. zfs rollback $SNAPSHOT) before using it.
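Step 1 can be sketched as a small helper. One detail worth knowing is that Portage only applies user patches whose names end in `.patch` or `.diff`, so a file saved straight from a paste site may need renaming. The function name, the optional root-prefix argument (handy for a dry run in a scratch directory), and the downloaded file name are all hypothetical.

```shell
#!/bin/sh
# Copy a patch into Portage's user-patch directory for one package.
# $1: patch file, $2: category/package-version,
# $3: optional root prefix (empty on a live system; a scratch
#     directory for a dry run).
# Portage only applies user patches ending in .patch or .diff, so
# the copy is given a .patch suffix when the source has neither.
install_user_patch() (
    src=$1 pkg=$2 root=${3:-}
    dest="$root/etc/portage/patches/$pkg"
    name=$(basename "$src")
    case $name in
        *.patch|*.diff) ;;
        *) name="$name.patch" ;;
    esac
    mkdir -p "$dest" && cp "$src" "$dest/$name" && echo "$dest/$name"
)

# Usage as root (the downloaded file name is hypothetical):
#   install_user_patch ./62XQ2E6CB sys-libs/glibc-2.34-r13
```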

GloriousEggroll commented 2 years ago

> https://cdn.discordapp.com/attachments/906398525250760756/982779851314458685/unknown.png
>
> There is an intermittent failure affecting me with those patches reverted. If I click exit and launch Rogue Company again, it launches without a problem, so this seems to be happening every other launch. It did not happen with an Ubuntu userland and I have not heard from GloriousEggroll about this affecting him on Fedora, so it might be unique to Gentoo ~amd64. Anyway, more R&D is needed. :/

This issue does not affect me on Fedora.

Curve commented 2 years ago

This issue affects me on Arch, so Gentoo is not the only distribution affected.

GloriousEggroll commented 2 years ago

> This issue affects me on Arch, so Gentoo is not the only distribution affected.

I just set up a clean arch system, and created new glibc packages using the commit 6abb4002df97df668f40b0da84ab6261498a8541 that I noted and the patches I provided, and I'm not hitting any issues. My guess here is something is up with ryao's patch or the commit Arch is using.

Here are my glibc PKGBUILD files for Arch; there is also a folder with pre-compiled packages ready to go:

https://github.com/GloriousEggroll/glibc-eac-rc

As noted for those on Fedora: https://copr.fedorainfracloud.org/coprs/gloriouseggroll/glibc/

PixsaOJ commented 2 years ago

@GloriousEggroll usage instructions?

GloriousEggroll commented 2 years ago

> @GloriousEggroll usage instructions?

Arch:

git clone https://github.com/GloriousEggroll/glibc-eac-rc
cd glibc-eac-rc/compiled-packages
sudo pacman -U glibc-2*.zst lib32-glibc-2*.zst
reboot

Fedora:

sudo dnf copr enable gloriouseggroll/glibc 
sudo dnf update glibc glibc.i686

Also open /etc/yum.repos.d/fedora.repo and /etc/yum.repos.d/fedora-updates.repo and add:

exclude=glibc libnsl

PixsaOJ commented 2 years ago

@GloriousEggroll Thank you very much, kind sir!

DissCent commented 2 years ago

I recently upgraded glibc on Manjaro/Arch from 2.35-6 to 2.36-2, since it was supposed to be working with EAC again. However, I now cannot launch Rogue Company, no matter which version of Proton I choose:

image

I also tried compiling glibc 2.36 using the PKGBUILD and patches provided by @GloriousEggroll but I still get the same error. Using the same Steam library with Flatpak works fine. Any ideas what could be wrong here?

Curve commented 2 years ago

> I recently upgraded glibc on Manjaro/Arch from 2.35-6 to 2.36-2, since it was supposed to be working with EAC again. However, I now cannot launch Rogue Company, no matter which version of Proton I choose:
>
> image
>
> I also tried compiling glibc 2.36 using the PKGBUILD and patches provided by @GloriousEggroll but I still get the same error. Using the same Steam library with Flatpak works fine. Any ideas what could be wrong here?

As far as I know, all games work with the latest glibc update except Rogue Company and other games from the company behind it. It may be something they do specifically that still results in the breakage we're seeing.

DissCent commented 2 years ago

Alright, I just noticed that the DT_HASH patch and the patches needed for Rogue Company address different issues. I cloned the PKGBUILD from the Arch Linux package 2.36-3 and applied all the patches; the build was successful and Rogue Company is now running great again.

I created a Pull Request to the repository by @GloriousEggroll - if anyone is interested, you can find my forked repo here, including pre-compiled packages: https://github.com/DissCent/glibc-eac-rc

baryluk commented 2 years ago

I came here from https://github.com/Starz0r/AreWeAntiCheatYet/issues/444 , but I think this is a better place.

Doesn't work for me. The launcher says "Launch Error, Failed to load the anti-cheat module" and shows an EXIT button. Nothing in the logs.

I checked game directories, and for EAC there are only dlls, no linux .so files.

Debian testing. libc6 2.34-7 if that matters. Kernel 6.0.0-rc4
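For anyone repeating that check, a quick way to look for Linux shared objects next to the EAC DLLs is sketched below. The naming pattern is an assumption (this thread does not confirm what a native module would be called), and the function name and example path are hypothetical. `-iname` is assumed to be supported by the installed `find`.

```shell
#!/bin/sh
# List Linux shared objects whose name mentions "anticheat"
# (case-insensitive) under a game directory.
# Assumption: a native EAC module, if shipped, would match this
# naming pattern; the thread does not confirm the exact file name.
find_eac_so() (
    find "$1" -type f -iname '*anticheat*.so' 2>/dev/null | sort
)

# Usage (the library path is only an example):
#   find_eac_so "$HOME/.steam/steam/steamapps/common/Rogue Company"
```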

benjamin051000 commented 1 year ago

Bumping this issue: on Fedora 36 using the Steam flatpak, I tried several versions of Proton and Proton GE, but EAC gives me the same error and the game won't start.

chewi commented 1 year ago

> Bumping this issue: on Fedora 36 using the Steam flatpak, I tried several versions of Proton and Proton GE, but EAC gives me the same error and the game won't start.

This repository is specifically for Gentoo. It's not Valve's fault either. You need to raise this with the Fedora developers.