ValveSoftware / Proton

Compatibility tool for Steam Play based on Wine and additional components

X3D CPU performance optimization #7154

Open Baughn opened 9 months ago

Baughn commented 9 months ago

Feature Request

Description

On an X3D CPU, WINE_CPU_TOPOLOGY should, if not already set, be set so that the only visible CPU cores are those with V-cache.

Here's an example Python script that would do it:

import subprocess
import xml.etree.ElementTree as ET
from collections import defaultdict

def run_lstopo():
    try:
        lstopo_output = subprocess.check_output(['lstopo', '--of', 'xml'], text=True)
        return ET.fromstring(lstopo_output)
    except Exception as e:
        print(f"An error occurred while running lstopo: {e}")
        return None

def parse_lstopo_xml_to_dict(root):
    core_to_cache = defaultdict(int)

    for l3cache in root.findall(".//object[@type='L3Cache']"):
        cache_size = int(l3cache.get("cache_size")) // (1024 * 1024)  # Converting to MB
        for pu in l3cache.findall(".//object[@type='PU']"):
            core_id = int(pu.get("os_index"))
            core_to_cache[core_id] = cache_size

    return core_to_cache

def filter_cores_by_max_cache(core_to_cache):
    max_cache = max(core_to_cache.values())
    return [core for core, cache in core_to_cache.items() if cache == max_cache]

if __name__ == "__main__":
    root = run_lstopo()
    # Compare against None explicitly: truth-testing an Element is deprecated
    if root is not None:
        core_to_cache_dict = parse_lstopo_xml_to_dict(root)
        cores_with_max_cache = filter_cores_by_max_cache(core_to_cache_dict)

        # Sorting the core IDs
        sorted_cores_with_max_cache = sorted(cores_with_max_cache)

        # Generating the WINE_CPU_TOPOLOGY setting
        num_cores = len(sorted_cores_with_max_cache)
        core_ids_str = ",".join(map(str, sorted_cores_with_max_cache))
        wine_cpu_topology = f"WINE_CPU_TOPOLOGY=\"{num_cores}:{core_ids_str}\""

        print(f"Core to Cache mapping: {core_to_cache_dict}")
        print(f"Logical cores with max cache: {cores_with_max_cache}")
        print(f"Generated WINE_CPU_TOPOLOGY setting: {wine_cpu_topology}")
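
For machines without lstopo installed, the same per-CPU L3 information can usually be read from the kernel's sysfs cache directories. The following is only a sketch under that assumption, not part of the original script: the helper names (`parse_size_kib`, `l3_size_per_cpu`, `cpus_with_largest_l3`) are made up for illustration, and it assumes the `size` file holds a value such as `98304K`.

```python
import re
from pathlib import Path

# Multipliers to KiB for the unit suffix assumed to appear in the sysfs "size" file.
_UNITS_TO_KIB = {"K": 1, "M": 1024, "G": 1024 * 1024}

def parse_size_kib(text):
    """Parse a cache size string such as '98304K' into KiB."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised cache size: {text!r}")
    return int(match.group(1)) * _UNITS_TO_KIB[match.group(2)]

def l3_size_per_cpu(cpu_root=Path("/sys/devices/system/cpu")):
    """Map logical CPU id -> L3 size in KiB, read from <cpu_root>/cpuN/cache/index*/."""
    sizes = {}
    for cpu_dir in Path(cpu_root).glob("cpu[0-9]*"):
        cpu_id = int(cpu_dir.name[3:])
        for index_dir in (cpu_dir / "cache").glob("index*"):
            try:
                # Check the "level" file instead of assuming index3 is always L3.
                if (index_dir / "level").read_text().strip() == "3":
                    sizes[cpu_id] = parse_size_kib((index_dir / "size").read_text())
            except (OSError, ValueError):
                continue  # Skip entries that are missing or unreadable
    return sizes

def cpus_with_largest_l3(sizes):
    """Return the sorted logical CPU ids that sit behind the largest L3 cache."""
    if not sizes:
        return []
    largest = max(sizes.values())
    return sorted(cpu for cpu, size in sizes.items() if size == largest)
```

Calling `cpus_with_largest_l3(l3_size_per_cpu())` should give the same kind of core list that the lstopo-based script produces, which can then be formatted as "count:id,id,..." in the same way.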

Justification [optional]

On an X3D CPU, for the vast majority of games, forcing the game to run only on the V-cache cores improves performance significantly. In some cases (e.g., Stationeers) this can be a 100% FPS improvement.

On Windows, AMD's game mode driver ensures this by shutting off (!) half the CPU when a Steam game is run on a CPU with V-cache available on some but not all cores (e.g., the 7950X3D).

On Linux this can be done through taskset or by taking the cores offline through /sys/devices/system/cpu, but I got the best results by using WINE_CPU_TOPOLOGY to limit the apparent hardware thread count and the core affinity to only the V-cache cores. This allows other processes to keep running on the remaining cores, and lets the game tune its worker thread count to match the resources that are actually available.

Risks [optional]

There's probably one or two games somewhere in the library that do better with all 16 cores available.

References [optional]

Appendix

FPS, measured in a mature Stationeers base under maximally CPU-hungry conditions.

Using taskset --cpu-list:

Using WINE_CPU_TOPOLOGY:

Bitwolfies commented 9 months ago

Question, since you seem very knowledgeable about this and I plan to buy one of these CPUs myself: how does cutting off access to the other cores improve things? Shouldn't it use the V-cache cores plus the rest by default? Or does this just prevent it from randomly choosing the non-cached cores?

Baughn commented 9 months ago

It's preventing it from randomly choosing the non-vcache cores, but also telling Proton that it should only use the vcache cores.

You could do the first half of that with taskset, but then Proton (and by extension the game) would still believe that all 16 cores are available, even though they're not. Using WINE_CPU_TOPOLOGY kills two geese with the same carrot.
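
The gap between "allowed to run on" and "believed to exist" can be seen on plain Linux as well. The sketch below is only an analogy in Python, not how Proton or Wine detect cores: it pins the current process the way taskset would and then compares the system-wide CPU count with the set of CPUs the process may actually use. The function name `pin_like_taskset` is made up for illustration.

```python
import os

def pin_like_taskset(cpus):
    """Restrict the current process to `cpus`, similar to `taskset --cpu-list`.

    Returns (reported_cpu_count, usable_cpu_count) after pinning.
    Linux-only: os.sched_setaffinity is not available on every platform.
    """
    os.sched_setaffinity(0, cpus)
    # os.cpu_count() reports the CPUs in the system, ignoring the affinity mask,
    # while os.sched_getaffinity(0) is the set this process may run on.
    return os.cpu_count(), len(os.sched_getaffinity(0))
```

On a 16-thread machine pinned to 8 CPUs, this would be expected to return a reported count of 16 and a usable count of 8. A program that sizes its thread pool from the first number ends up oversubscribed, which is the half of the problem that affinity alone does not solve.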

Bitwolfies commented 9 months ago

Gotcha, sounds like a great change if implemented, and much better than the Windows implementation of just killing half the cores outright through magical game detection. How does one go about using your script with Proton? Or is this an actual patch to how WINE_CPU_TOPOLOGY is handled?

Baughn commented 9 months ago

I don't know enough about Proton to say; that's why this is a feature request without a PR.

You can run the script as-is, and it'll output a WINE_CPU_TOPOLOGY line. You can then get that into the environment of Steam, by whatever means, and it'll affect every game you launch. I'm on NixOS, so I just set it in configuration.nix.
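
For a single game rather than all of Steam, one option is to put the variable in that game's Steam launch options, where `%command%` stands for the game's own command line. A minimal sketch of building that string from a core list follows; the function names are made up for illustration, and the core list in the usage note is a placeholder, not a measured 7950X3D layout.

```python
def wine_cpu_topology(cores):
    """Format logical CPU ids as the "count:id,id,..." value used in this thread."""
    unique = sorted(set(cores))
    if not unique:
        raise ValueError("at least one core is required")
    return f"{len(unique)}:{','.join(map(str, unique))}"

def steam_launch_option(cores):
    """Build a per-game Steam launch option that sets WINE_CPU_TOPOLOGY."""
    return f"WINE_CPU_TOPOLOGY={wine_cpu_topology(cores)} %command%"
```

For example, `steam_launch_option([0, 1, 2, 3])` yields `WINE_CPU_TOPOLOGY=4:0,1,2,3 %command%`, which can be pasted into the launch options field in the game's properties. Use the core list the script prints for your own machine.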

Bitwolfies commented 9 months ago

Ah, I get the purpose of the script now. Nicely written; I'll keep it around for when I get my CPU. Thank you.

Sila-Secla commented 2 weeks ago

Hi, I switched to Linux (CachyOS) a week ago. I'm light-years away from being able to write code like this myself ;). I'm still sitting here wondering how to apply this code safely and correctly.