gonetz / GLideN64

A new generation, open-source graphics plugin for N64 emulators.

Extremely low performance on limited hardware #1967

Open i30817 opened 5 years ago

i30817 commented 5 years ago

More than might be expected that is.

For instance, I can run ParaLLEl N64, which is pure software, faster than mupen (this happens with the RetroArch version but also with standalone mupen). Even desmume, running (not the same game, obviously, but under the same conditions), feels faster. I had hoped that the recent fix for the open-source radeon driver had fixed this, but if it did, it was only part of the problem for me.

Some measures:

Linux amd64

My graphics card and CPU are in battery mode (otherwise they overheat). The card is in dpm mode, low/battery.

The CPU is limited to 1.2 GHz:

cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1.20 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 1.60 GHz, 1.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 1.20 GHz and 1.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.
  cpufreq stats: 2.20 GHz:0,03%, 1.60 GHz:0,00%, 1.20 GHz:99,97%  (54)
analyzing CPU 1:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 10.0 us.
  hardware limits: 1.20 GHz - 2.20 GHz
  available frequency steps: 2.20 GHz, 1.60 GHz, 1.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 1.20 GHz and 1.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.
  cpufreq stats: 2.20 GHz:0,03%, 1.60 GHz:0,00%, 1.20 GHz:99,97%  (35)

This is a late reading from the Banjo-Kazooie intro, taken with nothing else open but a shell, gnome-session, perf top and standalone mupen. The emulator starts to slow down as soon as the 'circle' starts growing. It briefly becomes 'fast enough' for the sound not to be delayed while the image is all white, during the transition between the first part of the intro (where the Nintendo N is moving around) and the rest, and then returns to slow:

    16.46%  [kernel]                       [k] acpi_processor_ffh_cstate_enter
     2.71%  libsamplerate.so.0.1.8         [.] 0x00000000000035d5
     2.36%  mupen64plus-video-GLideN64.so  [.] _Z9RasterizeP7vertexiii
     1.92%  libsamplerate.so.0.1.8         [.] 0x000000000000366c
     1.51%  libsamplerate.so.0.1.8         [.] 0x00000000000036a7
     1.41%  libsamplerate.so.0.1.8         [.] 0x000000000000360f
     1.03%  libc-2.27.so                   [.] __memcpy_ssse3
     1.02%  perf                           [.] 0x000000000029c093
     0.92%  [kernel]                       [k] read_hpet
     0.92%  libsamplerate.so.0.1.8         [.] 0x00000000000036ab
     0.89%  libsamplerate.so.0.1.8         [.] 0x0000000000003613
     0.88%  libmupen64plus.so.2            [.] dyna_jump
     0.82%  libsamplerate.so.0.1.8         [.] 0x0000000000003672
     0.77%  libsamplerate.so.0.1.8         [.] 0x0000000000003603
     0.74%  libsamplerate.so.0.1.8         [.] 0x0000000000003693
     0.74%  libmupen64plus.so.2            [.] dynarec_jump_to_recomp_address
     0.64%  libsamplerate.so.0.1.8         [.] 0x00000000000035fb
     0.63%  libsamplerate.so.0.1.8         [.] 0x000000000000369b
     0.62%  mupen64plus-video-GLideN64.so  [.] _ZN18ColorBufferToRDRAM5_copyEj
     0.59%  perf                           [.] 0x00000000001ea447
   PerfTop:    4338 irqs/sec  kernel:38.5%  exact:  0.0% [4000Hz cycles:pp],  (all, 2 CPUs)

This is desmume running Order of Ecclesia, just before gameplay, when the narrator is speaking about the order during the stained glass window text roll:

     9.36%  [kernel]          [k] acpi_processor_ffh_cstate_enter
     1.26%  desmume           [.] 0x000000000011029f
     1.21%  libc-2.27.so      [.] __memcpy_ssse3
     1.02%  desmume           [.] 0x0000000000207f66
     0.61%  desmume           [.] 0x00000000001ba3f0
     0.59%  desmume           [.] 0x0000000000207502
     0.54%  desmume           [.] 0x000000000017f922
     0.47%  desmume           [.] 0x00000000001109e0
     0.44%  desmume           [.] 0x0000000000207f73
     0.43%  desmume           [.] 0x00000000001bb354
     0.42%  [kernel]          [k] read_hpet
     0.39%  desmume           [.] 0x000000000017eb92
     0.36%  desmume           [.] 0x0000000000207494
     0.35%  desmume           [.] 0x00000000001c2495
     0.35%  desmume           [.] 0x000000000014fb32
     0.35%  desmume           [.] 0x000000000017f929
     0.34%  desmume           [.] 0x00000000001bb346
     0.33%  [kernel]          [k] memset
     0.32%  desmume           [.] 0x0000000000207c8b
     0.31%  desmume           [.] 0x00000000001102a9
   PerfTop:    6899 irqs/sec  kernel:25.2%  exact:  0.0% [4000Hz cycles:pp],  (all, 2 CPUs)

I know the emulators aren't directly comparable, and the N64 clock speed is even slightly higher than the DS's, but I still find it strange how much faster ParaLLEl N64 is than mupen, so I feel something is 'wrong'. I also find it a bit strange that a sample rate conversion library would take more CPU time than the graphics plugin, though I suppose the GPU being on a low power profile might be delaying the emulator enough for something weird to happen to the sound.
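To compare the sampled overhead per library rather than per symbol, the perf top lines can be summed by shared object. This is only a rough sketch: it uses the 20 lines pasted in the mupen listing above, so it ignores everything below perf's display cutoff, and `overhead_by_object` is a hypothetical helper name, not part of perf. In this sample libsamplerate adds up to roughly 13% against roughly 3% for the GLideN64 plugin, while the largest single entry, acpi_processor_ffh_cstate_enter, is (as far as I know) the kernel entering an idle state rather than real work.

```python
from collections import defaultdict

# The 20 symbol lines from the mupen perf top listing above, copied verbatim.
PERF_TOP_SAMPLE = """\
    16.46%  [kernel]                       [k] acpi_processor_ffh_cstate_enter
     2.71%  libsamplerate.so.0.1.8         [.] 0x00000000000035d5
     2.36%  mupen64plus-video-GLideN64.so  [.] _Z9RasterizeP7vertexiii
     1.92%  libsamplerate.so.0.1.8         [.] 0x000000000000366c
     1.51%  libsamplerate.so.0.1.8         [.] 0x00000000000036a7
     1.41%  libsamplerate.so.0.1.8         [.] 0x000000000000360f
     1.03%  libc-2.27.so                   [.] __memcpy_ssse3
     1.02%  perf                           [.] 0x000000000029c093
     0.92%  [kernel]                       [k] read_hpet
     0.92%  libsamplerate.so.0.1.8         [.] 0x00000000000036ab
     0.89%  libsamplerate.so.0.1.8         [.] 0x0000000000003613
     0.88%  libmupen64plus.so.2            [.] dyna_jump
     0.82%  libsamplerate.so.0.1.8         [.] 0x0000000000003672
     0.77%  libsamplerate.so.0.1.8         [.] 0x0000000000003603
     0.74%  libsamplerate.so.0.1.8         [.] 0x0000000000003693
     0.74%  libmupen64plus.so.2            [.] dynarec_jump_to_recomp_address
     0.64%  libsamplerate.so.0.1.8         [.] 0x00000000000035fb
     0.63%  libsamplerate.so.0.1.8         [.] 0x000000000000369b
     0.62%  mupen64plus-video-GLideN64.so  [.] _ZN18ColorBufferToRDRAM5_copyEj
     0.59%  perf                           [.] 0x00000000001ea447
"""


def overhead_by_object(report):
    """Sum the leading percentage column per shared object (second column)."""
    totals = defaultdict(float)
    for line in report.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with "NN.NN%".
        if len(fields) < 3 or not fields[0].endswith("%"):
            continue
        totals[fields[1]] += float(fields[0].rstrip("%"))
    return dict(totals)


totals = overhead_by_object(PERF_TOP_SAMPLE)
for name, pct in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pct:6.2f}%  {name}")
```

Because the libsamplerate symbols are unresolved addresses, this only shows where samples land, not why; building with symbols would be needed to say more.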

I tried the angrylion plugin on standalone but it was even slower, so at least there is nothing surprising there.

gonetz commented 5 years ago

I tried the angrylion plugin on standalone but it was even slower, so at least there is nothing surprising there.

Can you check glide64, glide64mk2 or rice? If these plugins are slow too, then the problem is not on the GLideN64 side. Also, please try to run GLideN64 with frame buffer emulation disabled. Some frame buffer emulation options can be too heavy for your system.

i30817 commented 5 years ago

I'd have to build them, they're not distributed on the site?

i30817 commented 5 years ago

OK, I downloaded an older version of the emulator (mupen64plus-bundle-linux64-2.5-ubuntu) and copied over the glide64mk2 and rice plugins. Rice was indeed much faster on the Banjo-Kazooie intro (but had many 'animation jerks backwards in time' errors). With glide64mk2, the emulator thought it wasn't a plugin when starting a game, but it could still be selected (funnily enough, sound started for a while if you tried and the window then lost focus).

comparison:

mupen64plus-video-rice.so reports:

60.000-60.200 VI/s (with the setting turned on in the top bar; it does have on-screen notifications, the same as glide).

mupen64plus-video-GLideN64.so reports:

23% 14 VI/s 7 FPS. Enabling or disabling frame buffer emulation (and I tested many other settings too) makes no difference.
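The percentage in that readout looks consistent with VI/s measured against a nominal 60 VI/s. That nominal rate is an assumption here (it matches what Rice reports at full speed above), and `speed_percent` is a hypothetical helper, not something taken from the emulator:

```python
def speed_percent(vi_per_second, nominal_vi_per_second=60.0):
    """Emulation speed as a percentage of the assumed nominal VI rate."""
    return 100.0 * vi_per_second / nominal_vi_per_second


# GLideN64 reading from above: 14 VI/s
print(round(speed_percent(14)))    # 23
# Rice reading from above: about 60.2 VI/s
print(round(speed_percent(60.2)))  # 100
```

So the GLideN64 run is at a bit under a quarter of full speed, while Rice is at full speed on the same machine.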

i30817 commented 5 years ago

I think this might be a memory leak. I used heaptrack https://github.com/KDE/heaptrack and it segfaulted when loading a ROM (normally it just becomes ultra slow and does not crash). The result of --analyze is here:

https://gist.github.com/i30817/4a68e185ed2437c2cc3331ef95c9392b

hope it helps.

edit: then again, it may just have crashed because heaptrack requires changing the system allocator, though the 600 MB lost are sort of suspicious.

screenshot from 2018-12-17 15-57-35

gonetz commented 5 years ago

Not much. "total memory leaked: 562.19MB" is a huge leak, but I can't tell the source of the leaks from the report. I'm not familiar with that tool. I usually use valgrind, which can track where memory was leaked and how much. And yes, 600 MB lost is too suspicious.

On the other hand, it is unclear why GLideN64 is so slow in comparison with RiceVideo. Sorry for the stupid question, but are you testing a release build of GLideN64? A debug build tends to be slow even on my desktop.

i30817 commented 5 years ago

I used the Linux 64-bit plugin and emulator from this site: https://m64p.github.io/

It's actually kind of confusing that there are so many sites distributing it.

I attached a new image to the previous post, showing that the allocation is one huge jump happening all (or nearly all) at once, but it didn't work, so here it is:

screenshot from 2018-12-17 16-00-05

orbea commented 5 years ago

I'm not sure I see much specific to GLideN64 there; it's mostly noise from gtk and qt being broken, I suppose as a result of using mupen64plus-gui. It would be easier to debug with the mupen64plus console UI or even with RetroArch to avoid all that noise...

https://github.com/mupen64plus/mupen64plus-ui-console

gonetz commented 5 years ago

I can't explain it. GLideN64 can't eat so much, otherwise it would never start on most Android devices. Maybe it is Qt indeed.

gonetz commented 5 years ago

Could you build GLideN64 from sources to be sure that it is not a bad build problem?

i30817 commented 5 years ago

I'm going to try with RetroArch, if you don't mind, because it was just as slow and I can start RetroArch cores from the command line (unlike the download from that site).

orbea commented 5 years ago

I would try the mupen64plus console ui, the libretro core is pretty out of date...

i30817 commented 5 years ago

Well, I had already tried to build mupen, but got: Makefile:152: *** Mupen64Plus API header files not found! Use makefile parameter APIDIR to force a location.. Stop.

i30817 commented 5 years ago

RetroArch under heaptrack curiously crashes in the same way, but the memory leak is an order of magnitude smaller: 60 MB instead of 560.

The problem is that nearly all of it is in 'unresolved function' below __libc_start_main. I think I really need to compile it, but I need headers. And I might just be chasing mupen not liking the allocator.

gonetz commented 5 years ago

I don't think building mupen from source is necessary, since Rice video works OK for you. Also, you can install the mupen core and other parts from packages. They may be outdated too, but hardly too old to run. I suggest building a (release!) GLideN64 from source and setting it explicitly in the mupen64plus parameters with --gfx .

orbea commented 5 years ago

It's a pretty annoying build honestly...

Basically you need to clone these repos.

https://github.com/mupen64plus/mupen64plus-core
https://github.com/mupen64plus/mupen64plus-ui-console
https://github.com/mupen64plus/mupen64plus-input-sdl
https://github.com/mupen64plus/mupen64plus-rsp-hle
https://github.com/mupen64plus/mupen64plus-audio-sdl

Then you can build them one by one (make sure to build mupen64plus-core first), for example:

cd mupen64plus-core/projects/unix
make all
make install

Then, for the following plugins, you pass APIDIR to make all to point to the mupen64plus headers. By default they are installed to /usr/local/mupen64plus, but maybe you can skip make install and point to src/api/ in the mupen64plus-core repo?

See the output of make to see the full list of arguments when building each plugin.

As for GLideN64 it uses a standard cmake build.

https://github.com/gonetz/GLideN64

i30817 commented 5 years ago

Can I run make uninstall on all of those? I'm not a fan of getting random library headers into my system for apt to freak out about later, and I have been burned before by 'just install from source' procedures.

I guess I should try to find a system install of mupen64plus-console before trying that.

orbea commented 5 years ago

They should support make uninstall and as I explained, you might be able to skip make install.

i30817 commented 5 years ago

I give up, I don't know how to read this.

You can see the allocations by installing heaptrack-gui and calling heaptrack_gui on this gz. I built the repos above, built the debug version of the gfx plugin, ran make install on all of them except the gfx plugin, and invoked it with:

heaptrack mupen64plus --gfx ./mupen64plus-video-GLideN64.so --audio /usr/local/lib/mupen64plus/mupen64plus-audio-sdl.so --input /usr/local/lib/mupen64plus/mupen64plus-input-sdl.so --rsp /usr/local/lib/mupen64plus/mupen64plus-rsp-hle.so Banjo-Kazooie\ \(USA\)\ \(Rev\ A\).n64

I am running Ubuntu with GNOME 3 on Wayland, with a very old ATI card, an RV710 (mobile version).

heaptrack.mupen64plus.27803.gz

It gives images like this, but it won't crash without heaptrack (it is just slow as heck):

screenshot from 2018-12-17 18-24-02

Funnily enough, the '500 MB' leak returns even without Qt. Supposedly.

screenshot from 2018-12-17 18-32-38

CoreStartup calls init_mem_base(), which seems a prime candidate for the screw-up.

i30817 commented 5 years ago

I'm actually confused about you mentioning that 512 MB is 'too much', because init_mem_base() seems to request that by default:

MB_MAX_SIZE = MB_PIF_MEM    + PIF_ROM_SIZE + PIF_RAM_SIZE
MB_MAX_SIZE_FULL = 0x20000000

....

void* init_mem_base(void)
{
    void* mem_base;

    /* First try the full mem base alloc */
    mem_base = malloc(MB_MAX_SIZE_FULL);
    if (mem_base == NULL) {
        /* if it failed, try the compressed mem base alloc */
        mem_base = malloc(MB_MAX_SIZE);
        if (mem_base != NULL) {
            /* Compressed mem base mode has LSB = 1 */
            assert(MEM_BASE_MODE(mem_base) == 0);
            SET_MEM_BASE_MODE(mem_base);
            DebugMessage(M64MSG_INFO, "Using compressed mem base");
        }
    }
    else {
        /* Full mem base mode has LSB = 0 */
        assert(MEM_BASE_MODE(mem_base) == 0);
        DebugMessage(M64MSG_INFO, "Using full mem base");
    }

    return mem_base;
}

Anyway, heaptrack crashes when the app requests that on top of its own tracking, which is not very surprising. And since Linux has paging, it's also not surprising that the request 'succeeds' and the app starts to use memory horribly.
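For reference, MB_MAX_SIZE_FULL = 0x20000000 is exactly 512 MiB (about 537 MB in decimal units), which would be consistent with a tracker reporting roughly that much as never freed. The sketch below illustrates why such a request can 'succeed' without using that much RAM right away: on Linux, untouched pages of a large mapping are normally not resident. It is a scaled-down illustration under several assumptions: it uses Python's anonymous mmap rather than malloc, it reads resident size from /proc/self/statm (Linux only), and `resident_bytes` and `reserve_then_touch` are hypothetical helper names.

```python
import mmap

MB_MAX_SIZE_FULL = 0x20000000  # value quoted above from the mupen64plus source


def resident_bytes():
    """Resident set size of this process in bytes, or None without /proc."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
    except (OSError, IndexError, ValueError):
        return None
    return resident_pages * mmap.PAGESIZE


def reserve_then_touch(size, touch_bytes):
    """Map `size` bytes anonymously, then write one byte per page in the
    first `touch_bytes`. Returns RSS (before, after mapping, after touching)."""
    before = resident_bytes()
    region = mmap.mmap(-1, size)  # anonymous mapping, nothing touched yet
    after_map = resident_bytes()
    for offset in range(0, touch_bytes, mmap.PAGESIZE):
        region[offset:offset + 1] = b"\x01"  # touching a page makes it resident
    after_touch = resident_bytes()
    region.close()
    return before, after_map, after_touch


if __name__ == "__main__":
    print(MB_MAX_SIZE_FULL // (1024 * 1024), "MiB requested by the full mem base")
    # Scaled down to 64 MiB reserved, 32 MiB touched, to stay light.
    before, after_map, after_touch = reserve_then_touch(64 << 20, 32 << 20)
    if before is not None:
        print("RSS growth after mapping :", (after_map - before) >> 20, "MiB")
        print("RSS growth after touching:", (after_touch - after_map) >> 20, "MiB")
```

If the same holds for the emulator, the 512 MiB request on its own would explain a large 'leaked' figure in the tracker but not necessarily the slowdown.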


free --mega
              total        used        free      shared  buff/cache   available
Mem:           4035        2042         549         111        1443        1592
Swap:          2051          48        2003

This might explain only the heaptrack crash, not the slowness. I'm pretty sure I noticed the slowness without anything else open.

edit: even when firefox is closed:

free --mega
              total        used        free      shared  buff/cache   available
Mem:           4035        1147        1518          29        1369        2569
Swap:          2051          48        2003

It crashes with the same '540 leak', so I guess heaptrack is much more demanding of memory than I expected, or it is crashing from another bug.

i30817 commented 5 years ago

OK, the crash was a red herring. I managed to 'mostly fix it' by setting EnableLegacyBlending = True

So it was a shader slowdown. Meanwhile, by profiling with perf and recompiling, I found that on my system the noise generator works faster with one thread in spite of the CPU being dual core, huh.

edit: though of course it doesn't help in RetroArch, why should it.

gonetz commented 5 years ago

Well, the shaders created by GLideN64 are quite heavy. That is the price of accuracy. N64 hardware differs from PC hardware in so many aspects that lots of functionality has to be emulated in pixel shaders. You may also disable mip-mapping emulation to get a performance boost in some games.

ghost commented 5 years ago

I experimented years ago with doing noise on a GPU level, skipping that problem with speed entirely. However, the problem was getting random enough results, but that could be worked on.

i30817 commented 5 years ago

I'm actually a bit shocked that a CPU in low power mode is so much less of a problem than a GPU. This situation (the software renderer version of an emulator being much faster than the GPU renderer, with both the CPU and GPU at their lowest and the emulator on its lowest settings) also repeats with RetroArch's Beetle HW and Beetle SW: Beetle HW on minimal settings runs at speeds like 7-12 fps, while software runs at nearly full speed, 55 fps, in gameplay.

It's a bit worrying, to be honest. Then there are outliers like Dolphin, of all things, which can run the Resident Evil remake at 20 fps on the same machine right after. I just don't know; maybe devs should profile in these states to see if there is something pathological going on, like excessive GPU/CPU back-and-forth reads multiplying the slowness factors, that they can't notice because of great cards.

gonetz commented 5 years ago

Many people think that if an emulator emulates much more powerful hardware than the N64, then it needs much more powerful PC hardware than N64 emulators do. That is not necessarily so. True, a pure software, pixel-accurate N64 emulator should be faster than a similar GameCube emulator. The situation changes when we use the PC graphics card to render graphics. N64 emulators can still be much faster, but only as long as they emulate the N64 hardware mainly with the Fixed Function Pipeline; Glide64 and glN64 are examples. There are plenty of N64 hardware features that cannot be properly emulated with the Fixed Function Pipeline. Pixel shaders are necessary to emulate these features. GLideN64 uses many different pixel shaders, and these shaders often use calculations that older PC cards can't do efficiently. So, while N64 games use far fewer polygons than GC games, proper rendering of these polygons may require many more calculations on the GPU side. The GLideN64 devs have already profiled the shader code, and many optimizations have been done. I'm sure there is still room for optimization, but GLideN64 is definitely not for low-end hardware.

ghost commented 5 years ago

AFAIK one of the heaviest bits is the N64 depth image emulation, thanks to image load/store and its heavy use of sync. Could be wrong tho. @gonetz, didn't you try using plain FBOs and plain 16-bit unsigned short depth texture attachments in the past for N64 depth rendering? Surely OGL supports such textures instead of using image2D.

gonetz commented 5 years ago

Image textures are used only when the N64 depth compare option is enabled. Otherwise the plugin uses a plain FBO with a plain depth texture attachment.

ytrezq commented 4 years ago

Considering a GameCube can run almost all N64 ROMs at their original speed (using an official emulator from Nintendo, though), what system do you have? It is also worth noting that rendering on the GameCube/Wii is as different from a PC as it is from the Nintendo 64, and it has nothing that looks like shaders.

ytrezq commented 4 years ago

I'm actually a bit shocked that a CPU in low power mode is so much less of a problem than a GPU. This situation (the software renderer version of an emulator being much faster than the GPU renderer, with both the CPU and GPU at their lowest and the emulator on its lowest settings) also repeats with RetroArch's Beetle HW and Beetle SW: Beetle HW on minimal settings runs at speeds like 7-12 fps, while software runs at nearly full speed, 55 fps, in gameplay.

It's a bit worrying, to be honest. Then there are outliers like Dolphin, of all things, which can run the Resident Evil remake at 20 fps on the same machine right after. I just don't know; maybe devs should profile in these states to see if there is something pathological going on, like excessive GPU/CPU back-and-forth reads multiplying the slowness factors, that they can't notice because of great cards.

What's even funnier is that I use the official Virtual Console (which also JITs ROMs) inside the Android version of Dolphin in order to get decent speeds (15 to 20 fps) on my tablet (by underclocking the emulated CPU to 30% of its original speed, though, because adding a second JIT level hampers JIT caching and, in turn, speed).

i30817 commented 4 years ago

what system do you have?

It's a 13-year-old first-gen Core Duo mobile, so basically a piece of shit, using the mesa drivers for the AMD card, which I have to further underclock so it doesn't overheat. So the following values should say '1.3ghz' for the CPU and who knows what for the GPU (the lowest state possible for the thing).

retroarch says:

[INFO] CPU Model Name: Intel(R) Core(TM)2 Duo CPU T6600 @ 2.20GHz
[INFO] Capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 SSE4
[INFO] [GL]: Vendor: X.Org, Renderer: AMD RV710 (DRM 2.50.0 / 5.3.0-62-generic, LLVM 9.0.0).
[INFO] [GL]: Version: 3.0 Mesa 19.2.8.