vc64web / virtualc64web

vc64web - Commodore C64 Emulator for iPad iPhone Android and the Web with CSDb access for thousands of demos at your fingertip.
https://vc64web.github.io/doc/about.html
GNU General Public License v3.0

optimizing and reducing energy impact #11

Closed mithrendal closed 4 years ago

mithrendal commented 4 years ago

I have several improvements in the queue ... this time energy ... still looking ahead for the vAmigaWeb ... which needs optimisations even more ...

Removal of the pixel rendering method via the window surface: in case the host does not support hardware-accelerated GPU texture rendering, we now notice this and fall back to a software texture renderer ... so we get max. performance and high compatibility

Energy impact: currently the main loop is called 60 times a second, hence 60 fps ... it calls c64->executeOneFrame(); 60 times a second. Is that necessary? And 60 times a second it loads the screenbuffer c64->vic.screenBuffer(); into the GPU and draws it onto the screen

When we modify the drawOneFrameIntoSDL() method so that it skips every second frame, it effectively renders 30 fps, and it still looks the same as the 60 fps version 😎...

By lowering to 30 fps we save approximately half the CPU activity, hence energy 🤓... The fans of my oldest machine do not turn on anymore 😍

When we let c64->executeOneFrame(); run at 60 fps but only render the screenbuffer at 30 fps, it still saves one third ... which is also great! 🤗
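A minimal C++ sketch of the variant described here: the emulator still advances on every tick of the 60 Hz loop and only the drawing is skipped. `FrameStats`, `runMainLoop` and `renderEvery` are hypothetical names for illustration; the real loop would call c64->executeOneFrame() and drawOneFrameIntoSDL() where this sketch only counts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters standing in for the real emulator and renderer calls.
struct FrameStats {
    std::uint64_t executed = 0;  // stands in for c64->executeOneFrame()
    std::uint64_t rendered = 0;  // stands in for the texture upload and draw
};

// Simulates `ticks` iterations of a 60 Hz main loop.
// renderEvery = 1 draws every frame (60 fps), renderEvery = 2 every second frame (30 fps).
FrameStats runMainLoop(std::uint64_t ticks, unsigned renderEvery) {
    assert(renderEvery > 0);
    FrameStats stats;
    for (std::uint64_t tick = 0; tick < ticks; ++tick) {
        ++stats.executed;               // emulation always advances by one frame
        if (tick % renderEvery == 0) {  // drawing is skipped on the other ticks
            ++stats.rendered;
        }
    }
    return stats;
}
```

Calling `runMainLoop(60, 2)` models one second of the loop with every second draw skipped: 60 emulated frames, 30 drawn frames.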

I'd like to know how that is handled in VirtualC64. What is the recommended min. fps? Is it valid to go with 30 fps only? Am I allowed to do what I do? ... Sprite movements in Fort Apocalypse were absolutely smooth ...

dirkwhoffmann commented 4 years ago

I'd like to know how that is handled in VirtualC64.

The original emulator starts a 60Hz loop via the Metal API. In each iteration, the latest stable emulator texture is grabbed and fed into the GPU pipeline.

Is it valid to go by 30 fps only ?

To be honest, I don't expect texture copying to be a performance bottleneck. Your old code definitely was (I mean the code where you copied the texture manually pixel by pixel). But in the current implementation, you let the SDL do the copying (if I understand it correctly) and that should be fast. Nevertheless, it would be interesting to know how big the performance benefit really is. Just wrap your old code with something like

    counter++;
    if (counter % 2) { /* copy the texture */ }

and we'll see. If 30Hz really brings a noticeable performance benefit, it could be added as an option later. But even then, I think it should have low priority, because we should definitely keep the 60Hz for modern browsers.

BTW, I just encountered Javatari. Just played original PacMan in my browser a few minutes ago. This game was my very first contact with computers 😎. More than 40 years ago 😲.

[Screenshot 2020-04-10 at 20.25.20]

mithrendal commented 4 years ago

    counter++;
    if (counter % 2) { /* copy the texture */ }

I did already ...

To be honest, I don't expect texture copying to be a performance bottleneck

on my old machine it was ... maybe a very old GPU ... let's test

when I call c64->executeOneFrame(); only 30 times per second, all is smooth and the energy saving is even greater ... but the sound is a bit shabby 🙉

Have to check modern computers like the one from 2014

2007 machine with blue start screen
60/60 fps -> 70% CPU
skipping every 2nd texture drawing: 60/30 fps -> 65% CPU, sound ok
skipping every 2nd c64->executeOneFrame(); and texture drawing: 30/30 fps -> 40% CPU, sound shabby with hiccups, but no fan noise 🤓

2014 machine with blue start screen
60/60 fps -> 17% CPU
skipping every 2nd texture drawing: 60/30 fps -> 15% CPU, sound ok
skipping every 2nd c64->executeOneFrame(); and texture drawing: 30/30 fps -> 8% CPU
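Read as percentage points, the measurements above separate into two savings: the part from skipping texture draws and the part from skipping emulation. The following C++ sketch only does that subtraction; the struct and function names are made up for illustration, and the numbers are the ones reported in this comment.

```cpp
#include <cassert>

// CPU load in percent for one machine, copied from the measurements above.
struct LoadSample {
    double execute60Draw60;  // 60/60 fps
    double execute60Draw30;  // 60/30 fps
    double execute30Draw30;  // 30/30 fps
};

// Percentage points saved when the load drops from `before` to `after`.
double savedPoints(double before, double after) {
    assert(before >= after);  // every reported step lowered the load
    return before - after;
}

// Saving from skipping every second texture draw.
double drawSkipSaving(const LoadSample& s) {
    return savedPoints(s.execute60Draw60, s.execute60Draw30);
}

// Additional saving from also skipping every second executeOneFrame().
double executeSkipSaving(const LoadSample& s) {
    return savedPoints(s.execute60Draw30, s.execute30Draw30);
}

const LoadSample machine2007{70.0, 65.0, 40.0};
const LoadSample machine2014{17.0, 15.0, 8.0};
```

With these inputs the draw skip accounts for 5 points on the 2007 machine and 2 points on the 2014 machine, while the execution skip accounts for 25 and 7 points.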

If 30Hz really brings a noticeable performance benefit, it could be added as an option later.

energy saver, maybe for battery-powered devices like iPad and iPhone ... But what about the sound? Any ideas why it is not as clean as at 60 fps?

To be honest, I don't expect texture copying to be a performance bottleneck.

yes, only 5% on the old mac and only 2% on the newer one, you were right 🙄...

but the skipping of the executeOneFrame() brings it down to 40 - 50 % ... still smooth video, but with shabby sound. Do I have to increase the sample streaming buffer?

dirkwhoffmann commented 4 years ago

but the skipping of the executeOneFrame() brings it down to 40 - 50 % ...

Skipping executeOneFrame() every other frame means that you run the C64 at only 0.5 MHz 😯. Hence, sound is messed up. Video must be slower, too, but maybe it's not noticeable. This is also the reason why CPU load decreases so much 😃.
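The arithmetic behind the 0.5 MHz figure can be sketched in C++ as follows. `effectiveClockMHz` is a hypothetical helper, and the nominal clock of about 1 MHz is a rounded assumption.

```cpp
#include <cassert>

// Effective emulated clock when executeOneFrame() runs on only some host ticks.
// nominalMHz:     clock of the real machine (roughly 1 MHz for a C64, rounded).
// executedFrames: number of ticks on which a frame is actually emulated.
// hostTicks:      number of ticks of the host loop in the same period.
double effectiveClockMHz(double nominalMHz, unsigned executedFrames, unsigned hostTicks) {
    assert(hostTicks > 0 && executedFrames <= hostTicks);
    return nominalMHz * executedFrames / hostTicks;
}
```

For example, `effectiveClockMHz(1.0, 30, 60)` gives 0.5: half the frames per second means half the emulated cycles per second, which also slows down audio generation.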

mithrendal commented 4 years ago

Oh I see 😬... but but ... this is cool for games which are too fast or difficult 😎, we can play them in slow motion then.

[image]

mithrendal commented 4 years ago

🥺

can you explain this ?

[Screenshot 2020-04-11 at 07.32.40]

original vC64. I swear I have not done anything to it! 🤞

dirkwhoffmann commented 4 years ago

can you explain this ?

😯 Nope. Maybe it's because you are using an external alien monitor called "S.O.N.Y"...

The main entry point for drawing a single frame on the GPU is in MetalView.Swift:

    override public func draw(_ rect: NSRect) {

        if !enableMetal {
            return
        }

        // Wait until it's safe to go ...
        semaphore.wait()

        // Refresh size dependent items if needed
        if layerIsDirty {
            reshape(withFrame: frame)
            layerIsDirty = false
        }

        // Draw scene
        drawable = metalLayer.nextDrawable()
        if drawable != nil {
            updateTexture()
            if fullscreen && !keepAspectRatio {
                drawScene2D()
            } else {
                drawScene3D()
            }
        }
    }

Could you insert the following lines at the very top of this function?

    let pfps = preferredFramesPerSecond
    track("\(pfps)")

On my machine, it returns a frame rate of 60. On your S.O.N.Y it might return 30? 🤔

Graphics programming is really some alien art and there are some fundamental things I still don't get. Let's say you have a laptop with a 60Hz TFT display. Now, let's connect this guy to an external TV with 50Hz and mirror the display. The question is: Is Metal calling MTKView::draw 60 times a second, or 50 times a second, or is the rendering frequency something completely different than the display frequencies of the monitor? 🤓

mithrendal commented 4 years ago

Skipping executeOneFrame() every other frame means that you run the C64 at only 0.5 MHz 😯. Hence, sound is messed up. Video must be slower, too, but maybe it's not noticeable. This is also the reason why CPU load decreases so much 😃.

proved: [image]

without frameskip + skip execution 24 seconds

with frameskip and skip execution 12 seconds

the c64 is a real computing beast ... 🙄

dirkwhoffmann commented 4 years ago

the c64 is a real computing beast ... 🙄

Not only that. It's a perfect machine for quarantine. Here is a live stream from my home office:

[image]

mithrendal commented 4 years ago

You look so happy in the picture, you are really among the movers of the world with this executive 64. 😎 I want one too!! (Is the postal address still valid ... I might request more information about the executive 64 🤤)

Ok, in executive language 🤠☕️

    let pfps = preferredFramesPerSecond
    track("\(pfps)")

moved into position and executed ...

results: it tracked 60 but the vC64 this time reported in the bottom right corner 60Hz 🤔

wait maybe I know why. The s.o.n.y is a 4k guy, the mac mini 2014 can only handle 30Hz at that resolution. When I tested first I switched from 4k to 1080p to get the 60Hz. But vC64 then still showed 30Hz where it was supposed to show 60Hz (yes I completely restarted it).

Now that it shows the correct 60Hz ... I will switch it back to 4k 30Hz ...

okay we will see what happens now .... wait ...

it still tracks 60Hz as preferred and shows in the lower right edge 30Hz...
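One way to see what rate a draw callback really runs at, independent of the preferred value, is to record a timestamp on each call and average over the interval. The C++ sketch below shows only that calculation; `measuredFps` is a hypothetical helper, and how VirtualC64 computes the number shown in its status bar is not stated in this thread.

```cpp
#include <cassert>
#include <vector>

// Average callback rate in Hz, given one timestamp (in seconds) per draw call.
// N timestamps span N - 1 frame intervals.
double measuredFps(const std::vector<double>& timestamps) {
    assert(timestamps.size() >= 2);
    const double elapsed = timestamps.back() - timestamps.front();
    assert(elapsed > 0.0);
    return static_cast<double>(timestamps.size() - 1) / elapsed;
}
```

If the callback were requested at 60 Hz but the display only refreshed at 30 Hz, a measurement like this would be expected to come out near 30 rather than 60.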

mithrendal commented 4 years ago

just pushed the latest hottest stuff 😎 look, there is no red border on the right anymore, I managed to clip it

[image]

bottom line: energy impact cannot be reduced without quality loss ...
- when skipping rendering, it is just 30 Hz and not as fluent as it could be
- when skipping execution of a frame, the emulation is a slower C64

maybe I did a bit for energy ... I skipped SDL_RenderClear(renderer); before I draw the texture, because the texture is rendered over the whole screen anyway, so there is apparently no need to clear the screen first ... and I only upload a clipped view of the screenbuffer, maybe that saves a little little little bit 🙄
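A clipped view of a pixel buffer can be described without copying anything, by pointing at the first visible pixel and keeping the row pitch of the full buffer. The C++ sketch below shows that idea only; `ClippedView` and `clipBuffer` are hypothetical names, and whether vc64web does it exactly this way is an assumption. A start pointer and a pitch in bytes are the kind of arguments SDL_UpdateTexture() accepts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Describes a rectangular cutout of a larger 32-bit pixel buffer (no copy).
struct ClippedView {
    const std::uint32_t* pixels;  // first pixel of the cutout
    int pitchBytes;               // bytes between row starts (full buffer width)
    int width;                    // cutout width in pixels
    int height;                   // cutout height in pixels
};

ClippedView clipBuffer(const std::uint32_t* buffer, int bufferWidth, int bufferHeight,
                       int x, int y, int width, int height) {
    assert(buffer != nullptr);
    assert(x >= 0 && y >= 0 && width > 0 && height > 0);
    assert(x + width <= bufferWidth && y + height <= bufferHeight);
    return ClippedView{
        buffer + static_cast<std::ptrdiff_t>(y) * bufferWidth + x,
        bufferWidth * static_cast<int>(sizeof(std::uint32_t)),
        width,
        height};
}
```

Only `width * height` pixels then need to be uploaded per frame instead of the whole buffer, which is where the small saving would come from.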

does the VIC have a property or way which can tell us whether the buffer has changed at all? If such a property exists, I can evaluate it and skip rendering of a frame depending on its value ... calculating an extra hash sum for this might be counterproductive

dirkwhoffmann commented 4 years ago

just pushed the latest hottest stuff

Download in progress ... 🤤

Yeah, I like what I see... a perfect texture cutout... and fullscreen support in all browsers 🥳

[Screenshot 2020-04-12 at 07.50.36]

The game asks me to press the joystick button. We don't have joystick support yet, do we? 🙄

Oh, look, other customer reviews are coming in...

[image: cat]

mithrendal commented 4 years ago

Oh yes man ... I already checked and tried out my controllers at https://html5gamepad.com/ ... they are supported by HTML5 in Firefox and Safari. According to this site, even iOS 13 Safari supports this. So it might even be possible to play on an iPad with a wirelessly connected Xbox or PlayStation controller 😄

next I have to implement the wasm_joystick ... interface should all be straight forward...

Does the VIC have a property or way which can tell us easily (without comparing the buffer with the old buffer) whether the screenbuffer has changed at all? Yes or No please... (I want to close the issue)

dirkwhoffmann commented 4 years ago

next I have to implement the wasm_joystick ... interface should all be straight forward...

Yes yes yes 🦾

whether the screenbuffer has changed at all ?

No, there is no chance of knowing that. VIC would have to do the comparison by itself (which is too time consuming). It simply renews the texture every frame.
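To make the cost concrete, here is a C++ sketch of what a hash-based change check would have to do. It is not part of VirtualC64; `hashFrame` and `frameChanged` are hypothetical helpers using the well-known 64-bit FNV-1a hash. Every pixel is read once per frame, which is the extra work in question, and a hash can in principle collide, so this would be a heuristic only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a over a 32-bit pixel buffer. Touches every pixel once per call.
std::uint64_t hashFrame(const std::uint32_t* pixels, std::size_t count) {
    assert(pixels != nullptr || count == 0);
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t pixel = pixels[i];
        for (int byte = 0; byte < 4; ++byte) {
            hash ^= (pixel >> (8 * byte)) & 0xFFu;  // mix in one byte
            hash *= 0x100000001b3ULL;               // FNV-1a prime
        }
    }
    return hash;
}

// Returns true when the hash differs from the previous frame's hash,
// and stores the new hash in lastHash for the next call.
bool frameChanged(const std::uint32_t* pixels, std::size_t count, std::uint64_t& lastHash) {
    const std::uint64_t current = hashFrame(pixels, count);
    const bool changed = (current != lastHash);
    lastHash = current;
    return changed;
}
```

Whether a pass like this costs less than simply uploading the texture would have to be measured; the thread only reports that the upload itself accounts for a few percent of CPU load.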

mithrendal commented 4 years ago

No, there is no chance of knowing that. VIC would have to do the comparison by itself (which is too time consuming). It simply renews the texture every frame.

that is ok, then we close this issue ... BTW I am happy with the performance in total ... but with the outlook of even gaming on a battery-driven device, smartphone or tablet, with a connected game controller, we should also always check for energy waste and prevent it where possible ... 😇

mithrendal commented 4 years ago

time for the performance parcours of some other C64 web versions ... who wins the race?

today's match: MacBook Pro 2007, Firefox, C64 blue start screen performance 😎

start your engines ... 🚙🚕🏎🚗

pure javascript (not fully compatible) at http://mborgbrant.github.io/c64js/ 90% load

vice2.4 codebase at https://vice.janicek.co/c64/index.html 103% load

wasm port of a not cycle-exact, therefore not fully compatible, C64 emulator: 37% load

compared to that, vC64web is a highly compatible energy saver with only 70% load

dirkwhoffmann commented 4 years ago

Wow, I didn't know there is a web version of VICE. Hmm... well... don't know what I did here... 🥴

[Screenshot 2020-04-15 at 15.19.36]

But the controls are looking good 🤠.

Good to see that VirtualC64 performs better than VICE. (But VICE has rasterlines, VirtualC64 has not).

mithrendal commented 4 years ago

Do you mean it has no scan line effect or this https://www.google.de/amp/s/digitalerr0r.net/2011/04/30/commodore-64-programming-6-raster-interrupts/amp/ ?

dirkwhoffmann commented 4 years ago

Do you mean it has no scan line effect

Yes. Maybe VICE would be slightly faster if the effect is disabled. In VirtualC64, all graphics effects are applied inside the GPU pipeline and can't be activated in VirtualC64web. But 103% to 70% is really something 😃. Forget about the non-cycle-exact C64 emulators... they are only capable of running very simple games.

mithrendal commented 4 years ago

I am curious whether vAmigaWeb can beat SAE performancewise ...

It might be a head to head race...

dirkwhoffmann commented 4 years ago

I am curious whether vAmigaWeb can beat SAE performancewise ...

Let's do a rough calculation...

Depending on the emulated game or demo, VirtualC64 usually runs at 3 MHz - 9 MHz in warp mode on my machine. vAmiga usually runs at 20 MHz - 60 MHz in warp mode (very rough estimates). Dividing by the native frequency results in a warp factor of approx. 3 - 9 in both cases. Therefore I expect vAmigaWeb to range at 70%, too. 😎
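The rough calculation can be written out as a small C++ sketch. The native clock values are rounded assumptions that are not stated in the thread (about 1 MHz for the C64 and about 7 MHz for the Amiga), and `warpFactor` is a hypothetical helper.

```cpp
#include <cassert>

// Assumed, rounded native clock frequencies in MHz (not taken from the thread).
const double kC64NativeMHz = 1.0;
const double kAmigaNativeMHz = 7.0;

// How many times faster than the real machine the emulator runs in warp mode.
double warpFactor(double warpMHz, double nativeMHz) {
    assert(nativeMHz > 0.0);
    return warpMHz / nativeMHz;
}
```

With these assumptions, 3 MHz and 9 MHz on the C64 give factors of 3 and 9, and 20 MHz and 60 MHz on the Amiga give roughly 2.9 and 8.6, which matches the "approx. 3 - 9 in both cases" estimate.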