stardot / b-em

An opensource BBC Micro emulator for Win32 and Linux
http://stardot.org.uk/forums/viewtopic.php?f=4&t=10823
GNU General Public License v2.0

b-em only hits ~90% MODE 7 and 75% MODE 1 of the speed of a beeb according to its title bar #136

Closed by richard-broadhurst 3 years ago

richard-broadhurst commented 3 years ago

A quick profile shows 60% of the time going into gfx, most of which is in Allegro, and 30% is in region = al_lock_bitmap(b, ALLEGRO_PIXEL_FORMAT_ARGB_8888, ALLEGRO_LOCK_READWRITE); The performance is the same for release and debug builds, and optimisations are O2. I'm using VS2019 and the head for today, the version with 1462 submits. I wonder if there is an easy way to get Allegro to double buffer and stop the lock taking so long. Before Allegro 5, b-em was easily capable of 200% on my i5. All options in b-em are off, including nula, which made no noticeable difference.

SteveFosdick commented 3 years ago

Ok, we may need to know a little more about what al_lock_bitmap is doing behind the scenes. My desktop machine is an i5-3570K CPU @ 3.40GHz. Running under Linux b-em reports 100% speed and uses about 23% CPU. Running it under Windows under VirtualBox on the same machine B-em also reports 100% speed and Windows Task Manager reports B-Em as using about 25% CPU.

I'll check out the documentation. Can you see from the profiling info whether it is CPU limited in al_lock_bitmap or whether it seems to be waiting for something?

There is a quirk that may be relevant to this: on Linux, if I completely obscure the window B-Em is running in, it slows down. This is noticeable when single stepping in the debugger, where the time to execute a particular instruction suddenly becomes noticeable rather than immediate. Does it make any difference if you have it fully visible, fully obscured or partially obscured?

richard-broadhurst commented 3 years ago

Partially obscuring the window makes no difference on Windows.

Changing the al_lock_bitmap() at the bottom of blit_screen() to ALLEGRO_LOCK_WRITEONLY puts it back to 100% (approx 120% available).

I haven't checked if this is safe, but as there should be nothing kept from the previous frame, it should be OK. This is a standard hardware accelerated graphics issue, where locking for read+write kills performance. Locking WRITEONLY (Lock Discard in DirectX) allows the driver to give you a new bit of memory while the hardware continues using the old one; downside, you need to write every pixel.
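The trade-off described above can be sketched with a toy model in plain C. This is only an illustration under stated assumptions, not how Allegro or any driver is actually implemented: the `toy_texture` type, the `toy_*` functions and the `POISON` marker are all hypothetical names invented here. The model assumes a read+write lock has to copy the current pixels back from the hardware before handing out a pointer, while a write-only lock hands out a buffer with unspecified contents and only uploads on unlock.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical types and functions, not part of Allegro,
   DirectX or b-em. */

enum lock_mode { LOCK_READWRITE, LOCK_WRITEONLY };

/* Marker standing in for "unspecified contents" after a write-only lock. */
#define POISON 0xDEADBEEFu

struct toy_texture {
    uint32_t *gpu;              /* pixels "owned by the hardware" */
    uint32_t *staging;          /* CPU-visible buffer handed out by a lock */
    size_t    npixels;
    size_t    bytes_downloaded; /* hardware -> CPU traffic caused by locks */
    size_t    bytes_uploaded;   /* CPU -> hardware traffic caused by unlocks */
};

/* Returns 0 on success, -1 if memory could not be allocated. */
int toy_init(struct toy_texture *t, size_t npixels)
{
    t->gpu = calloc(npixels, sizeof *t->gpu);
    t->staging = calloc(npixels, sizeof *t->staging);
    t->npixels = npixels;
    t->bytes_downloaded = 0;
    t->bytes_uploaded = 0;
    if (!t->gpu || !t->staging) {
        free(t->gpu);
        free(t->staging);
        t->gpu = t->staging = NULL;
        return -1;
    }
    return 0;
}

void toy_free(struct toy_texture *t)
{
    free(t->gpu);
    free(t->staging);
    t->gpu = t->staging = NULL;
}

uint32_t *toy_lock(struct toy_texture *t, enum lock_mode mode)
{
    assert(t && t->gpu && t->staging);
    size_t bytes = t->npixels * sizeof *t->gpu;
    if (mode == LOCK_READWRITE) {
        /* The caller may read old pixels, so they must be fetched first. */
        memcpy(t->staging, t->gpu, bytes);
        t->bytes_downloaded += bytes;
    } else {
        /* No readback: the previous frame is not available to the caller. */
        for (size_t i = 0; i < t->npixels; i++)
            t->staging[i] = POISON;
    }
    return t->staging;
}

void toy_unlock(struct toy_texture *t)
{
    assert(t && t->gpu && t->staging);
    size_t bytes = t->npixels * sizeof *t->gpu;
    /* Whatever is in the staging buffer becomes the new texture contents. */
    memcpy(t->gpu, t->staging, bytes);
    t->bytes_uploaded += bytes;
}
```

In this model a write-only lock followed by a full-frame write never touches `bytes_downloaded`, which is the saving being described. The cost is that any pixel the caller skips ends up as `POISON` in `gpu` after `toy_unlock`, which is why every pixel has to be written.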

SteveFosdick commented 3 years ago

Thanks for that. I did experiment with that and it didn't make any difference on either of the two setups I can test (Linux or Windows 10 on VirtualBox), but if it is a useful improvement for you then we should adopt it.

The only thing I can think of to test first is whether the various display modes (interlace, scanlines, line-double) all work correctly with this setting. I'll run a quick test, but you obviously have a setup that I don't have: Windows 7 on bare metal.

SteveFosdick commented 3 years ago

That tests out fine on my two setups. Also does changing line 772 of video.c to add ALLEGRO_NO_PRESERVE_TEXTURE, so it becomes

    al_set_new_bitmap_flags(ALLEGRO_VIDEO_BITMAP|ALLEGRO_NO_PRESERVE_TEXTURE);

make any difference? I don't mean instead of ALLEGRO_LOCK_WRITEONLY, but as well as. Is the maximum speed any higher?

SteveFosdick commented 3 years ago

I believe this is fixed by 2cefbfab45dc3c2e6b46332cfc953c308a48d14d