Closed richard-broadhurst closed 3 years ago
OK, we may need to know a little more about what al_lock_bitmap is doing behind the scenes. My desktop machine is an i5-3570K CPU @ 3.40GHz. Running under Linux, B-Em reports 100% speed and uses about 23% CPU. Running under Windows in VirtualBox on the same machine, B-Em also reports 100% speed and Windows Task Manager reports B-Em as using about 25% CPU.
I'll check out the documentation. Can you see from the profiling info whether it is CPU limited in al_lock_bitmap or whether it seems to be waiting for something?
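One rough way to tell "CPU limited" from "waiting" without a full profiler is to compare wall-clock time with the calling thread's CPU time around the suspect call. The sketch below assumes a POSIX system (clock_gettime with CLOCK_MONOTONIC and CLOCK_THREAD_CPUTIME_ID), so it fits the Linux build and not the Windows one, where a different timing API would be needed. The names call_timing, time_call, busy_work and idle_wait are hypothetical and are not part of B-Em or Allegro. The two workloads only stand in for a call such as al_lock_bitmap.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Wall-clock and thread-CPU seconds spent inside one call. */
typedef struct {
    double wall_s;
    double cpu_s;
} call_timing;

/* Read the given clock and return its value in seconds. */
static double now_s(clockid_t id)
{
    struct timespec t;
    clock_gettime(id, &t);
    return (double)t.tv_sec + (double)t.tv_nsec / 1e9;
}

/* Run fn(arg) once and report how long it took on both clocks. */
call_timing time_call(void (*fn)(void *), void *arg)
{
    call_timing r;
    double w0 = now_s(CLOCK_MONOTONIC);
    double c0 = now_s(CLOCK_THREAD_CPUTIME_ID);
    fn(arg);
    r.cpu_s = now_s(CLOCK_THREAD_CPUTIME_ID) - c0;
    r.wall_s = now_s(CLOCK_MONOTONIC) - w0;
    return r;
}

/* Stand-in for a CPU-bound call: spin for *arg milliseconds. */
void busy_work(void *arg)
{
    int ms = *(int *)arg;
    double end = now_s(CLOCK_MONOTONIC) + ms / 1000.0;
    volatile unsigned long spin = 0;
    while (now_s(CLOCK_MONOTONIC) < end)
        spin++;
}

/* Stand-in for a blocked call: sleep for *arg milliseconds. */
void idle_wait(void *arg)
{
    int ms = *(int *)arg;
    struct timespec d = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&d, NULL);
}
```

If cpu_s is close to wall_s, the thread was busy for the whole call. If cpu_s is much smaller, the thread was blocked on something else. One caveat: a driver that spins while waiting for the hardware would look CPU limited by this measure, so the result is only a hint.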
There is a quirk that may be relevant to this: on Linux, if I completely obscure the window B-Em is running in, it slows down. This is noticeable when single-stepping in the debugger, where the time to execute a particular instruction suddenly becomes noticeable rather than immediate. Does it make any difference if you have it fully visible, fully obscured or partially obscured?
Partially obscuring the window makes no difference on Windows.
Changing the al_lock_bitmap() at the bottom of blit_screen() to ALLEGRO_LOCK_WRITEONLY puts it back to 100% (approx 120% available).
I haven't checked if this is safe, but as there should be nothing kept from the previous frame, it should be OK. This is a standard hardware accelerated graphics issue, where locking for read+write kills performance. Locking WRITEONLY (Lock Discard in DirectX) allows the driver to give you a new bit of memory while the hardware continues using the old one; downside, you need to write every pixel.
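As a rough sketch of what "write every pixel" means in practice: with a write-only lock the previous contents of the region are not guaranteed, so each frame has to fill all of it, and the copy has to respect the pitch, which can be larger than the row width or negative. The struct below is only a stand-in that mirrors the data, pitch and pixel_size fields of Allegro's ALLEGRO_LOCKED_REGION so the example compiles without Allegro. write_full_frame is a hypothetical helper, not B-Em's actual blit code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for ALLEGRO_LOCKED_REGION (format field omitted). */
typedef struct {
    void *data;     /* first byte of row 0 */
    int pitch;      /* bytes from one row to the next; may be negative */
    int pixel_size; /* bytes per pixel, 4 for ARGB_8888 */
} locked_region;

/* Copy a complete w*h frame of 32-bit pixels into the locked region.
 * Every pixel of every row is written, so nothing depends on what the
 * region held before the lock. */
void write_full_frame(const locked_region *lr, const uint32_t *src,
                      int w, int h)
{
    for (int y = 0; y < h; y++) {
        unsigned char *dst =
            (unsigned char *)lr->data + (ptrdiff_t)y * lr->pitch;
        memcpy(dst, src + (size_t)y * (size_t)w,
               (size_t)w * sizeof(uint32_t));
    }
}
```

In real code the region would come from al_lock_bitmap(b, ALLEGRO_PIXEL_FORMAT_ARGB_8888, ALLEGRO_LOCK_WRITEONLY) and be released with al_unlock_bitmap(b). Any code path that updates only part of the frame would need checking before relying on this.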
Thanks for that. I did experiment with that and it didn't make any difference on either of the two setups I can test (Linux, or Windows 10 on VirtualBox), but if it is a useful improvement for you then we should adopt it.
The only thing I can think of to test first is whether the various display modes (interlace, scanlines, line-double) all work correctly with this setting. I'll run a quick test, but you obviously have a setup that I don't have: Windows 7 on bare metal.
That tests out fine on my two setups. Also, does changing line 772 of video.c to add ALLEGRO_NO_PRESERVE_TEXTURE, so it becomes
al_set_new_bitmap_flags(ALLEGRO_VIDEO_BITMAP|ALLEGRO_NO_PRESERVE_TEXTURE);
make any difference? I mean in addition to ALLEGRO_LOCK_WRITEONLY, not instead of it. Is the maximum speed any higher?
I believe this is fixed by 2cefbfab45dc3c2e6b46332cfc953c308a48d14d
A quick profile shows 60% of the time going into gfx, most of which is in Allegro, and 30% is in region = al_lock_bitmap(b, ALLEGRO_PIXEL_FORMAT_ARGB_8888, ALLEGRO_LOCK_READWRITE); The performance is the same for release and debug builds, and optimisations are O2. I'm using VS2019 and today's head, the version with 1462 submits. I wonder if there is an easy way to get Allegro to double buffer and stop the lock taking so long. Before Allegro 5, B-Em was easily capable of 200% on my i5. All options in B-Em are off, including NuLA, which made no noticeable difference.