cdev-tux / q3lite

Q3lite, an OpenGL ES port of Quake III Arena for embedded Linux systems.
GNU General Public License v3.0

Building on Mali platform #6

Closed hissingshark closed 6 years ago

hissingshark commented 6 years ago

Hi, I have an armv8 board (same CPU as the RPi3) with a Mali GPU. SDL2 (2.0.8) is built and working with GLES2 in framebuffer, for various emulators. As this isn't Broadcom I've got libEGL.so, libGLESv1_CM.so and libGLESv2.so to build against.

I still have my old PC copy of Q3A and I'm very curious to see if I can get this port running.

Questions:

  1. I seem to have 2 client builds, both of which I've tried. At runtime I'm getting:

./quake3_opengl2.aarch64

    Compiled with SDL v2.0.8
    Linking against SDL v2.0.8
    ----- Client Initialization Complete -----
    ----- R_Init -----
    SDL using driver "mali"
    Initializing OpenGL display
    Display aspect: 1.778
    ...setting mode -2: 1920 1080
    Trying to get an OpenGL 3.2 core context
    SDL_GL_CreateContext failed: Could not create EGL context (call to eglCreateContext failed, reporting an error of EGL_BAD_ATTRIBUTE)
    Reverting to default context
    ----- Client Shutdown (Client fatal crashed: Unsupported OpenGL Version: OpenGL ES 2.0

or with

./quake3.aarch64

    Compiled with SDL v2.0.8
    Linking against SDL v2.0.8
    ----- Client Initialization Complete -----
    ----- R_Init -----
    SDL using driver "mali"
    Initializing OpenGL display
    Display aspect: 1.778
    ...setting mode 3: 640 480
    ----- Client Shutdown (Client fatal crashed: Unsupported OpenGL Version: OpenGL ES 2.0

Which is the intended binary?

  2. The wiki states "Q3lite is currently compatible with SDL2-2.0.4 only. Newer versions of SDL2 will have subtle issues due to incompatibilities with the Q3lite OpenGL ES 1.1 renderer." That's fine, I can build an older version of SDL2 (I've been building it since 2.0.2!), but does this mean it only supports GLES 1.1 and not GLES 2? I thought I'd give it a try as there seemed to be a build option for it in make-raspberrypi.sh, which I borrowed as BUILD_RENDERER_OPENGL2=1.

  3. Should I be using the includes I built SDL2 with, or the ones you've supplied under code/SDL2/include?

Many thanks for any advice.

hissingshark commented 6 years ago

OK, I've found the check in the source that gives the error, so GLESv2 is out. But it's odd that it's not picking up ES1. I've just double-checked my SDL2 build.

    SDL2 Configure Summary:
    Building Shared Libraries
    Building Static Libraries
    Enabled modules : atomic audio video render events joystick haptic power filesystem threads timers file loadso cpuinfo assembly
    Assembly Math   :
    Audio drivers   : disk dummy oss alsa
    Video drivers   : dummy opengl_es1 opengl_es2 vulkan mali
    Input drivers   : linuxev linuxkd
    Using libsamplerate : YES
    Using libudev       : YES
    Using dbus          : YES
    Using ime           : YES
    Using ibus          : YES
    Using fcitx         : YES

cdev-tux commented 6 years ago

Hello, I’d be happy to help you out. I had a very early version of Q3lite running on an Odroid XU4 (mali), although it wasn’t fully functional.

  1. As you determined, the detection routine is not properly detecting your version of OpenGL ES, so it throws an error. You’ll need to modify the detection routine to identify your OpenGL ES version string and tell it to load OpenGL ES 1.1 functions at run time. I’m assuming that you added the mali driver to SDL, is that correct? If so, you’ll need to verify that it’s compatible with OpenGL ES 1.1. The OpenGL ES renderer in Q3lite currently supports OpenGL ES 1.1 only.
  2. I need to update the wiki to correct the information on SDL version compatibility. Q3lite still requires SDL 2.0.4, but that's because of a bug in SDL, not an incompatibility with the OpenGL ES 1.1 renderer. There’s a pending bug report with SDL to fix the issue, but it’s stalled at the moment. The issue with SDL versions newer than 2.0.4 is that after you press either the Alt or Ctrl key in-game, you won’t be able to type any text into the console or chat (say command). I suspect that this issue affects Linux versions of SDL on other platforms as well. The BUILD_RENDERER_OPENGL2 setting can’t be used, as it refers to the new renderer in ioquake3 that requires full OpenGL support. https://github.com/ioquake/ioq3/blob/master/opengl2-readme.md
  3. If you want Q3lite to use the SDL libraries and includes installed on your system rather than the 2.0.4 compatible ones, edit the make-raspberrypi.sh file and change the following setting near the end of the file: Q3LITE_INSTALL_SDL=0. Q3lite will then use the SDL libraries and includes installed in your system directories /usr/local/lib and /usr/local/include/SDL2. You can verify which libraries are being used at compile time by looking at the CLIENT_LIBS section when Q3lite begins to compile. You can change the setting back to 1 if you want Q3lite to revert to using the compatible SDL 2.0.4 libraries. The version of the SDL includes should match the version of the SDL library that you’re using. The include files in the Q3lite source directory code/SDL2/include are for use with SDL 2.0.4.
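As a rough illustration of the version-string check described in item 1, here is a minimal sketch in C. It assumes the driver reports the usual OpenGL ES form of the version string (for example "OpenGL ES-CM 1.1" for ES 1.1, or "OpenGL ES 2.0"). The helper names parse_gles_version and is_gles1 are hypothetical and are not Q3lite's actual detection routine; in a real renderer the string would come from glGetString(GL_VERSION).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Q3lite code): extracts major/minor from a
 * GL_VERSION string of the form "OpenGL ES[-CM|-CL] <major>.<minor> ...".
 * Returns false for NULL, desktop OpenGL strings, or malformed input. */
bool parse_gles_version(const char *version, int *major, int *minor)
{
    static const char prefix[] = "OpenGL ES";
    const char *p;

    if (version == NULL || strncmp(version, prefix, sizeof(prefix) - 1) != 0)
        return false;

    /* Skip the optional profile suffix ("-CM", "-CL") and any spaces
     * by advancing to the first digit. */
    p = version + sizeof(prefix) - 1;
    while (*p != '\0' && (*p < '0' || *p > '9'))
        p++;

    return sscanf(p, "%d.%d", major, minor) == 2;
}

/* Hypothetical helper: true when the string names an OpenGL ES 1.x context,
 * which is what the Q3lite ES renderer is said to support in this thread. */
bool is_gles1(const char *version)
{
    int major = 0, minor = 0;
    return parse_gles_version(version, &major, &minor) && major == 1;
}
```

With this sketch, is_gles1("OpenGL ES-CM 1.1") is true, while is_gles1("OpenGL ES 2.0") is false, which mirrors the "Unsupported OpenGL Version" case in the log above.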

I should also mention that you’ll need to modify Makefile.q3lite to point to your EGL/GLES library names and their correct pathnames.

Let me know if you have any additional questions. Hope that helps.

hissingshark commented 6 years ago

Thanks for getting back to me. All helpful insights. I shall make those suggested changes as a matter of form and it sounds like 2.0.8 will do then.

In the meantime I'd rebuilt SDL2 @ 2.0.4 with only --enable-video-gles1 and --disable-video-gles2 to force the issue. That circumvented the check issue and brings us to this:

    QKEY found.
    Compiled with SDL v2.0.4
    Linking against SDL v2.0.4
    ----- Client Initialization Complete -----
    ----- R_Init -----
    SDL using driver "mali"
    Initializing OpenGL display
    Display aspect: 1.778
    ...setting mode 3: 640 480
    Using 16 color bits, 16 depth, 8 stencil display.
    Available modes: '1920x1080'
    GL_RENDERER: Mali-450 MP
    Initializing OpenGL extensions
    ...GL_EXT_texture_compression_s3tc not found
    ...GL_S3_s3tc not found
    ...GL_EXT_texture_env_add not found
    ...GL_ARB_multitexture not found
    ...GL_EXT_compiled_vertex_array not found
    ...GL_EXT_texture_filter_anisotropic not found
    tty]Segmentation fault

Those appear to be GL extensions tested for by code/sdl/sdl_glimp.c but I do not find them in my include/GLES/glext.h. Perhaps they are optional/vendor-specific? That would be a bigger problem I fear.

hissingshark commented 6 years ago

Sorry, I'm monologuing.

I need to start afresh with the build configuration:

    #ifdef HAVE_GLES
    #define R_MODE_FALLBACK -2 // Desktop resolution
    #else
    #define R_MODE_FALLBACK 3 // 640 * 480
    #endif

Then why do I get

...setting mode 3: 640 480 ... Available modes: '1920x1080'

Because HAVE_GLES hasn't been defined. I'm going to re-clone the repo and take it from the top.

EDIT: Those #ifdef guards would also affect at least some of the failed extension tests being performed...

cdev-tux commented 6 years ago

I have a few questions that will help me understand what environment you’re working with. Which model of embedded computer are you using? Does the video chip use tiled rendering like the Pi does?

I would try compiling SDL 2.0.4 with the following configure flags:

    ./configure --host=arm-raspberry-linux-gnueabihf \
                --enable-alsa \
                --disable-alsa-shared \
                --disable-pulseaudio \
                --disable-esd \
                --disable-video-mir \
                --disable-video-wayland \
                --disable-video-x11 \
                --disable-video-vivante \
                --disable-video-gles2 \
                --disable-video-opengl

You won’t find the GL extensions in the include/GLES/glext.h header file because they’re full OpenGL extensions. The ioquake3 code that Q3lite is based on runs full OpenGL by default. I added an OpenGL ES renderer from another project to create this fork. So that’s why you’ll see the ‘not found’ messages, which are harmless and can be safely ignored. Some of those extensions may be used in the future as the Raspberry Pi transitions to the new VC4 OpenGL driver.

The ‘HAVE_GLES’ preprocessor macro is defined at compile time in the Makefile.q3lite file. When Q3lite begins to compile it will list the compiler settings. You should see -DHAVE_GLES under the CFLAGS section, which confirms that it’s set.
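Besides reading the CFLAGS output, one way to confirm whether -DHAVE_GLES actually reached the compiler is to have the program report it. The sketch below repeats the R_MODE_FALLBACK conditional quoted earlier in this thread; the functions built_with_gles and print_build_config are hypothetical helpers for illustration and do not exist in Q3lite.

```c
#include <stdio.h>

/* Same conditional as the snippet quoted from the source earlier in the thread. */
#ifdef HAVE_GLES
#define R_MODE_FALLBACK -2 /* desktop resolution */
#else
#define R_MODE_FALLBACK 3  /* 640 x 480 */
#endif

/* Hypothetical helper: returns 1 if this file was compiled with -DHAVE_GLES. */
int built_with_gles(void)
{
#ifdef HAVE_GLES
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: prints the build-time choice so a mismatch between
 * the Makefile and the running binary is visible at startup. */
void print_build_config(void)
{
    printf("HAVE_GLES: %s, R_MODE_FALLBACK: %d\n",
           built_with_gles() ? "defined" : "not defined",
           R_MODE_FALLBACK);
}
```

Compiling the same file with and without -DHAVE_GLES on the compiler command line switches the reported fallback between -2 and 3, which matches the "setting mode 3: 640 480" symptom seen when the macro is missing.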

I’m not sure why it’s defaulting to 640 x 480 resolution; it may be because SDL wasn't compiled with the --disable-video-opengl flag. The resolution might be set elsewhere in the code because it’s trying to load OpenGL instead of OpenGL ES.

If your video chip uses tile based rendering, then the GL_ARB_multitexture extension will need to be enabled in code/sdl/sdl_glimp.c. The multitexture detection code in sdl_glimp.c has been modified to work with the Raspberry Pi, and may need to be changed to function with other video chips. You can hard-code the number of texture units for testing purposes. The Pi uses 4 texture units.

Give those things a try and let me know how it goes.

Thanks

hissingshark commented 6 years ago

So I've done a --hard reset and rebuilt my SDL2 v2.0.8 (I'll just try not to press Alt or Ctrl for now). I've overridden the Makefile.q3lite settings with:

    CFLAGS = -DHAVE_GLES -I/opt/vero3/include -O3 -march=armv8-a+crc -mtune=cortex-a53 -mfpu=neon-fp-armv8 -mfloat-abi=hard -ftree-vectorize -funsafe-math-optimizations
    LDFLAGS = -lpthread -L/opt/vero3/lib -lEGL -lGLESv2

It builds. It runs.

    QKEY found.
    Compiled with SDL v2.0.8
    Linking against SDL v2.0.8
    ----- Client Initialization Complete -----
    ----- R_Init -----
    SDL using driver "mali"
    Initializing OpenGL display
    Display aspect: 1.778
    ...setting mode -2: 1920 1080
    Using 16 color bits, 16 depth, 8 stencil display.
    Available modes: '1920x1080'
    GL_RENDERER: Mali-450 MP
    Initializing OpenGL extensions
    ...GL_EXT_texture_compression_s3tc not found
    ...GL_S3_s3tc not found
    ...using GL_EXT_texture_env_add
    ...using GL_ARB_multitexture (8 texture units)
    ...GL_EXT_compiled_vertex_array not found
    ...GL_EXT_texture_filter_anisotropic not found
    Initializing Shaders

    GL_VENDOR: ARM
    GL_RENDERER: Mali-450 MP
    GL_VERSION: OpenGL ES-CM 1.1
    GL_EXTENSIONS: GL_OES_byte_coordinates GL_OES_fixed_point GL_OES_single_precision GL_OES_matrix_get GL_OES_read_format GL_OES_compressed_paletted_texture GL_OES_point_size_array GL_OES_point_sprite GL_OES_texture_npot GL_OES_vertex_array_object GL_OES_query_matrix GL_OES_matrix_palette GL_OES_extended_matrix_palette GL_OES_compressed_ETC1_RGB8_texture GL_EXT_compressed_ETC1_RGB8_sub_texture GL_OES_EGL_image GL_OES_draw_texture GL_OES_depth_texture GL_OES_packed_depth_stencil GL_EXT_texture_format_BGRA8888 GL_OES_framebuffer_object GL_OES_stencil8 GL_OES_depth24 GL_ARM_rgba8 GL_OES_EGL_image_external GL_OES_EGL_sync GL_OES_rgb8_rgba8 GL_EXT_multisampled_render_to_texture GL_OES_texture_cube_map GL_EXT_discard_framebuffer GL_EXT_robustness GL_OES_depth_texture_cube_map GL_OES_vertex_half_float GL_KHR_debug GL_OES_mapbuffer
    GL_MAX_TEXTURE_SIZE: 4096
    GL_MAX_TEXTURE_UNITS_ARB: 8

    PIXELFORMAT: color(16-bits) Z(16-bit) stencil(8-bits)
    MODE: -2, 1920 x 1080 fullscreen hz:N/A
    GAMMA: software w/ 0 overbright bits
    rendering primitives: single glDrawElements
    texturemode: GL_LINEAR_MIPMAP_NEAREST
    picmip: 1
    texture bits: 0
    multitexture: enabled
    compiled vertex arrays: disabled
    texenv add: enabled
    compressed textures: disabled
    ----- finished R_Init -----
    ------ Initializing Sound ------
    SDL_Init( SDL_INIT_AUDIO )... OK
    SDL audio driver is "alsa".
    SDL_AudioSpec:
    Format: AUDIO_S16LSB
    Freq: 32000
    Samples: 512
    Channels: 2
    Starting SDL audio callback...
    SDL audio initialized.
    ----- Sound Info -----
    1 stereo
    16384 samples
    16 samplebits
    1 submission_chunk
    32000 speed
    0xac94f0a0 dma buffer
    No background file.

    Sound initialization successful.

    Sound memory manager started
    Loading vm file vm/ui.qvm...
    File "vm/ui.qvm" found at "./baseq3"
    ...which has vmMagic VM_MAGIC_VER2
    Loading 1173 jump table targets
    Architecture doesn't have a bytecode compiler, using interpreter
    ui loaded in 2469888 bytes on the hunk
    35 arenas parsed
    32 bots parsed
    --- Common Initialization Complete ---
    IP: 127.0.0.1
    IP: 192.168.1.120
    IP6: ::1
    Opening IP6 socket: [::]:27960
    Opening IP socket: 0.0.0.0:27960

I expect that answers most of your questions about platform, but it's a Vero4k, the latest OSMC box with an AMLogic S905x - Cortex-A53 - Mali 450.

Now I need to see how it works...

hissingshark commented 6 years ago

Good work! That was a rush! Plays very nicely. Even tried the online multiplayer.

It doesn't look like there's any native joypad support, so I need to bring out old xboxdrv and see if that works.

Timedemo gave 56FPS at 1080p. If I ran this on a 4K TV I assume there'd be a massive performance hit? Would upscaling from 1080p be possible? Not sure how the modes are implemented.
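For a sense of scale on the 4K question, the arithmetic can be sketched as below. It assumes "4K" means 3840 x 2160 and, as a deliberately crude model, that the frame rate scales inversely with the number of pixels drawn. Neither assumption is established in this thread, and the function names are hypothetical.

```c
/* Returns how many times more pixels the target mode has than the base mode. */
double pixel_ratio(int base_w, int base_h, int target_w, int target_h)
{
    return ((double)target_w * (double)target_h) /
           ((double)base_w * (double)base_h);
}

/* Crude estimate: assumes rendering cost is proportional to pixel count,
 * which ignores CPU-bound work and any driver or display limits. */
double estimate_fps(double base_fps, double ratio)
{
    return base_fps / ratio;
}
```

Going from 1920 x 1080 to 3840 x 2160 gives a ratio of 4, so under this model the 56 fps timedemo result would fall to roughly 14 fps. Rendering at 1080p and letting the display scale the image avoids that cost.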

cdev-tux commented 6 years ago

Excellent work, very nice!

You can change joystick settings in the supplied autoexec.cfg file.

I think you can run the game at 1080P and your TV should upscale it, but I’m not sure about that. I would be interested to know if you’re getting the Alt/Ctrl key bug on that platform.

After you have time to work out all of the kinks, feel free to submit patches and I’ll try to add native support for your platform to Q3lite, and give you credit.

Thanks

cdev-tux commented 6 years ago

I just noticed that you said ‘joypad’ and not ‘joystick’. I seem to remember seeing joypad references in the source code, so I think it’s possible to get it working.

Q3lite runs at ~97 fps on a Pi 3 at 1080P, so I think that the performance can be improved. I remember reading that enabling neon instructions at compile time can slow things down in some cases due to overhead in implementing those instructions, so you may be able to adjust the compiler settings to speed things up. Can’t wait to try this on my Odroid XU4!

Thanks again for your work on this.

hissingshark commented 6 years ago

If I understood your NEON remarks correctly I tried changing out my -mfpu=neon-fp-armv8 for the -mfpu=vfpv4 used on the RPi3, but still sitting at about 55fps.

Then I noticed in my above console output:

Architecture doesn't have a bytecode compiler, using interpreter

Aha I thought! That sounds slow, and spent all day trying to get it to build with the vm_armv7l.o. Success at last, but the fps hasn't budged.

I've built and run it on the RPi3 using the supplied script and that only comes out at 64fps. How odd. Might some common factor account for my results? But of course that raises more questions...

FYI I'm benchmarking by starting Quake from the command line using: ./quake3.armv7l +set timedemo 1 +demo four

hissingshark commented 6 years ago

Oh and I've got the same Alt/Ctrl bug as you, disabling the internal console.

hissingshark commented 6 years ago

Odd thing. I upped the settings under Setup -> System -> Graphics:

    Lighting: lightmap -> vertex
    Texture Detail: 75 -> 100%
    Texture Quality: 16 -> 32 bit
    Texture Filter: bilinear -> trilinear

And my fps went UP to 59 fps.

Actually Lightmap is better than vertex. Reverting that but keeping all the others at max dropped me to 53 fps. But it does look way better.

cdev-tux commented 6 years ago

You correctly understood my neon comment. I’m not sure what the correct -mfpu setting is for your processor, but it might be worth investigating.

As you probably determined, the BUILD_GAME_SO=1 setting in make-raspberrypi.sh is needed to have the shared object files compiled. To get Quake3 to use them instead of the QVMs requires the +set vm_ui 1 and +set vm_cgame 1 command line settings, and +set vm_game 1 if you’re running a server. You can also try setting those values to 0. I remember getting a 2 or 3 fps boost with the .so files on a Pi 3.

You can probably gain some fps by testing the various video settings in the autoexec.cfg file like you have been. The settings in that file have been optimized for the Pi, but may not be optimal for the mali.

Glad to see that you have it working; have fun with it.

hissingshark commented 6 years ago

tl;dr: I cannot breach 59.9 fps, even with textures turned off! It seems to me there's an imposed 60 fps limit. A coincidence since the TV refresh rate is at 60Hz?

Very much having fun, thank you. I've started to look into the gamepad situation, but that's going to be a slow burner. Particularly as I'm a bit concerned that playing with a gamepad will be so much less accurate/responsive than a mouse that it'll be unplayable against anyone without the same handicap. I assume gameplay in the Dreamcast port was modified to compensate, maybe a degree of auto-aim, but doing that here would qualify as a cheat/hack, so preventing online play. The best I could do is to look at response curves for the analogue stick, rather than a linear response.
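To make the response-curve idea concrete, here is a minimal sketch in C of one common shaping approach: a deadzone followed by a blend between a linear and a cubic response. This is an illustration only; the function name stick_response and its parameters are hypothetical, and nothing here comes from Q3lite or ioquake3.

```c
/* Hypothetical helper: maps a raw stick deflection in [-1, 1] to a shaped
 * output in [-1, 1].
 *   deadzone: deflections at or below this magnitude produce 0.
 *   expo:     0.0 gives a purely linear response, 1.0 a purely cubic one;
 *             values in between give finer control near the centre while
 *             still reaching full speed at full deflection. */
float stick_response(float raw, float deadzone, float expo)
{
    float sign = (raw < 0.0f) ? -1.0f : 1.0f;
    float mag = raw * sign; /* absolute value of the deflection */
    float t;

    /* Clamp inputs to sane ranges. */
    if (mag > 1.0f) mag = 1.0f;
    if (deadzone < 0.0f) deadzone = 0.0f;
    if (deadzone > 0.99f) deadzone = 0.99f;
    if (expo < 0.0f) expo = 0.0f;
    if (expo > 1.0f) expo = 1.0f;

    if (mag <= deadzone)
        return 0.0f;

    /* Rescale so the output starts from 0 at the edge of the deadzone. */
    t = (mag - deadzone) / (1.0f - deadzone);

    /* Blend linear and cubic terms; both are 0 at t = 0 and 1 at t = 1. */
    return sign * ((1.0f - expo) * t + expo * t * t * t);
}
```

Because the curve only reshapes the player's own input and never aims for them, it stays on the "input tuning" side of the line rather than the auto-aim side discussed above.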

Regarding performance, I don't think you'll ever notice the lower-than-RPi3 fps, as 59 is still excellent. But on principle it should be achievable. I tried -mfpu=fp-armv8, as that's non-NEON for this processor, but no change.

I'd overlooked some of the command line options used on the RPi as well, no doubt reverting to defaults. My board has 2 GB of RAM, so I tried the RPi3 settings and then upped them gradually to the following, but that too had no effect. Strange?

./quake3.armv7l +set com_hunkMegs 512 +set com_zoneMegs 128 +set com_soundMegs 32 +set timedemo 1 +demo four

Now tried the .so options you mention, but no effect.

Editing autoexec.cfg I've managed to drop the fps with only a few options, made it look prettier with others - but I cannot breach 59.9 fps, even with textures turned off! It seems to me there's an imposed 60 fps limit. A coincidence since the TV refresh rate is at 60Hz?

cdev-tux commented 6 years ago

Try setting +set r_displayrefresh 125 on the command line and see if that helps. Also, check to see if you can turn off Vertical Sync (vsync) on your platform. Here’s a link to the command line settings used with the Q3lite startup script. (Scroll right to see all of the settings).

https://github.com/cdev-tux/q3lite/blob/dev/misc/q3lite/pi/q3lite#L41

The fps goal that many people shoot for is 125 fps, because you can jump a little bit higher at that fps. But maintaining that rate on larger maps requires a much higher timedemo result. At 720P on a Pi 3 I get ~150 fps on timedemo four, but gameplay still drops below 125 fps on large maps with a lot of action going on. So for the Pi, I think it will require around 170 to 180 fps on timedemo four to maintain 125 fps during gameplay.
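The figures above can be put side by side with a little arithmetic. The sketch below is only an illustration of the numbers quoted in this comment; the function names are hypothetical.

```c
/* Time available to produce one frame, in milliseconds, at a given rate. */
double frame_budget_ms(double fps)
{
    return 1000.0 / fps;
}

/* Ratio of a benchmark (timedemo) result to the frame rate you want to hold
 * during real gameplay; larger values mean more headroom for busy scenes. */
double headroom(double timedemo_fps, double target_fps)
{
    return timedemo_fps / target_fps;
}
```

A 125 fps target leaves 8 ms per frame, whereas a 60 Hz display refresh allows about 16.7 ms. The ~150 fps timedemo result corresponds to a headroom of 1.2, which was reported as not enough, and the suggested 170 to 180 fps corresponds to roughly 1.36 to 1.44.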

I hope the above settings work, let me know how it goes.

Thanks

hissingshark commented 6 years ago

You were right! I've today had it confirmed that vsync is presently locked down on this platform, hence the 60Hz cap. There might be a patch to make this switchable in due course.