ptitSeb / gl4es

GL4ES is an OpenGL 2.1/1.5 to GL ES 2.0/1.1 translation library, with support for Pandora, ODroid, OrangePI, CHIP, Raspberry PI, Android, Emscripten and AmigaOS4.
http://ptitseb.github.io/gl4es/
MIT License

amigaos4: first public release #76

Closed kas1e closed 5 years ago

kas1e commented 5 years ago

Howdy !

I want to make a first public release of gl4es for amigaos4, and before I do so I want to ask: can you fix the latest issue with CMake, where "-g" is added in every case? I.e. when I use the "CMAKE_BUILD_TYPE" variable, any of "Release", "Debug", "RelWithDebInfo" or "MinSizeRel" is forced to have "-g", while "Release" at least probably shouldn't have it. (I just want to write some short notes about how to build it for amigaos4 for anyone interested, so it would be good if everything works without needing to touch makefiles, etc.)

Also, do you have any objection if I add your libglu to the same package? (The reason is that we already have a glu for minigl, and I want to avoid mixing them up by shipping your glu together with gl4es.)

Once I have prepared the archive, I will of course put it here first so you can check that everything is all right, etc :)

ptitSeb commented 5 years ago

Ok, done for the CMakeLists.txt.

For libGLU, I have nothing to do on my side, right? It's just the packaging, that will contain both libGLU and libGL?

kas1e commented 5 years ago

Yep, for libglu there is nothing to do as far as I remember (though I need to check that there is no -g for release there as well).

I also don't know whether to call it libGL.a or libgl4es.a, as we already have libGL.a in the SDK (for minigl). In all my gl4es-based projects I named it libgl4es.a, and the include directory "GL4ES" rather than "GL", to avoid conflicts with minigl.

kas1e commented 5 years ago

By the way, shouldn't the "Debug" version have "-g" added? Or does "Debug" there mean internal debugging of gl4es?

ptitSeb commented 5 years ago

libGLU doesn't explicitly include any libGL.a or libgl4es.a... So you can name it as you want.

The "-g" in Debug build is added by cmake itself (also in "RelWithDebInfo" build type).

kas1e commented 5 years ago

@ptitSeb I made public releases of Prototype, Barony and Fricking Shark. I posted news items on two of our popular forums, and on one it has already appeared. Everyone seems to love it all, but there are some issues with Barony; those are related to our internal problems (maybe I use too new a newlib.library), and I will deal with them.

Also, probably (I hope) there will be some donations to you from other os4 users; please let me know if anyone does so.

I also made 3 videos of how those 3 games run; you will probably be interested in watching them:

prototype: https://www.youtube.com/watch?v=K4Ubgap_nrA&t=185s
barony: https://www.youtube.com/watch?v=mFztSoe1ysw
fricking shark: https://www.youtube.com/watch?v=eA9txj_P38I&t=38s

Once I have dealt with the Barony problems and made an update, I will start preparing the gl4es release as well.

Thanks a bunch !

ptitSeb commented 5 years ago

Nice! Nice videos. Too bad for the Barony crash... Tell me if I can help.

kas1e commented 5 years ago

Today I found a new bug, which will probably be on our side (or maybe not, dunno). The issue is that when I run an app built with sdl2/gl4es in fullscreen mode and then press "alt+enter" to switch to window mode, I get a black window and a crash, and the crashlog points to aglSwapBuffers(). The same app built with the same SDL2, but with minigl, doesn't crash and works as expected.

On the other hand, if I run the app in window mode and then switch to fullscreen, I can switch back to window mode without a crash. But after 10-20 switches it still crashes sometimes. Strange, but maybe just luck. When I run it in fullscreen by default, though, the crash is 100% reproducible when switching to window mode.

The question is how to detect what happens and why... Can it be amiga_pre_swap() / amiga_post_swap()?

I tried a few games, and all behave like this, even quake3.

ptitSeb commented 5 years ago

Maybe yes. I'm unsure what is happening, but calling pre/post swap with the context deleted will probably not work well. If you are using the aglSwapBuffers(...) function, you can add a test to see if the context is valid: add

if(!agl_current_ctx) return;

at line 136 of the function, at the very beginning. Maybe it's just that (but why call swapbuffer with no context?)
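The early-return guard suggested above can be sketched in isolation. This is only a minimal illustration, not the gl4es code: `current_ctx`, `platform_swap`, `guarded_swap_buffers`, `set_current_ctx` and `swaps_so_far` are hypothetical stand-ins, and the real AmigaOS4 glue (context handling, pre/post swap hooks, the ogles2 call) is not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-ins, for illustration only. */
static void *current_ctx = NULL;   /* plays the role of agl_current_ctx */
static int   swaps_done  = 0;      /* counts swaps that actually ran */

/* Stand-in for the real platform swap call. */
static void platform_swap(void *ctx) {
    (void)ctx;
    swaps_done++;
}

/* Guarded swap: return early when no context is current,
   so the swap path never runs against a deleted context. */
void guarded_swap_buffers(void) {
    if (!current_ctx) return;
    platform_swap(current_ctx);
}

/* Small helpers so the behaviour can be observed from outside. */
void set_current_ctx(void *ctx) { current_ctx = ctx; }
int  swaps_so_far(void)         { return swaps_done; }
```

With no context set, `guarded_swap_buffers()` does nothing; once a context is made current it performs the swap. Such a guard only hides the symptom if the caller really is swapping without a context.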

kas1e commented 5 years ago

Nope, didn't help :( It's like the context gets lost, as I can see the window spawn (in black) and then bah, crash.

kas1e commented 5 years ago

Oh! False alarm! It crashes like this with pure SDL2 and the testgles2.c example! So even without gl4es being involved. Sorry about that, I will report it to the maintainers.

kas1e commented 5 years ago

By the way, have you tried using -flto (the link-time optimisation introduced lately in gcc) in your projects, for example with gl4es? It can speed things up a bit. Today I am building gcc 8.2.0 with it, and will try to build something heavy like barony or fricking shark with it, to see if it helps in any way. Maybe it's worth trying with gl4es as well?

ptitSeb commented 5 years ago

Yeah, I already tried LTO. You can gain some perf (and binary size) on large projects, especially with c++ (use -flto-odr-type-merging then). There will not be much gain for gl4es, and because it makes compilation a bit longer (individual files build faster, but the final link is longer, and may need swapping on the Pandora, which only has 512MB of RAM), I don't use it often...

kas1e commented 5 years ago

As I read on google, when you build gcc with --enable-lto, -flto-odr-type-merging is the default (so no need to specify it). Dunno how right that is though...

kas1e commented 5 years ago

But it seems not on my gcc: I just checked with "-v" when building with -flto, and can't see -flto-odr-type-merging being added, so you are probably right that it needs to be added manually. Is it something I should use with -flto all the time for c++ code?

ptitSeb commented 5 years ago

I add odr-type-merging when I build c++ stuff, but I don't know if it's needed or if it helps. On some c++ games I can get up to 10% more perf, but on most games I don't feel the difference...

kas1e commented 5 years ago

I rebuilt foobillard++ with -flto for the objects and the final link line (without odr-type-merging), and got about an 8-9fps increase!! I.e. I kept the same gl4es, sdl2, glu, and only added it for the foobillard objects. So, for the same tests, I have without lto: max: 64.1, avg: 60.19, and with lto: max: 72.7, avg: 68.57. Quite cool.

Will try now with odr-type-merging, and will also try to rebuild Barony, FrickingShark, Prototype, NeverBall/NeverPutt and Q3.

ptitSeb commented 5 years ago

Mmm, because you have a static build of gl4es, yes, LTO can bring some further gain by optimising drawing commands inside the logic functions :)

kas1e commented 5 years ago

Do you think -flto also optimizes everything from the .a libs (i.e. all objects from sdl, gl4es, glu, etc)? I somehow thought before that it can't touch .a files (I even read that somewhere). But maybe it was like that before, and now it can extract objects from the .a and optimize them too.

ptitSeb commented 5 years ago

It's better not to use lto on a dynamic library. So SDL2, no. But because gl4es and GLU are static libs in your case, yes, LTO will optimize all objects from them too.

kas1e commented 5 years ago

In my case SDL2 is static too :)

kas1e commented 5 years ago

And SDL_Mixer2 and every library I add are static for me; the final binary doesn't contain any dynamic sections. Maybe that explains the huge boost of about 13%?

ptitSeb commented 5 years ago

Yes, LTO has plenty of opportunities to optimize!

kas1e commented 5 years ago

Tried recompiling some more stuff; the results are:

Quake3 (plain C): gives +1.5fps everywhere with lto.
Foobillard++ (plain C): as I said before, gives +8fps (about 12-13% speed increase)

FrickingShark (c++/c): no changes
Barony (c++/c): no changes
Prototype (c++/c): no changes

That's with pure -flto.

Looks quite strange: the 2 plain C ones gain performance, the c++ ones don't. Maybe it's worth trying that odr-type-merging stuff?

ptitSeb commented 5 years ago

Well, for Prototype and FrickingShark, I expect both of them to be GPU limited, so I'm not surprised you don't get any gain. For Barony I'm more surprised, as this one should be CPU limited and should benefit from any code optimisation... Yes, try with -flto-odr-type-merging...

kas1e commented 5 years ago

Tried using -flto and -flto-odr-type-merging together, and separately only -flto-odr-type-merging; final result: the same binary size and exactly the same fps. Very strange indeed.

And compared with the binary size without lto, the difference is only about 2kb. For foobillard++ the lto version is about 0.5 MB smaller.

ptitSeb commented 5 years ago

Maybe you can try to change the partition size for that one, as a last try: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html use -flto-partition=one or none? (but be warned, it will use a lot of memory then)

kas1e commented 5 years ago

Do you mean a lot of memory when building, or a lot of memory when the built binary is used?

ptitSeb commented 5 years ago

Only when building.

kas1e commented 5 years ago

So you mean use 3 options at once: -flto, -flto-odr-type-merging and -flto-partition=one? Will try now.

ptitSeb commented 5 years ago

yes.

kas1e commented 5 years ago

The binary size has definitely improved now, it saves about 400kb; will check fps now.

kas1e commented 5 years ago

On the fps side, no difference...

ptitSeb commented 5 years ago

Bah, at least it's smaller...

kas1e commented 5 years ago

Btw, I always forget to ask: are the includes that come with gl4es (the include/GL/ ones at least) up to date and fine to use? I mean, there is probably no need to update them to the latest mesa ones, etc?

ptitSeb commented 5 years ago

They are fine. It's not the latest version (like there are no OpenGL 4.3-4.5 functions iirc), but it's more than enough for anything current.

kas1e commented 5 years ago

Hi!

One of our amigaos4 users asked me "Will GL4ES support ARB shaders, or does it already do?", which I can't answer. Can you clear this up? Thanks!

ptitSeb commented 5 years ago

Not yet. But I do plan to implement some kind of support for this.

kas1e commented 5 years ago

By the way, is there any way via environment variables to print the shaders in use? I.e. not only the failing ones, but all of them. And if not, do you think it's worth implementing something like:

LIBGL_LOGSHADER: print to the console the shaders which were created internally by gl4es and will be in use.

0 - default, no shader logging
1 - log vertex shaders only
2 - log fragment shaders only
3 - log both vertex and fragment shaders

This way, it would be easier to debug things. For example, right now I need to know what shaders were created for some game (to catch a bug), but I just don't remember where and how we printf things for it :(

ptitSeb commented 5 years ago

That could be done, but would it be the shader before or after they are transformed to GLES2 version?

Also, if you just disable line 11 of src/gl/shaderconv.c you have just that. It's less convenient than an env. var. of course, but that's a start.

kas1e commented 5 years ago

but would it be the shader before or after they are transformed to GLES2 version?

I mostly mean the shaders which are generated by gl4es itself (i.e. not the ones regenerated from the shaders that come with games)... But that could also be an ENV option. Maybe something like LIBGL_LOGSHADER_INTERNAL and LIBGL_LOGSHADER_REGENERATED with the same set of options, or something of that sort?

ptitSeb commented 5 years ago

I'll see what I can do (I don't want to slow things down for an option not used often, but something can probably be done).

kas1e commented 5 years ago

No, no, I am all for speed of course! If it slows anything down at all, then better forget it :)

ptitSeb commented 5 years ago

I have just pushed a new env. var.: LIBGL_DBGSHADERCONV. When set to 1, it will dump the whole shaders to the console, before and after they are converted by shaderconv. In your specific case, it's mainly the "after" that you're interested in. Also, it's very basic, no Fragment/Vertex selection, no Before/After... That could be done (with some mask, like Vertex=1, Fragment=2, Before=4 and After=8), but I'm a bit lazy for now. Maybe I'll do it later.
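The mask idea (Vertex=1, Fragment=2, Before=4, After=8) could look roughly like the sketch below. It is only an illustration of the bitmask technique under stated assumptions: `parse_dbg_mask`, `should_dump` and the `DBG_*` names are hypothetical, and nothing here reflects how gl4es actually reads or applies its environment variables.

```c
#include <stdlib.h>

/* Hypothetical bit values, taken from the mask idea described above. */
enum {
    DBG_VERTEX   = 1,
    DBG_FRAGMENT = 2,
    DBG_BEFORE   = 4,
    DBG_AFTER    = 8
};

/* Parse a mask from a string, e.g. the value of an environment variable.
   NULL, empty or non-numeric input yields 0 (log nothing);
   bits outside the four known flags are dropped. */
int parse_dbg_mask(const char *value) {
    char *end = NULL;
    long v;
    if (!value) return 0;
    v = strtol(value, &end, 10);
    if (end == value) return 0;   /* no digits were consumed */
    return (int)(v & (DBG_VERTEX | DBG_FRAGMENT | DBG_BEFORE | DBG_AFTER));
}

/* Decide whether one shader dump should be printed.
   is_vertex: non-zero for a vertex shader, zero for a fragment shader.
   is_after:  non-zero for the converted source, zero for the original. */
int should_dump(int mask, int is_vertex, int is_after) {
    int stage = is_vertex ? DBG_VERTEX : DBG_FRAGMENT;
    int phase = is_after  ? DBG_AFTER  : DBG_BEFORE;
    return (mask & stage) && (mask & phase);
}
```

With this scheme a value of 10 (Fragment + After) would print only converted fragment shaders, and 15 would print everything. The mask is parsed once, so each dump site only pays for two bit tests.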

kas1e commented 5 years ago

Oh, thanks a bunch, will test it now. And yeah, adding masking would be cool for sure. That should not slow things down, and the shader output could be controlled fully.

Did I understand right that when we use pure opengl code (without opengl shaders), we don't have any "before" shader, only an "after"? I mean, the shaders only need to be generated once when there are no opengl ones coming with the game?

Btw, can you also add this at the end of src/agl/agl.c:

void *aglSetParams2(struct TagItem * tags) {
    if(IOGLES2) {
        IOGLES2->aglSetParams2(tags);
    }
} 

As well as

void *aglSetParams2(struct TagItem * tags);

to src/agl/agl.h

We need it because we make use of it in the latest SDLs (to fix some things). We didn't add it before as there was no need, but now we need it, so I have been adding it manually every time for the last few weeks :)

Thanks !

kas1e commented 5 years ago

Also some little nitpicking: probably in USAGE.md it should be not just:

0: Don't log anything

but

0: Default, don't log anything

ptitSeb commented 5 years ago

Yes, you are right. I'll make the changes. About the shaders, there is always a "before", but that can be some shader generated by the FPE (i.e. Fixed Pipeline Emulator). I'll look into implementing the masking...

ptitSeb commented 5 years ago

@kas1e I assume it's void aglSetParams2(struct TagItem * tags); without the *, right? It's a void function, not a function that returns a pointer?

kas1e commented 5 years ago

Nope, as I showed, it's the correct one, the same as aglCreateContext2() and aglGetProcAddress().

In the includes it is the same as the above functions:

void APICALL (*aglSetParams2)(struct OGLES2IFace *Self, struct TagItem * tags);

kas1e commented 5 years ago

That one returns a pointer to a valid ogles2 window/context when needed, etc, so it returns a pointer.

ptitSeb commented 5 years ago

That's function pointer notation. The return type is void.
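The difference between the two readings can be shown with a small sketch. All names here (`struct Iface`, `set_params`, `set_params_impl`, `returns_pointer`, `call_set_params`, `make_iface`) are hypothetical and only mirror the shape of the declaration quoted above; the real OGLES2IFace interface is not reproduced.

```c
#include <stddef.h>

/* Hypothetical interface struct with one function-pointer member. */
struct Iface {
    /* The parentheses around (*set_params) make this a pointer to a
       function; the return type of that function is void. */
    void (*set_params)(struct Iface *self, int *tags);
};

/* A function returning void, matching the member's type. */
static void set_params_impl(struct Iface *self, int *tags) {
    (void)self;
    if (tags) *tags += 1;
}

/* By contrast: without the parentheses, the * binds to the return type,
   so this is a function that returns a pointer to void. */
void *returns_pointer(int *tags) {
    return tags;
}

/* Wrapper with a void return type, forwarding to the interface member. */
void call_set_params(struct Iface *iface, int *tags) {
    if (iface && iface->set_params) {
        iface->set_params(iface, tags);
    }
}

/* Build an interface instance wired to the implementation above. */
struct Iface make_iface(void) {
    struct Iface i = { set_params_impl };
    return i;
}
```

Declaring the wrapper as `void *` while never returning a value would leave its result undefined for any caller that uses it, which is why the distinction matters for the wrapper added to agl.c.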