ptitSeb / gl4es

GL4ES is a OpenGL 2.1/1.5 to GL ES 2.0/1.1 translation library, with support for Pandora, ODroid, OrangePI, CHIP, Raspberry PI, Android, Emscripten and AmigaOS4.
http://ptitseb.github.io/gl4es/
MIT License
698 stars 159 forks source link

foobillard++ port #87

Closed kas1e closed 5 years ago

kas1e commented 5 years ago

As we discuss in another topic, there better to made a separate ticket for.

So .. While porting this to amigaos4, found it have some bugs, which probabaly will be good to fix not only for amigaos4 port, but also for Pandora's one. While topic created in the gl4es repo, at moment really one issue which need to be deal with, its adding opengl1.x antialisaing, so antialing in foobillard++ will works.

But , there is some other problems in the foobillard++ code, which trash memory , and heap as well.

One of issues we deal with it was in sound_stuff.c, in the strsound() function, where originally was strcpy(s,&s[strlen(s)-4]); , which should be strcpy(s,&s1[strlen(s1)-4]); or it will trash memory heavy. On amigaos4 it just cause crash, strange why it didn't crashes on Pandora and/or Win32..

Another issues, which also lead to crash, its seems some heavy heap corruption. Crash happens when we navigate over the menu items, and then it crashes in the function create_string_quad() from texobj.c , on the free(texdata) call. Adding pure if(texdata) free(texdata) check didn't help, which mean data is here always , just heap corrupted much so we crash. Issue present and on stable-soruce build, and on the latest code from repo.

So i then add -O3 and -Wall to see, if compile brings any errors , and yes, it bring when trying to build menu.c file:

menu.c: In function ‘menu_entry_set_settingtext’: menu.c:273:28: warning: ‘ : ’ directive writing 3 bytes into a region of size between 1 and 256 [-Wformat-overflow=] sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~ menu.c:273:13: note: ‘sprintf’ output between 4 and 514 bytes into a destination of size 256 sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~~~~~~~~~~~ menu.c: In function ‘menu_create_textobj’: menu.c:292:28: warning: ‘ : ’ directive writing 3 bytes into a region of size between 1 and 256 [-Wformat-overflow=] sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~ menu.c:292:13: note: ‘sprintf’ output between 4 and 514 bytes into a destination of size 256 sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~~~~~~~~~~~ menu.c: In function ‘menu_choose’: menu.c:273:28: warning: ‘ : ’ directive writing 3 bytes into a region of size between 1 and 256 [-Wformat-overflow=] sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~ menu.c:273:13: note: ‘sprintf’ output between 4 and 514 bytes into a destination of size 256 sprintf(str,"%s : %s",entry->text,entry->settingtext); ^~~~~~~~~~~~~

That looks very much as overflow come from there, which probabaly guilty for corrupt heap in end.

While simple changes of char str[256]; in those functions from warnings come on char str[1024]; deal with errors, it's probabaly just hide the real overflow problems, which in end cause heap corruption and because of which we crash later when do free.

I also can see this kind of warnings in the language.c and history.c files, but they of no use when i navigate over menu, so probabaly only those from menu.c relevant to the heap corruption. At least, while i didn't go to the history/language sections of menu.

Dunno how to detect from where it come through, on amigaos4 we have almost no tools to debug such things :(

ptitSeb commented 5 years ago

Ah, no. I retested, same scenario as in your video, but I don't have the "blinking hole" issue. Note that I have a vanilla version of foobillardplus here (so with the crash when playing a bit too much with menu). I used latest gl4es, everything is fine here.

kas1e commented 5 years ago

Strange.. Well i think we can go that route: i will create on my github foobillard repo with 3.42a stable src, and apply fixes all one by one. Then you can simple build it from there so we can be sure all the same for both if us.

Btw, i also made it works for SDL2, so all of this will worth of porting it to Pandora too

ptitSeb commented 5 years ago

Ok, I can indeed use your source to try reproduce the issue first.

kas1e commented 5 years ago

That was little pain, but found that this last issue was because of some ogles2.library test versions. But funny it's disappear when i set to disable merger :)

Anyway, i still create repo for foobillard: https://github.com/kas1e/FoobillardPlusPlus

There i put stable 3.42beta sources, add sdl2 support (use it with --enable-sdl2 switch, all changes done in the sys_stuff.c) , and add 4 fixes for the original game code bugs (without ifdefs as they general).

With latest commit i retested by redownloading all from repo and rebuild from scratch : all seems to works.

ptitSeb commented 5 years ago

Ah ok, OGLES2 bug this time.

Thanks for the repo, I'll look at this later.

kas1e commented 5 years ago

By the way, did you in interest to fix issues when they happens in foobillard with disabled glbegin/glend merger ? Not that its important for foobillard itself, but those visuall glitches may point out on something which worth to fix for sake of bug-hunting of gl4es in general.

ptitSeb commented 5 years ago

Sure, if I can reproduce it, I'll fix it.

kas1e commented 5 years ago

Ok there is new video, where i setenv LIBGL_BEGINEND to 0 , and run foobillard++:

https://www.youtube.com/watch?v=t7cWfzCmrMM

I can see there 3 bugs (but very well it all can be about 1 issue).

1 : some random lines in all the ways from the balls sometime. You can see them from 15 to 20 seconds in video, and that very well visibly starting from 50 seconds of video till end.

2: there is missed polygon in the right middle pocket, you can see it in video on the starting from 20 seconds till 35.

3: There is some fast blinks of some borders of table sometime (i assume its light blinking). Can be well visibly starting from second 35 to 45. But it can be seen all over video from time to time (somewhere more, somewhere less).

Very possible its all just same issue.

ptitSeb commented 5 years ago

Ok, thanks for the video, I'll take a look later, see if I can reproduce the issues.

ptitSeb commented 5 years ago

I have finally tested, sorry for the long delay (very focussed on my other project currently). I don't reproduce the issue here with LIBGL_BEGINEND=0. Maybe it's not a GL4ES issue but some optimisation on OGLES2 drivers, that also tries to merge subsequents similar call, like GL4ES is doing?

kas1e commented 5 years ago

Damn.. None of those 3 issues ?

Btw, yeah, i see you made almost no commits to gl4es lately, kind of unusual :) so probably some really big project going on ? Possible to port to os4 ?:))

ptitSeb commented 5 years ago

Yeah, it's a big project (my biggest I would say), but I'm afraid it will not be possible to port on Amiga (it needs Little-endian, and Linux based system to works).

And no, none of the 3 issues. It was just working the same way with or without BEGINEND merger (and I double checked to be sure it was really deactivated).

kas1e commented 5 years ago

Oh, i forget the most important thing : did you test foobillard++ from my repo , right ? Not your one from SVN ? Or you use that latest stable source ? In any case, can you try my one from repo as well ? (it also SDL2, but sure that can't be cause of it). Just to be sure that we can rule out gl4es.

ptitSeb commented 5 years ago

Yes, it was the version from four repo.

kas1e commented 5 years ago

Ok so.. time for ogles2/warp3dnova reports then , thanks !

Btw, doesn't you mind if i will create another ticket, with that issue we have on aos4 in fricking shark when press f4 and have crash ? Just Daniel and Hans keep saying that this time its probabaly gl4es, while i somehow think it is not, but maybe if i will collect all the necessary info in one ticket, you may have on or two ideas later which i can bring on them :)

ptitSeb commented 5 years ago

You can create a ticket, not sure how I can dig more on this one.

kas1e commented 5 years ago

Btw, just to make you know, the issue 1 (those random blinking lines) was fixed with another ogles2.library update, it was some bug introduced in optimizer lately, which just was obscure enough to no meet with it in another games.

ptitSeb commented 5 years ago

Ah good to know. Thanks for the info. And the other 2 bugs? Still there?

kas1e commented 5 years ago

Sooo... The others 2 bugs was fixed lately as well. One about strange lines was fixed in the previous version with remark "bug introduced with 2.0, one of the new optimized buffer-hashers had a bug which could lead to all sorts of pretty weird behaviour, like geometry appearing at wrong positions or whatever. The appearance of this bug also depended on your data buffers' alignment and size, so generally spoken it only rarely happened, which is why it slipped through."

And other bug (with some broken part of pocket) was fixed by that last bug about which we worry in Lugaru's thread, with remark "- Critical fix: one of the 32bit-step-hashers had a problem with small data-packets; it would eventually produce the same hash for similar data, which would result in geometry becoming distorted or being falsely positioned, stuff like that. Actually the symptoms were pretty much like those of the bug fixed in the previous version 2.2."

Soo, foobillard++ ticket can be closed now, yeah !

kas1e commented 5 years ago

Can't find where we discuss it, but probable there : if you remember in the r170 of foobillard++, i told before that on running we have on serial many:

WARP3D_SI.library: ERROR: Interleaved arrays with different strides detected in VBO 0x00000000

Now, i recieved new warp3dnova beta build, where Hans told it was wrong error report, and so he fix it. And what we have in output now, is :

W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x65F69630. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x60CDD230. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x60D05930. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x65F64830. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x60CFC430. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x60CE6830. W3DN_SI.library (0): ERROR: Interleaved arrays with different strides detected in VBO 0x6338FD30.

On what Hans says that probably i really do have interleaved arrays that collide with one-another. I think it can be game itself as well, but if i remember right, on Pandora you didn't catch such stuff when test r170 ? From other side if you specially didn't printf it, maybe those errors here ..

Just to know if it game or not, because if it game, let it be, but maybe it again ogles2 , or gl4es or something..

ptitSeb commented 5 years ago

I don't think I can generate "colliding interleaved arrays" in gl4es (but a program can certainly do that. but not foobillard++, I don't think it use any interleaved arrays). Only place you'll see interleaved arrays in gl4es is with the "merger", but there will be only 1 set of arrays (up to vertex+color+tex1+tex2+normal interleaved) at a time. Anyway, you can see what is happening by activating "DEBUG" trace in src/gl/fpe.c : that way, you see what are the parameters of VA. Also, remember that gl4es doesn't create any VBO. Everything happens on "default" VBO (i.e. VBO 0). Still you can check on the debug output if there are some colliding arrays (but this kind of check is difficult to do by hand, really...) More info on the collision could be helpfull (like address of the 1st colliction?). Anyway, those VBO address are probably handled by OGLES2 driver, not gl4es.

kas1e commented 5 years ago

Yeah, will check now. But it happens only with foobillard, and only with r170 (i.e. with code where they get rid of glBegin/glEnd route in favor of another one, which in end may produce those interleaved arrays probabaly ?)

But will check with fpe.c debug firstly

ptitSeb commented 5 years ago

Ah, in that case, yes, it could be foobillard itself! gl4es doesn't check this thing at all. And only on Amiga, where you have to convert data from bigendian to littleendian, have overlapping arrays is an issue! But yes, it maybe some mistakes with foobillard++ new array code.

kas1e commented 5 years ago

Probably if i will build r170 with DEBUG in fpe.c it will show if it coming from foobillard ? (i just by some unluck deleted r170 code , need to report it again with all those PROGDIRs to see debug output)

ptitSeb commented 5 years ago

Well, gl4es will show how different VA are setup'd, but it will not warn if a VA overlap some other one.

kas1e commented 5 years ago

Hi ! Found some interesting moments, want to be sure those is not bugs or bugs which need to be fixed :)

So issue is, when i set different levels of LIBGL_MIPMAP in foobillard++ , i have for some values pretty weird results. For example LIBGL_MIPMAP 3 , give some strage "gaus/blure" look of whole image.

Check this out when MIPMAP is 0: http://kas1e.mikendezign.com/aos4/gl4es/games/foobillardplusplus/foo_mipmap0.jpg

And that when MIPMAP is 3: http://kas1e.mikendezign.com/aos4/gl4es/games/foobillardplusplus/foo_mipmap3.jpg

See table is full of lines, and walls also change a bit their look.

Interesting is it should be like this, or that bug to be fixed, and if its bug, then where :)

ptitSeb commented 5 years ago

Yes, that's a normal effect. With LIBGL_MIPMAP=3 you disabled MipMap use... Because the green par of the table (and probably the wall too) have a complex effect, with a "detail texture" effect add the the simple green texture. Look at this zoomed in vue: foobillard It's from foobillard (not ++), be that rendering part is the same. You see small detail texture here. This texture have mipmap on normal use, so zoomed-out vue gives a correct picture. Without MipMap, you get this kind of effect. Best is to look at the wiki of mipamp https://en.wikipedia.org/wiki/Mipmap to have some good explanation of the effect and why mipmap fixe the issue with repeated patern on large surface...

kas1e commented 5 years ago

Aha ok, no bug then, thanks!