Closed kas1e closed 6 years ago
Strange, the crash is still here. And it doesn't seem to crash in that function; it crashes later, in the game.
All I can see is that the window shows up, then closes immediately, and the crash comes as soon as the first GL function is used (in my case it was glViewport(), but that probably doesn't matter, as the window just closes).
Probably those "varargs" can be the issue. As I see, in ogles2 it is defined like this:
And in interface like this:
void * VARARGS68K APICALL (*aglCreateContextTags)(struct OGLES2IFace *Self, ULONG *errcode, ...);
Those VARARGS68K and __VA_ARGS__ may differ from the usual stdarg.h ones, but I can be wrong.
Yes, maybe.
This code is only for Amiga, so I can adapt it. What is the correct way to use varargs on AmigaOS4?
I always have a hard time with that VARARGS stuff and knowing when it is necessary. But I found this: http://www.os4coding.net/forum/varargs68k
Ah ok, I'll adapt code tonight then.
Thanks a bunch !
So, if you replace the code by this one:
void* VARARGS68K aglCreateContextTags(ULONG * errcode, ...) {
    void* ret = NULL;
    if(IOGLES2) {
        va_list args;
        va_startlinear(args,errcode);
        ret = IOGLES2->aglCreateContextTags(errcode, va_getlinearva(args, struct TagItem *));
        va_end(args);
    }
    return ret;
}
Does it compile (and if yes, does it run)?
Also, I'm unsure what header is needed for those. Is #include <amiga_compiler.h> needed, and is it enough (to replace the #include <stdarg.h>)?
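For readers without the AmigaOS SDK, the idea behind the wrapper above can be sketched in portable C: gather the variadic tag/data pairs into an array, then hand that array to a non-variadic entry point. This is only an illustration using standard stdarg.h, not the VARARGS68K / va_startlinear mechanism discussed here. The names DemoTagItem, DEMO_TAG_DONE, collect_tags and demo_create_context_tags are hypothetical stand-ins, not part of OGLES2 or gl4es.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AmigaOS tag types, so the sketch builds anywhere. */
typedef unsigned long Tag;
struct DemoTagItem { Tag ti_Tag; Tag ti_Data; };
#define DEMO_TAG_DONE 0UL

/* Copy tag/data pairs from a va_list into out[], including the terminator.
   Returns the number of items written, or 0 if out[] is too small. */
size_t collect_tags(struct DemoTagItem *out, size_t max, Tag first, va_list ap)
{
    size_t n = 0;
    Tag tag = first;
    assert(out != NULL);
    for (;;) {
        if (n == max)
            return 0;                     /* no room left, give up */
        out[n].ti_Tag = tag;
        if (tag == DEMO_TAG_DONE) {       /* terminator carries no data */
            out[n].ti_Data = 0;
            return n + 1;
        }
        out[n].ti_Data = va_arg(ap, Tag); /* data that belongs to this tag */
        n++;
        tag = va_arg(ap, Tag);            /* next tag */
    }
}

/* Variadic front end: gathers the pairs; a real wrapper would then pass
   out[] to the array-based context creation call. */
size_t demo_create_context_tags(struct DemoTagItem *out, size_t max, Tag first, ...)
{
    va_list ap;
    size_t n;
    va_start(ap, first);
    n = collect_tags(out, max, first, ap);
    va_end(ap);
    return n;
}
```

Every argument has to be passed as an unsigned long (for example 16UL rather than 16), because va_arg reads exactly the type it is told to read.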
If i use the same
src/agl/agl.c: In function 'aglCreateContextTags':
src/agl/agl.c:104: error: 'va_list' undeclared (first use in this function)
src/agl/agl.c:104: error: (Each undeclared identifier is reported only once
src/agl/agl.c:104: error: for each function it appears in.)
src/agl/agl.c:104: error: expected ';' before 'args'
src/agl/agl.c:105: warning: implicit declaration of function 'va_start'
src/agl/agl.c:105: error: 'args' undeclared (first use in this function)
src/agl/agl.c:107: warning: implicit declaration of function 'va_end'
Mmm, then I don't know. I don't have an AmigaOS machine to experiment with on my side :(
I'm afraid you need to ask others for help with this issue.
The reason I started to worry about it is just an experiment, to see if it would squash an issue I have now.
The issue is a very strange one, and looks like memory trashing coming from gl4es (or from the way it was added to SDL1 on the os4 side).
Check this code: https://github.com/kas1e/SDL/blob/SDL-1.2gl4es/src/video/amigaos4/SDL_os4gl.c
You can see there how it is all currently done for SDL1 / GL4ES. When I build the Cadog game with this line: dprintf("Initializing GL4ES->OGLES2..\n"); (right before context creation), I don't get the title picture in Cadog at start! (quite strange).
But once I comment out that printf and build Cadog with it, the title picture in Cadog comes back!
That all points me to some memory trashing issue, and so I started to experiment, thinking that maybe it is because I create the context from IOGLES2 in SDL, while the library itself is opened in GL4ES, and maybe it is somehow "not shared enough" between them or something.
That's why I thought "maybe try to swap it for the create-context call from agl.c, just to see if there will be any difference", and so I saw it crash.
Is it the same dprintf as in Linux? Because in that case you need a file descriptor: http://manpagesfr.free.fr/man/man3/dprintf.3.html and indeed as-is, it will not run.
Other than that, I don't see anything wrong in the code. I'll look at it this weekend (won't have much time tonight and tomorrow).
It doesn't have to be dprintf there. Even if I put a plain printf("aaaa\n"); right before context creation, then I have no title pic in Cadog. Once I comment it out, the title pic is back :)
I did some more tests, and found that if I do:
printf("a\n"); or printf("aa\n"); or printf("aaa\n"); : the title pic is still here. But once I use more than 3 characters, i.e. even just printf("aaaa\n"); : then no title pic.
That IMHO clearly points to memory trashing?
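When behaviour flips with the length of an unrelated string, one common way to look for an overwrite is to place guard zones with a known byte pattern around a suspect buffer and check them at intervals. The following is a minimal sketch of that idea, not code from gl4es, SDL or OGLES2. The struct layout, GUARD_BYTE and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_BYTE 0xA5
#define GUARD_SIZE 16

/* A buffer with a guard zone on each side (hypothetical layout for illustration). */
struct GuardedBuf {
    unsigned char front[GUARD_SIZE];
    unsigned char data[64];
    unsigned char back[GUARD_SIZE];
};

/* Fill both guard zones with the pattern and clear the payload. */
void guarded_init(struct GuardedBuf *b)
{
    assert(b != NULL);
    memset(b->front, GUARD_BYTE, GUARD_SIZE);
    memset(b->data, 0, sizeof b->data);
    memset(b->back, GUARD_BYTE, GUARD_SIZE);
}

/* Returns 1 while both guard zones still hold the pattern,
   0 once any guard byte has been overwritten. */
int guarded_intact(const struct GuardedBuf *b)
{
    size_t i;
    for (i = 0; i < GUARD_SIZE; i++)
        if (b->front[i] != GUARD_BYTE || b->back[i] != GUARD_BYTE)
            return 0;
    return 1;
}
```

Calling guarded_intact() before and after each suspicious step (for instance around context creation) narrows down which step touches memory it does not own, provided the stray write happens to land on a guard zone.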
Yeah, could be.
Can you use the other function to create the context, the one without the VARARG stuff? Something like
struct TagItem tags[] = {
    {OGLES2_CCT_WINDOW, (ULONG)hidden->win},
    {OGLES2_CCT_DEPTH, 16},
    {OGLES2_CCT_STENCIL, 8},
    {OGLES2_CCT_VSYNC, 0},
    {OGLES2_CCT_SINGLE_GET_ERROR_MODE, 1},
    {OGLES2_CCT_RESIZE_VIEWPORT, TRUE},
    {TAG_DONE, 0} };
hidden->IGL=IOGLES2->aglCreateContext(0, tags);
should work I guess.
You mean still call it from SDL as IOGLES2->, or as the one from agl.c? I will try both ways anyway.
Tried both variants:
printf("aaaa"); hidden->IGL=IOGLES2->aglCreateContext(0, tags);
In that case I have no background picture in Cadog.
Then tried:
printf("aaaa"); hidden->IGL=aglCreateContext(0, tags);
In that case, the Cadog background picture is there!
Then I built Letters Fall with the new (working) variant. And the trashing of menus (if you remember, I showed some screenshots before) is almost gone! It is still there, but the way it all looks has clearly changed.
That can only mean that with the new way of creating the context we just "shift" the memory trashing issue a bit, and that it probably comes from gl4es, or from the way we add the aos4 backend inside gl4es.
That proves the point Daniel made to me before: he checked the gles2 library a lot, and every time he came to the conclusion that there is memory trashing somewhere, which causes those effects with q3 and with the Irrlicht engine (and with Letters Fall). With our fix, we only shift the issue.
Probably it hides somewhere in amigaos.c or agl.c or in some other #ifdef amigaos4 place ... Uhm, quite strange!
Maybe it can be a name conflict, as we have agl functions named 1:1 the same as the ones we call from IOGLES2->. Maybe it is worth renaming them all inside agl.c to something like gl4es_aglCreateContextTags, gl4es_aglSwapBuffers, etc? That way it will be understandable that those are the gl4es ones, and they will surely not clash with the IOGLES2 ones.
That is probably not the cause of the memory trashing anyway, but it will still look better.
I don't understand why you think this test proves the memory corruption comes from gl4es (and I'm pretty sure it doesn't). The aglCreateContext is not really a gl4es function. It just wraps the call to the actual agl function of OGLES2.
And I don't think there is a name conflict here. It would simply not work with a name conflict, plus there is no conflict because the OGLES2 functions are members of a structure and the gl4es ones are not.
Now, if you are still unsure about the conflicting names, simply don't build agl.c and only use the OGLES2 functions (for context creation and swapbuffer) for testing...
If you want to be sure nothing of gl4es is loaded at start, rebuild gl/src/init.c with -DNO_INIT_CONSTRUCTOR and call initialize_gl4es() before using it (so after the context creation). You will probably need to declare that function with extern void initialize_gl4es() or extern "C" void initialize_gl4es() if you are in a cpp file.
By name conflict I mean a visual one (when someone reads the code, he will think these are the real functions, as one of our devs did today), but it can also be a real conflict when someone uses the __USE_INLINE__ directive, which we have in our SDK and which allows us to call functions without writing the interface name. I.e. a plain aglCreateContext can resolve to the one from ogles2 if anyone, anywhere, sets __USE_INLINE__.
Sure, we don't have to call it gl4es_aglCreateContext; it can be gl4eshelper_aglCreateContext or anything else, just not the same names as the originals, as that can lead to problems later.
As for the memory corruption: at the moment I come to that because Daniel spent some days trying to catch the issue with q3, and says that it is some undefined behaviour which trashes memory, and even if we think that our workaround deals with that normalisation problem, it just shifts the issue somewhere else.
Then I found that issue with Cadog and printfs before context creation, which points to memory trashing as well.
Also the Letters Fall game has trashed parts even after our normalisation workaround.
Of course we can't rule out ogles2 itself anyway, but I fear it can be just something in how we add the Amiga parts to gl4es. Name clashing, a non-working aglCreateContextTags helper: there are at least a few issues in that area already, and maybe somewhere a pointer gets lost, or there is a race condition, or I don't know what.
Do you have any idea how it is even possible to debug all that, to find out where the problems come from? The example with Cadog is IMHO a good test case.
Aha thanks, will try your idea. At least Letters Fall always has trashing, so I can check this out.
Probably you mean that I should call initialize_gl4es(); not after, but before context creation? Because context creation needs IOGLES2, which is opened at the end of the chain in loader.c, from load_lib(), which is called from initialize_gl4es().
So if I put it after, then it crashes because IOGLES2 is not initialized, but if I put it before, then it works.
Though, it makes no difference. I.e. if I have it like this:
initialize_gl4es();
dprintf("Initializing GL4ES->OGLES2..\n");
hidden->IGL=IOGLES2->aglCreateContextTags(0,
    OGLES2_CCT_WINDOW,(ULONG)hidden->win,
    OGLES2_CCT_DEPTH,16,
    OGLES2_CCT_STENCIL,8,
    OGLES2_CCT_VSYNC,0,
    OGLES2_CCT_SINGLE_GET_ERROR_MODE,1,
    OGLES2_CCT_RESIZE_VIEWPORT, TRUE,
    TAG_DONE);
Then Cadog still has the problem. Once I comment out the printf, Cadog is fine.
Though now I can't reproduce it with printf("aaaa");, but that is probably just because the memory layout changes a bit when I build init.c with -DNO_INIT_CONSTRUCTOR.
What I mean is, if you still suspect the memory corruption comes from gl4es, you should create the context without gl4es at all. So initialize OGLES2 without gl4es, create the context, and then initialize gl4es. That way, all the first part up to context creation / context current can be done without gl4es involved. So if you still see a difference with and without printf, then gl4es is not the cause.
But again, are you sure dprintf doesn't need a file handle before the string?
Ah ok, got what you mean, will try it now.
As for dprintf, it's just this for us:
So it does printfs to the serial line (so we can catch logs with PC/putty even in the worst situations). That one is proven to work, and since I have issues even with plain printf, it is probably not related.
Will now try to initialize everything myself from SDL.
If I try to open the library / interface from SDL, then at link time I get errors about multiple definitions of IOGLES2, as amigaos.c has one. Then if I comment out the opening/closing of the library in amigaos.c, I get undefined reference errors for those functions from loader.c and glx.c.
Can I just make them empty (I mean os4OpenLib and os4CloseLib) and just add "extern struct OGLES2IFace *IOGLES2 = NULL;"?
Btw, what exactly does gl4es need inside to be able to interact with our ogles2? I mean, maybe it is worth moving everything about it outside (i.e. the whole amigaos.c), but then we need to somehow pass some pointers from SDL to gl4es?
Yes, commenting out everything inside os4OpenLib and os4CloseLib and declaring both structs as extern (remove the =NULL in that case) should be enough. As long as you initialize OGLES before calling gl4es_init it should work.
The remaining stuff in amigaos.c is required for gl4es to function.
If I do it like this, then I get this output:
LIBGL: Initialising gl4es
LIBGL: v1.0.5 build on Mar 17 2018 21:49:39
LIBGL: Using GLES 2.0 backend
LIBGL: Hardware test diabled, nothing activated ....
LIBGL: warning, gles_glGetIntegerv is NULL
LIBGL: Targeting OpenGL 2.0
LIBGL: Current folder is:NO NAME:cadog-gl
LIBGL: warning, gles_glViewport is NULL
And then it crashes.
The window opens of course, the context is created, etc.
Mmm, in os4OpenLib add *lib = LOGLES2; that should fix the issue.
Yeah, that way works.
But the issue with Cadog when we have printfs before context creation is still here :)
The question is: is it possible to have problems inside gl4es after we call initialize_gl4es();? I mean memory trashing ones? Can it be that something was added after you ran valgrind on it which may cause such effects? I am of course pretty sure it is not gl4es, but just to rule out all possible scenarios step by step.
I do not know if there is anything Amiga related left in the gl4es code which can cause issues. We have left only os4GetProcAddress with its list in amigaos.c, and I see ifdefs including amigaos related code only in gl.c (some little ifs, and functions for swapbuffers), the call to os4OpenLib from loader.c, the closing of the os4 libs in glx.c and just the lookup in lookup.c.
Not much that can cause issues.
It's not gl4es. I don't say that because I run valgrind regularly with gl4es (to check other stuff), but because you have just removed everything gl4es related before the context creation, and you still have the issue. Why would it be gl4es, when it's not used before the memory is corrupted?
Run valgrind on the Amiga if you can, you'll see gl4es has nothing to do with your memory corruption issue.
@ptitSeb I am trying for now to rebuild libgl4es.a with -O0 -fno-strict-aliasing, to see if our compiler generates something wrong somewhere, and while everything builds fine, at the linking stage I get this from list.o:
libgl4es.a(list.o): In function `rlVertex4f':
list.c:(.text+0xe838): undefined reference to `rlVertexCommon'
libgl4es.a(list.o): In function `rlVertex3fv':
list.c:(.text+0xe938): undefined reference to `rlVertexCommon'
libgl4es.a(list.o): In function `rlVertex4fv':
list.c:(.text+0xea14): undefined reference to `rlVertexCommon'
libgl4es.a(list.o): In function `rlNormal3f':
list.c:(.text+0xf780): undefined reference to `rlNormalCommon'
libgl4es.a(list.o): In function `rlNormal3fv':
list.c:(.text+0xf800): undefined reference to `rlNormalCommon'
libgl4es.a(list.o): In function `rlColor4f':
list.c:(.text+0xf860): undefined reference to `rlColorCommon'
libgl4es.a(list.o): In function `rlColor4fv':
list.c:(.text+0xf8f4): undefined reference to `rlColorCommon'
collect2: ld returned 1 exit status
makefile:60: recipe for target 'lf3' failed
If I compile list.c even without -fno-strict-aliasing, the error is the same. But once I replace -O0 with -O2 again, there are no such linking errors.
Sounds strange. Also those function names about normalisation and vertex common sound like something related to our bug.
With -O1 it works OK too. Only -O0 produces those errors at the linking stage.
It's just the inline in front of the function definition that your linker doesn't like.
Remove it and it will link fine.
And no, it's definitely not the source of your issue. Just a glitch of your GCC version: with -O0 it removes the inline function but somehow still wants the inline version for linking (I could also avoid the use of inline and let the compiler decide).
Damn, you are right (as always). It all starts to get interesting just when I start to hate it :))
@ptitSeb Sorry to bother you again with this, but last week we tried all possible scenarios, tests, ports, etc. Every time we come to a point where some strange memory trashing happens, and none of us knows where it comes from or how to detect it :(
Yesterday I built NeverBall. It just crashes at run time in glDrawElements. Daniel checked the index array at the time of the crash. It contains about, let's say, 80% garbage (tons of 0,0,X "triangles", lots of 0xFFFF indices; well, it looks like semi-randomly trashed memory), and the last maybe 20% look like somewhat valid indices inside the expected range.
And there are two glDrawElements calls before the one that crashes; those look absolutely sane (reasonable indices etc.) and they work flawlessly.
Until the point of the crash the whole lib seems to work correctly, with no sign of any lib or other corruption (and quite a lot happens under the hood until then), but then it's fed with this invalid index array and says goodbye.
Of course I cannot say who's the one who really corrupts it in the first place (because the corruption can also be a side effect of something else). But I can say that it is already corrupt before ogles2 does any work inside glDrawElements; it is being sent to ogles2 in a corrupted state.
I also tried all kinds of different compiler flags, all kinds of different scenarios with SDL, etc. And every time something weird happens with memory, and every time it's around glDrawElements.
Now, what we wonder is whether it is possible that we still have some endian issues in gl4es. I am not sure anymore of course, it is just one more idea. One of the developers gave me this info:
Endianness problems sometimes appear in unexpected places. In AmigaOS/Exec/MakeLibrary(), for example, there's the table of functions which can contain function pointers (4 byte entries) or, if the first WORD in the table == 0xFFFF, then instead it contains offsets (2 byte entries). The check "if (*(WORD *)funcInit == -1)" does not work on little endian (AROS for example), which was discovered more or less by "luck" when by pure coincidence the first function pointer happened to end exactly at address 0x????FFFF. And so the (WORD *) read saw 0xFFFF there and assumed it was offsets instead of absolute addresses (a function on x86 may start at an odd address).
Or think about things like reading the lower 16 bits from a 32 bit variable like this:
ULONG var = 0x12345678; ULONG *ptr1 = &var; UWORD *ptr2 = (UWORD *)&var;
UWORD w16 = (UWORD)*ptr1; /* works: -> 0x5678 */ UWORD w16 = *ptr2; /* works too on little endian: -> 0x5678, but gives 0x1234 on big endian */
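The difference between the two reads can be made explicit with a small portable sketch. It uses the stdint.h types instead of the Amiga ULONG/UWORD, and the function names are hypothetical. Masking the value gives the same answer on every host, while reading the first two bytes in memory depends on the byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Low 16 bits taken from the value: same result on any byte order. */
uint16_t low16_by_value(uint32_t v)
{
    return (uint16_t)(v & 0xFFFFu);
}

/* First two bytes as stored in memory: the result depends on the host byte order.
   memcpy is used instead of a pointer cast to stay clear of aliasing problems. */
uint16_t first16_in_memory(const uint32_t *p)
{
    uint16_t w;
    assert(p != NULL);
    memcpy(&w, p, sizeof w);
    return w;
}
```

For 0x12345678, low16_by_value() always returns 0x5678, while first16_in_memory() returns 0x5678 on a little-endian host and 0x1234 on a big-endian one such as PowerPC.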
Is there any place in gl4es where we can (at least) assume that something like that can happen?
I may also try to install ppc-linux on my hardware and try gl4es on it.
I don't see any place in gl4es where this kind of thing can happen. Most (if not all) of the conversions gl4es does are "clean", and made by macro trickery.
The indices that are 0xffff are quite easy to detect.
In src/gl/fpe.c you can try to printf and alert if some indices are trash.
Line 369, before gles_glDrawElements(mode, count, type, indices); add:
if(type==GL_UNSIGNED_SHORT) {
    GLushort *ind = (GLushort*)indices;
    for (int i=0; i<count; i++)
        if(ind[i]==0xffff)
            printf("WARNING: Indices[%d] is 0xffff\n", i);
}
or dprintf if it's better for you (even the %d and ,i are not mandatory). That can help find the cause (if that happens, do a run with debug enabled in fpe.c, in case there is something obvious).
I ran with debug in fpe.c before as well, here is the output: http://kas1e.mikendezign.com/aos4/gl4es/games/neverball/neverball_fpe_debug.txt
That is from the start of the run until the crash.
Not much to say regarding the trace.
It's program number 275 (starting from 256), so the 20th program? Can that be an issue (I don't think so)? You should also activate debug in shaderconv.c in case that program has something specific.
Also, the glDrawElements is the "biggest" in the trace, with 5550 vertices (the others are "only" 3072 vertices).
So yeah, not much to say.
Added that printf for 0xffff: nothing found. At least before the crash I see no printfs at all.
What else do we have after the gles_glDrawElements(mode, count, type, indices); call, and before the actual data goes to AmiglDrawElements? Maybe it is worth adding that printf in amigaos.c as well, before the actual call to ogles2?
That gles_glDrawElements(mode, count, type, indices); is the actual call to the GLES2 driver. So it's directly AmiglDrawElements at this stage, which calls OGLES2->glDrawElements.
All direct calls.
You can add the 0xffff check in AmiglDrawElements if you want, it's in src/agl/amigaos.c, line 213...
Yeah, already did, with no luck. I.e. nothing is printed. Wtf. First time I see that kind of random strange issue. It doesn't leave gl4es in bad form, and yet it is already received in the ogles2 library in bad form. Wtf :))
It would be interesting to have the same kind of test inside OGLES2, and also to see what the addresses of those 0xffff values are (to compare with the beginning of the indices array).
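One way to run "the same kind of test" on both sides of the call is to hash the index buffer just before it leaves the caller and again as the first thing inside the callee, then compare the two numbers in the logs. The sketch below uses the 32-bit FNV-1a hash. The function name fnv1a32 is hypothetical and nothing like it is confirmed to exist in gl4es or OGLES2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash over a byte range. */
uint32_t fnv1a32(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t h = 2166136261u;    /* FNV offset basis */
    size_t i;
    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        h ^= p[i];               /* mix in one byte */
        h *= 16777619u;          /* FNV prime */
    }
    return h;
}
```

Printing the pointer, the count and fnv1a32(indices, count * sizeof(GLushort)) on each side would show whether the same memory holds the same bytes at both points, or whether the buffer changes between them.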
As Daniel says, those 0xffff values were just random, so they may or may not appear on my setup, and even on his it is not like this every time.
Can we add printfs which spit out the count / pointer etc. at our glDrawElements like we did above, and then simply also print out, let's say, the first 100 ushort indices, comma separated?
Then check for values that are suspiciously high, like
if(type==GL_UNSIGNED_SHORT) {
    GLushort *ind = (GLushort*)indices;
    for (int i=0; i<count; i++)
        if(ind[i]>0x2000)
            printf("WARNING: Indices[%d] is 0x%x\n", i, ind[i]);
}
Now if you prefer the comma separated list (with the max value as an added bonus):
if(type==GL_UNSIGNED_SHORT) {
    GLushort *ind = (GLushort*)indices;
    GLushort m = 0;
    printf("Indices=");
    for (int i=0; i<count; i++) {
        if(ind[i]>m) m = ind[i];
        if(i<100)
            printf("%s%d", i?",":"", ind[i]);
    }
    printf("\nThere are %d indices, max value is %d\n", count, m);
}
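The fixed thresholds above (0xffff, 0x2000) are guesses. If the number of vertices in the bound arrays is known at the call site, comparing each index against that number is a tighter test. A minimal sketch follows, assuming the caller can supply the vertex count; the function name first_bad_index is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the position of the first index that is >= vertex_count,
   or -1 if every index is in range. */
long first_bad_index(const unsigned short *ind, size_t count, unsigned vertex_count)
{
    size_t i;
    assert(ind != NULL || count == 0);
    for (i = 0; i < count; i++)
        if (ind[i] >= vertex_count)
            return (long)i;
    return -1;
}
```

A result other than -1 gives the exact position where the index array stops making sense, which can then be compared with the address range of the buffer.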
Aha thanks, done. This is what I get when I run neverball, before the crash happens:
http://kas1e.mikendezign.com/aos4/gl4es/games/neverball/neverball_print_indices.txt
This is what I get when I run neverputt, before the crash happens:
http://kas1e.mikendezign.com/aos4/gl4es/games/neverball/neverputt_print_indices.txt
And this is what I get when I play with the stack settings, and neverputt runs when I set a low stack (lower than 65k):
On AmigaOS4 we have the ability to control the size of the stack used when we run programs; by default we have 65.535, but we can raise it to any size. What is strange is that when I LOWER the stack size, neverputt at least runs (see the 3rd output). But when I make the stack bigger, it crashes right away at the start, like neverball.
Interesting..
Mmm, those indices are indeed really wrong. I need to think a bit and will ask for more logs...
Hi ptitSeb ! :)
Sorry to bother you with another issue which may very well not be related to gl4es itself, but while waiting for fixes in our AmigaOS4 drivers, I gave it a go and ported the Irrlicht engine over gl4es as well. Everything compiles and links fine. But once I run a simple test case (which of course works via software rendering, etc), it crashes in AmiglGetIntegerv().
I.e. it should then continue with the GLSL check and print "GLSL not available" (at least that is what I get on legacy OpenGL), or available (probably that is what it should be with gl4es?). But instead it crashes:
4/0.Work:irrlicht/bin/> 01.HelloWorld_gl4es
LIBGL: Initialising gl4es
LIBGL: v1.0.5 built on Mar 2 2018 01:33:59
LIBGL: Using GLES 2.0 backend
LIBGL: OGLES2 Library and Interface open successfuly
LIBGL: Hardware test disabled, nothing activated...
init_matrix(0x6b0b1bb0)
LIBGL: Targeting OpenGL 2.0
LIBGL: Current folder is:/Work/irrlicht/bin/
Irrlicht Engine version 1.9.0
SDL initialized
SDL Version 1.2.15
Using renderer: OpenGL 2.0
GL4ES wrapper: ptitSeb
OpenGL driver version is 1.2 or better.
<< CRASH>>
I am almost sure that it is again some problem in our ogles2 driver (as I can see in the log that it crashes in "AmiglGetIntegerv()"), but maybe (only maybe) it can be something in gl4es as well? It doesn't even print any debug output from gl4es, as it seems to crash too early.
Here is the crashlog: http://kas1e.mikendezign.com/aos4/gl4es/irrlicht/crashlog_irrlicht_helloworld.txt
Maybe you have some ideas what it can be. Thanks!