Closed icecream95 closed 5 years ago
Turns out it's related to a "fix" I made, disabling VBOs.
I've re-enabled them, so time to see how long it is before the next crash...
Well, with VBOs enabled, I'm getting this crash:
Thread 9 (Thread 0xaa4ff130 (LWP 19931)):
#0 0xb4da032c in waitpid () from /usr/lib/libpthread.so.0
No symbol table info available.
#1 0x00ceb7b0 in crash_catcher(int, siginfo_t*, void*) ()
No symbol table info available.
#2 <signal handler called>
No symbol table info available.
#3 0xb6db8554 in osg::GLBufferObject::compileBuffer() () from /usr/local/lib/libosg.so.158
No symbol table info available.
#4 0x3f613660 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Time to recompile BufferObject.cpp from OSG with debugging symbols...
It seems to segfault (with VBOs enabled) at this line in OpenSceneGraph/src/osg/BufferObject.cpp:
const osg::Image* image = entry.dataSource->asImage();
That's all I can say at the moment as konsole crashed before I could do any more investigation.
So the first crash is a false alarm?
Now, a crash in osg::Image... Do you know which version of OSG you are using? (That const osg::Image* image = entry.dataSource->asImage(); line is from the function osg::GLBufferObject::compileBuffer()?)
I checked here: https://github.com/openscenegraph/OpenSceneGraph/blob/master/src/osg/BufferObject.cpp
The function is basically just doing a few glBufferData / glBufferSubData calls, which in gl4es simply translate to a few memcpy calls (with some sanity checks). This doesn't seem harmful. You probably need to add debug info to the BufferEntry sources to get more crash details.
Also, if you want details on what gl4es is doing, you can uncomment this line https://github.com/ptitSeb/gl4es/blob/master/src/gl/buffers.c#L11 (that may turn out to be too chatty, I don't know)
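To make the "memcpy with some sanity checks" description concrete, here is a minimal sketch of how a software-emulated buffer could handle the two calls. This is not gl4es's actual code: SoftBuffer, softBufferData and softBufferSubData are hypothetical names invented for illustration, and the real implementation may be structured quite differently.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an emulated buffer object (not a gl4es type).
struct SoftBuffer {
    std::vector<unsigned char> storage;
};

// Rough analogue of glBufferData: (re)allocate the storage and, if initial
// data is supplied, copy it in with memcpy.
bool softBufferData(SoftBuffer* buf, std::size_t size, const void* data) {
    if (!buf) return false;                       // sanity check: no buffer bound
    buf->storage.assign(size, 0);
    if (data && size > 0) std::memcpy(buf->storage.data(), data, size);
    return true;
}

// Rough analogue of glBufferSubData: copy into an existing allocation,
// refusing any range that falls outside it.
bool softBufferSubData(SoftBuffer* buf, std::size_t offset,
                       std::size_t size, const void* data) {
    if (!buf || !data) return false;              // sanity check: valid pointers
    if (offset > buf->storage.size() ||
        size > buf->storage.size() - offset) {
        return false;                             // sanity check: range in bounds
    }
    if (size > 0) std::memcpy(buf->storage.data() + offset, data, size);
    return true;
}
```

With checks like these, a bad range is rejected instead of crashing, which is why a segfault inside this path points at the caller's data (for example a dangling source pointer) rather than at the copy itself.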
Konsole just crashed again - I really need to start using screen or tmux.
It looks like disabling VBO was only for the GUI (the hud, menus), which explains why the performance didn't absolutely tank - it seems about the same.
"So the first crash is a false alarm?"
Maybe, it looks like I don't get the second crash with GUI VBO disabled, so I'd be happy with just a workaround for the first crash.
I'm using OSG 3.6.3, commit d011ca4e, which is from September last year, so maybe I should try master.
Annoyingly, the second crash is quite rare, so I need to play a lot before it crashes, and it doesn't seem to be very reproducible, so I'd better get my finger ready on the bound longsword hotkey...
I think [the second crash] could be a race between OSG and the OpenMW GUI - openmw-android has some patches for thread races, which I previously dismissed as just for keeping Android happy, but maybe they actually have a function.
I think there is some memory overwriting in the first crash. Local stack variables and parameters seem to have strange values, if I can trust the difference in the callstack between their @entry values and their current values.
A race condition can happen, yes. GL4ES is not thread safe for now. I should work on that, but I'm a bit afraid of the slowdown this will bring.
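For illustration, the simplest way to make a non-thread-safe layer safe is to serialize every call behind one lock, and that is also where the feared slowdown comes from. The sketch below only demonstrates that idea and says nothing about how GL4ES would actually do it: FakeGLState, SerializedGL and runThreads are hypothetical names made up for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state standing in for a wrapper's global GL state.
struct FakeGLState {
    long callCount = 0;
};

class SerializedGL {
public:
    // Run a callable while holding the lock, so only one thread touches
    // the shared state at a time.
    template <typename Fn>
    void call(Fn&& fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        fn(state_);
    }

    long callCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return state_.callCount;
    }

private:
    std::mutex mutex_;
    FakeGLState state_;
};

// Hammer the wrapper from several threads and return the final call count.
long runThreads(int threads, int callsPerThread) {
    SerializedGL gl;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&gl, callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i)
                gl.call([](FakeGLState& s) { ++s.callCount; });
        });
    }
    for (auto& w : workers) w.join();
    return gl.callCount();
}
```

Every call pays for a lock acquisition even when only one thread is rendering, so a coarse lock like this trades speed for safety.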
Okay, so now I've got the second crash caught in gdb, and konsole is still running.
Thread 9 "openmw" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xaa3ff130 (LWP 24568)]
osg::GLBufferObject::compileBuffer (this=0x9b4e5cf0) at /openmw/OpenSceneGraph/src/osg/BufferObject.cpp:134
(gdb) list
129
130 unsigned int bufferAlignment = 4;
131
132 unsigned int newTotalSize = 0;
133 unsigned int i=0;
134 for(; i<_bufferObject->getNumBufferData(); ++i)
135 {
136 BufferData* bd = _bufferObject->getBufferData(i);
137 if (i<_bufferEntries.size())
138 {
(gdb) p _bufferObject
$1 = (osg::BufferObject *) 0x0
(gdb) p *this
$2 = {<osg::GraphicsObject> = {<No data fields>}, _contextID = 0, _glObjectID = 4540, _profile = {_target = 0, _usage = 0, _size = 0}, _allocatedSize = 3072, _dirty = false,
_bufferEntries = std::vector of length 1, capacity 1 = {{numRead = 0, modifiedCount = 16777215, dataSize = 3072, offset = 0, dataSource = 0x788bd60}}, _bufferObject = 0x0,
_set = 0xa0d17168, _previous = 0x9a1cc8b0, _next = 0x0, _frameLastUsed = 8960, _extensions = 0xaa5006a8}
(gdb) bt 5
#0 osg::GLBufferObject::compileBuffer (this=0x9b4e5cf0) at /openmw/OpenSceneGraph/src/osg/BufferObject.cpp:134
#1 0xb6df4b4c in osg::DrawElementsUShort::draw(osg::State&, bool) const () from /usr/local/lib/libosg.so.158
#2 0xb6df4b4c in osg::DrawElementsUShort::draw(osg::State&, bool) const () from /usr/local/lib/libosg.so.158
#3 0xb6df4b4c in osg::DrawElementsUShort::draw(osg::State&, bool) const () from /usr/local/lib/libosg.so.158
#4 0xb6df4b4c in osg::DrawElementsUShort::draw(osg::State&, bool) const () from /usr/local/lib/libosg.so.158
(More stack frames follow...)
Not much more to see here without symbols for DrawElementsUShort::draw.
Here is DrawElementsUShort, from PrimitiveSet.cpp:
void DrawElementsUShort::draw(State& state, bool useVertexBufferObjects) const
{
    GLenum mode = _mode;
    #if defined(OSG_GLES1_AVAILABLE) || defined(OSG_GLES2_AVAILABLE) || defined(OSG_GLES3_AVAILABLE)
        if (mode==GL_POLYGON) mode = GL_TRIANGLE_FAN;
        if (mode==GL_QUAD_STRIP) mode = GL_TRIANGLE_STRIP;
    #endif

    if (useVertexBufferObjects)
    {
        GLBufferObject* ebo = getOrCreateGLBufferObject(state.getContextID());

        if (ebo)
        {
            state.getCurrentVertexArrayState()->bindElementBufferObject(ebo);
            if (_numInstances>=1) state.glDrawElementsInstanced(mode, size(), GL_UNSIGNED_SHORT, (const GLvoid *)(ebo->getOffset(getBufferIndex())), _numInstances);
            else glDrawElements(mode, size(), GL_UNSIGNED_SHORT, (const GLvoid *)(ebo->getOffset(getBufferIndex())));
        }
        else
        {
            state.getCurrentVertexArrayState()->unbindElementBufferObject();
            if (_numInstances>=1) state.glDrawElementsInstanced(mode, size(), GL_UNSIGNED_SHORT, &front(), _numInstances);
            else glDrawElements(mode, size(), GL_UNSIGNED_SHORT, &front());
        }
    }
    else
    {
        if (_numInstances>=1) state.glDrawElementsInstanced(mode, size(), GL_UNSIGNED_SHORT, &front(), _numInstances);
        else glDrawElements(mode, size(), GL_UNSIGNED_SHORT, &front());
    }
}
Any ideas?
With a NULL _bufferObject, not much can be done...
I guess you can simply change line 134 from for(; i<_bufferObject->getNumBufferData(); ++i)
to for(; _bufferObject && i<_bufferObject->getNumBufferData(); ++i)
to work around the issue, but it would be better to track down why a NULL buffer is being used here. Tracking it down may prove tricky, though, as you will probably need to put a (conditional) breakpoint in the assign method of BufferObject and check the callstack.
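As a self-contained illustration of the guarded loop suggested above, the sketch below reproduces the pattern with stand-in types. FakeBufferObject, FakeGLBufferObject, getBufferDataSize and computeTotalSize are hypothetical names, not OSG classes or methods. Only the loop condition mirrors the proposed change to line 134.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the owning buffer object (not osg::BufferObject).
struct FakeBufferObject {
    std::vector<unsigned int> sizes;
    std::size_t getNumBufferData() const { return sizes.size(); }
    unsigned int getBufferDataSize(std::size_t i) const { return sizes[i]; }
};

// Hypothetical stand-in for the GL-side object that may outlive its owner.
struct FakeGLBufferObject {
    FakeBufferObject* _bufferObject = nullptr;

    // Sum the sizes of all buffer data. The "_bufferObject &&" guard makes
    // the loop a no-op when the owner pointer is NULL instead of crashing.
    unsigned int computeTotalSize() const {
        unsigned int newTotalSize = 0;
        for (std::size_t i = 0;
             _bufferObject && i < _bufferObject->getNumBufferData(); ++i) {
            newTotalSize += _bufferObject->getBufferDataSize(i);
        }
        return newTotalSize;
    }
};
```

The guard only hides the symptom: the loop is skipped, but whatever cleared the pointer while the object was still in use remains unexplained, which is why tracking down the assignment is the better fix.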
@icecream95 for my builds of openmw for Android, I use latest OSG + openmw optimized patches (which you can find here.). It might fix it, but don't count on it.
By the way @terabyte25, what sort of crash was this MR meant to fix? It's only caused extra problems for me...
I just compiled OSG 78f6b293 (from the branch terabyte linked to) and OpenMW master, and I've now figured out the point of the "Fix red sky after shadows" patch of openmw-android. :)
It does look cool, I must admit...
Someone should tell the OpenMW UI designers that that font doesn't go well with the flat UI.
(or maybe I just rsync'd the OpenMW build in the wrong direction and stopped it a little too late...)
Bah, a cursive-like font for an Elder Scrolls game makes sense :p
Whoops, and now I ran out of space due to doing a RelWithDebInfo build.
Obviously, once it's built it won't crash anymore, or otherwise why would Sotha Sil be trying so hard to stop me?
It looks like the newer OSG and/or the OpenMW specific OSG patches have fixed the problem.
I might get round to bisecting it someday... probably... maybe...
As an added bonus, I seem to be getting better performance, as well as a lot less stuttering (shader compiling?) with the GLES 2 backend, so I'm now enjoying some terrain reflections on my water.
Nice! Thanks for the follow-up on this :)
I'm getting problems with OpenMW crashing every 30 minutes or so.
The problem is reproducible - when it crashes soon after saving, if I load that save and repeat the same actions it crashes again.
Here is a typical crash log:
In doing some investigation with gdb (this is a different OpenMW instance to the crash log above):
Looking at glstate->vao->pointers, all of the objects in that array with a pointer that dereferences to something have the enabled flag set to 0, and none of those with enabled 0 with a non-NULL pointer have memory that can't be accessed.
How should I proceed in debugging?
I know C, but am not very experienced in gdb.
I'm going to try compiling more of OpenMW and OSG with debugging symbols, to see if I can find out more about the issue...