kcat / alure

Alure is a utility library for OpenAL, providing a C++ API and managing common tasks that include file loading, caching, and streaming
zlib License

Invalid Source ID (getSampleOffsetLatency) #32

Closed nathanleighsays closed 5 years ago

nathanleighsays commented 5 years ago

I've been hunting a peculiar bug all evening that causes periodic segmentation faults when multiple sources are playing. The exit message is always "AL lib: (WW) Error generated on context 0x555555617890, code 0xa001, "Invalid source ID 0""

I loaded up my program in gdb, and it appears to be an issue with Alure, not al-soft. Apologies for any formatting weirdness, the program I'm writing uses ncurses so the formatting in gdb is all over the place. I tried my best to clean it up.

Thread 14 "ncue" received signal SIGSEGV, Segmentation fault.
Switching to Thread 0x7fffe4a46700 (LWP 1136)
0x00007ffff7e4c61e in alure::SourceImpl::getSampleOffsetLatency() const () from /usr/local/lib/libalure2.so
Single stepping until exit from function _ZNK5alure10SourceImpl22getSampleOffsetLatencyEv, which has no line number information.

On another crash, I got the following:

Thread 11 "ncue" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffcaffd700 (LWP 23024)]
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:500
500 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.

kcat commented 5 years ago

How many sources do you mean by "multiple"? Just a few, a few dozen, or a few hundred? An attempt to use source ID 0 indicates either that a check for a non-"allocated" source is missing, or that a source was preempted by another and still tried to use its source ID despite no longer having one.

Are you able to make a debug build of Alure and get backtraces from that? Also, is the source for your program available (at least the portion that uses Alure)?

nathanleighsays commented 5 years ago

It seems to be with 3 or more.

I can try to make a debug build of Alure. I admit, I'm very much a novice programmer, so I don't totally know how, but I'll look around and see if I can figure it out. This project itself started as something to teach myself C++ (as such the source code is a total mess and full of works-in-progress, dead-end-ideas and inconsistent formatting as I realized a better way to do things mid-way through and haven't gotten around to going back and fixing earlier pieces, but I'm happy to share it in its current state).

The offending bits are during the go_sound function. Every attempt to comment out various lines between creating and destroying the source has had no impact as best I can tell. The only thing I can figure out is that it seems to occur most often when one sound ends as another playing in another thread calls ctx.update().

It's a theatrical sound playback program, so there are some unconventional uses of Alure I'm sure, since the end user is ultimately the one choosing the sounds that get played and where, when, how long and how often they're triggered.

I built with the latest openal-soft last night and still got the issue.

ncue.zip

kcat commented 5 years ago

> The offending bits are during the go_sound function. Every attempt to comment out various lines between creating and destroying the source has had no impact as best I can tell. The only thing I can figure out is that it seems to occur most often when one sound ends as another playing in another thread calls ctx.update().

I haven't been able to take a good look at your source yet, but this statement brings up a possible problem. Alure isn't really thread-safe. If you want to use it from multiple threads, you need to use a mutex or something similar to ensure only one thread is calling into Alure at a given time.

nathanleighsays commented 5 years ago

Ahhhh. That actually makes a lot of sense. When it's only one thread it's stable as a rock, but that would explain why it happens when multiple threads are running. I'll play around and see if I can work around it. Thank you!

nathanleighsays commented 5 years ago

Looking into how I'd implement a mutex: is it just the ctx.update() calls that need to be protected, or will I need to protect all calls to buffers and sources as well? I have a lot of source calls that need to be pretty precise timing-wise, though I think I can use the source's offset clock to keep things synced if I need to; it would just mean re-thinking some elements. And I guess I need to re-work how I deal with buffers anyway, since it's pretty clunky.

And thank you, by the way, for talking me through this. Alure is the first cross-platform audio library I've found that feels beginner-friendly, and just the process of reading through the headers to figure out how to do what I want to do has taught me more than any tutorial I've followed.

kcat commented 5 years ago

All buffers and sources too. Alure does multiple things per call, so it can't rely on OpenAL's own thread-safety. If Alure were thread-safe, it would essentially need to use a mutex internally; by having the caller do the locking instead, a single mutex can cover multiple calls as needed.

nathanleighsays commented 5 years ago

OK. So I need to re-write some of my syncing code to handle syncing over long music cues, but it appears that adding mutexes has completely solved the issue. Thank you! I'll consider this issue closed.