karlheyes / icecast-kh

KH branch of icecast
GNU General Public License v2.0
300 stars 107 forks source link

Crashing with relay and fallback usage #393

Closed brane0690 closed 1 year ago

brane0690 commented 1 year ago

v17.1 is crashing when relay and fallback file are used together. It won't usually crash immediately. It will crash usually when fallback action happens or client tries to connect while fallback file is active. If relay is set "on-demand 1" it will crash faster, but also with "on-demand 0" it will crash after some time on fallback action. This wasn't problem on v15. I didn't tested other versions. Systems tested: Centos 7 and Rocky 8 - same problem.

gunsar commented 1 year ago

try to test the latest version https://github.com/karlheyes/icecast-kh/releases/tag/icecast-2.4.0-kh18

brane0690 commented 1 year ago

I've tested again.

kh15 - works!

kh17.1 - crash! kh18 - crash!

kernel: icecast[54473]: segfault at 10 ip 00007efec3893570 sp 00007efebfa1ab78 error 6 in libpthread-2.17.so[7efec3887000+17000]

It mostly happens when client tries to listen while fallback file is looping.

I always reproduce error with this steps:

  1. mountpoint A is relaying other server B.
  2. Disconnect server B from source.
  3. mountpoint A is now looping fallback file.
  4. Client tries to connect to mountpoint A.
  5. Client listens looping fallback file
  6. Client disconnects and wait cca 10 seconds
  7. Client tries to to connect to mountpoint A again.
  8. Timeout
  9. Client tries again - success.
  10. Client disconnect and connect - server crash!

This happens in 70% of tries.

karlheyes commented 1 year ago

A bit vague without detail of the error log. I presume point 7/8 is the relay trying to reconnect and point 9 is is acquiring the fallback file?

karl

brane0690 commented 1 year ago

Since point 2 relay is trying to reconnect, all the time. I don't connect source on remote server B anymore. Yes, at point 9 the client is listening the fallback file.

Interestingly, at point 8, version kh15 also experiences delay in response when client tries to listen, but it doesn't timeout (and doesn't crash server later). Seems like it was sleeping for a few seconds.

karlheyes commented 1 year ago

Can you recheck with kh18.1. I want to rule out some other issue first. I have tried what you suggested a few times and it has not crashed on me but it could be dependant on load and setup of software/hardware. If it still occurs, do a make debug and then run catchsegv icecast-c .... and try again, let me know the output.

karl

brane0690 commented 1 year ago

I've tested kh18.1 - same crash I didn't manage to use catchsegv... once Icecast crashes, /tmp file disappears. I've also tested kh16 - doesn't crash, but it doesn't play fallback file at all. So the last working version is kh15.

karlheyes commented 1 year ago

can you get any sort of core dump at all? just something to indicate where the problem faults, even running it in gdb. It does look like catchsegv has been removed from later glibc which is annoying. Do a make clean debug before running in gdb (gdb --arg icecast -c ...xml) run and on crash you can use gcore

gunsar commented 1 year ago

@karlheyes

today I just felt something strange on kh18, when the server was relayed off and then on again, the audio couldn't be played. after looking at the icecast status, the audio type changes automatically to video/mp4, even though the relay is audio/aac

Screenshot 2023-02-08 073909

**edit after restarting the icecast service, then the relay runs normally again. (I didn't make a follback mount, only relay)

karlheyes commented 1 year ago

hmm, not heard of that one before. It can be possible for the mpeg parsing to detect something like mp4, but I don't believe it would manifest in setting the stats as such. Only as, oh it's different, hmm looks like mp4 now, I won't parse and allow it to feed through unaligned. Not a combo I've tried so I would need to see the log level 4 to see in more detail. You may be seeing m4a but hard to say from just this.

gunsar commented 1 year ago

I'm also surprised this is the first time I've experienced something like this. it's very strange that it can be read as an MP4 video and the audio is stuck, even though the relay on icecast status is still readable, even though the original is relayed by AAC. after restart icecast-service back to normal. the relay server uses icecast-kh17

gunsar commented 1 year ago

log info :

[2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /test, len 1400, unprocessed 1365 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /stream, len 1400, unprocessed 1365 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /test, len 1400, unprocessed 2126 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /stream, len 1400, unprocessed 2126 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /test, len 1400, unprocessed 3126 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /stream, len 1400, unprocessed 3126 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /test, len 1400, unprocessed 4126 [2023-02-08 07:25:59] WARN format-mp3/validate_mpeg source /stream, len 1400, unprocessed 4126 [2023-02-08 07:26:00] INFO mpeg/check_for_aac Detected AAC MPEG-4, 22050Hz 2 channel(s) 64 kbps on localhost [2023-02-08 07:26:00] INFO mpeg/check_for_aac Detected AAC MPEG-4, 22050Hz 2 channel(s) 64 kbps on localhost

karlheyes commented 1 year ago

can you let me know what the state of these issues are with kh18.3

karl

gunsar commented 1 year ago

can you let me know what the state of these issues are with kh18.3

karl

I'll try it, I'll let you know if there's a problem. Thank You

gunsar commented 1 year ago

@karlheyes audio relay crashes or fallbacks don't happen again in version kh18.3, but for relays after restarting the VPS the metadata is not readable on the player but the audio runs normally. after restarting the icecast service, the metadata reads normally again on the player.

karlheyes commented 1 year ago

can you send me the error log level 4 over the VPS restart? I want to get a handle on what your setup is and what the conditions presenting are. metadata should still be requested on a relay restarting unless explicitly disabled in the xml. I've tried here on restarting a master where the relay has a delay and the metadata comes through.

karl

gunsar commented 1 year ago

I tried again: the icecast relay from a different VPS server is correct for the metadata delay

icecast relay from the same VPS (local relay), metadata doesn't appear after restarting the vps even though it's a long wait, the mountpoint name still appears on the player. after restarting icecast service back to normal

karlheyes commented 1 year ago

that does not tell me much. is the metadata coming in to icecast or not, does it appear on the status page? is the player reconnecting at all? is the player being bounced to another mountpoint when the VPS source disappears. Does the relay retry and if so when. Are there any errors/warnings in the parsing. These are all key aspects that need to be made clear.
What is the difference between the 2 servers you tried? what is the long wait?

karl

gunsar commented 1 year ago

Two different servers means: The icecast that I use on this server (server A) relays audio from icecast from a different place (server B). this metadata runs fine after restarting the VPS which I did the v18.3 experiment, only the metadata waited a few seconds to appear.

while from the same server (Server A) audio relay from icecast on (server A) too or relay from localhost. after the VPS restarts the metadata doesn't appear. I waited for a long time (more than 10 minutes) the metadata still didn't appear. after restarting the service, the metadata appears properly without having to wait long. relay from this same server vkh18.3 (icecast-kh standalone) relay audio from vkh18 (icecast-kh from azuracast docker)

gunsar commented 1 year ago

[2023-03-02 12:12:15] INFO main/server_process Icecast 2.4.0-kh18.3 server started [2023-03-02 12:12:15] INFO client/workers_adjust requested worker count 1 [2023-03-02 12:12:15] INFO client/worker_start starting incoming worker thread 0x557806085cc0 [2023-03-02 12:12:15] INFO client/log_commit_thread started [2023-03-02 12:12:15] DBUG slave/update_relays assess relay /stream [2023-03-02 12:12:15] DBUG slave/relay_copy copy relay /stream at 0x5578060863a0 [2023-03-02 12:12:15] DBUG slave/relay_installed adding relay client for /stream (0x5578060863a0)

no error

karlheyes commented 1 year ago

vps restarts?, you mean the feed for the stream ends and how are the mountpoint configured? are there any fall backs? Your error log says nothing about a stream ending or timing out or any listeners on the different streams

assuming you have your listener on /stream and /stream relays another mountpoint eg /feed on the same icecast, whats is /feed, another relay and are there any fallbacks listed on either of these mountpoints. If none are defined and /feed ends then any listeners on /stream should also end. If /feed is off and reconnects after a while is that getting any metadata correctly? Are either of these an on-demand relay? I presume they are not a multi-host relay. The more info I can see on the setup and what you are experiencing then the more I can nail down the issue.

karl.

gunsar commented 1 year ago

these are my settings in the same VPS

Icecast kh18 (AzuraCast) Stream URL https://sub.domain.com:8000/stream

Icacast kh18.3 (LinuxVersion) Stream URL https://sub.domain.com:8600/test icecast.xml settings:

<relay> <server>localhost</server> <port>8000</port> <mount>/stream</mount> <local-mount>/test</local-mount> <relay-shoutcast-metadata>1</relay-shoutcast-metadata> </relay>

<mount> <mount-name>/test</mount-name> <fallback-mount>/fallback-[64].aac</fallback-mount> <fallback-override>1</fallback-override> <charset>UTF-8</charset> <stream-name>Test Relay ReStream</stream-name> <stream-description>Test Relay ReStream</stream-description> <stream-url>https://sub.domain.com</stream-url> <genre>Top40</genre> <bitrate>64</bitrate> </mount>

<mount> <mount-name>/stream</mount-name> <fallback-mount>/test</fallback-mount> <fallback-override>1</fallback-override> <charset>UTF-8</charset> <stream-name>Test Relay ReStream 2</stream-name> <stream-description>Test Relay ReStream 2</stream-description> <stream-url>https://sub.domain.com</stream-url> <genre>Top40</genre> <bitrate>64</bitrate> </mount>

After restarting the VPS, no metadata is displayed, but the audio is successfully relayed and the sound matches the audio from port 8000 (azuracast). but after restarting icecast-service LinuxVersion the metadata is showing

karlheyes commented 1 year ago

that looks like an ugly loop type of thing there. You are relaying a mountpoint with a fallback to the mountpoint you are relaying to and have override settings. I'll have to run some tests to check that path specifically but multiple mountpoints with common fallbacks were never a reliable use case and here you have a possible relay loop. Separate isolated fallbacks are what people should use in such cases.

karlheyes commented 1 year ago

@gunsar the master should now have the fix for no metadata appearing. The issue was that it was starting with a fallback to file and the full initialisation was missing. The issue of the relay loop is still there but you should not expect that to work at all.

karl.

gunsar commented 1 year ago

@karlheyes I tried it new master f0eb586 , it's running as expected, it looks like the reading of the fallback file is now not placed at the front, so the metadata is automatically relayed. the experiment was successful, thank you.

gunsar commented 1 year ago

a few minutes later, sometimes the audio crashes and stops for a while. but after restarting the service it returns to normal, how many seconds does it actually take to read the fallback file, fallback mount and relay alternately?

gunsar commented 1 year ago

I asked how many seconds above, because the relay sometimes drops and the audio immediately changes to the play fallback file, even though the relay server is from localhost (in the same 1vps)

karlheyes commented 1 year ago

the fallback file just has a initial scan for format detection once at the start, it does not read all of it or keep a copy in memory. Are you still attempting the relay bouncing thing as that won't work well enough. A separate fallback file reference on the secondary mount (which could be a symlink to the same file) will stop the reading end getting thrown back onto itself.

gunsar commented 1 year ago

ok i will try to add fallback with different file and different mount.

gunsar commented 1 year ago

is it maybe something to do with the cors problem? I really feel strange, why does the localhost relay but sometimes it disconnects from the relay and gets stuck which ends up reading the fallback file. even though CPU RAM usage is still normal 20%

gunsar commented 1 year ago

Now setting (add mount autodj, autodj with liquidsoap) :

<relay> <server>localhost</server> <port>8000</port> <mount>/stream</mount> <local-mount>/test</local-mount> <relay-shoutcast-metadata>1</relay-shoutcast-metadata> </relay>

<mount> <mount-name>/autodj</mount-name> <fallback-mount>/fallback.mp3</fallback-mount> <fallback-override>1</fallback-override> </mount>

<mount> <mount-name>/test</mount-name> <fallback-mount>/autodj</fallback-mount> <fallback-override>1</fallback-override> <charset>UTF-8</charset> <stream-name>Test Relay ReStream</stream-name> <stream-description>Test Relay ReStream</stream-description> <stream-url>https://sub.domain.com</stream-url> <genre>Top40</genre> <bitrate>64</bitrate> </mount>

<mount> <mount-name>/stream</mount-name> <fallback-mount>/test</fallback-mount> <fallback-override>1</fallback-override> <charset>UTF-8</charset> <stream-name>Test Relay ReStream 2</stream-name> <stream-description>Test Relay ReStream 2</stream-description> <stream-url>https://sub.domain.com</stream-url> <genre>Top40</genre> <bitrate>64</bitrate> </mount>

gunsar commented 1 year ago

now a bit running normally. the fallback file I added is an .mp3 file not an .aac file. I thought because relay uses aac+ so follback also has to add an aac file. from monitoring now it's still normal.

karlheyes commented 1 year ago

I did an extra clear out of the relay if the relay is in a delayed retry state, just in case people switch formats in-between so it would be possible to change between mp3 and aac without issue in those cases, the issue is purely the format changing mid-stream.

gunsar commented 1 year ago

very true that the format changes when the stream runs after changing mounts. now running normally for metadata after restart vps. follback mount from autodj seems to be better with the autodj settings the same as the stream format from the relay. Thank You

karlheyes commented 1 year ago

kh20 out now. Will close this for now.