Open DroidU opened 1 year ago
A couple of other lock imbalances are fixed; the master should be ok for that.
karl
kh20.1: with a single connection to the server it works fine, but with multiple parallel connections it still does not respond after a reload.
Did you take it from the master tree or kh20.1?
karl
I downloaded it from: https://github.com/karlheyes/icecast-kh/tree/icecast-2.4.0-kh20.1 -> Code -> Download ZIP
This is working fine.
@karlheyes please rebuild/release an update of the new master for the Windows version and I will give it a try. I checked https://sourceforge.net/projects/icecastkh/ and it still has the old kh20 version (from 7 days ago, before the latest master update). Thank you.
I've cut a pre-release kh20.2 with windows builds as well.
karl.
thanks, I'll try it
@karlheyes when downloading the icecast-2.4.0-kh20.2 windows64 version there is a warning about a dangerous file. Why is that?
20.3 is up with a bunch of fixes. It includes worker changes, so it is a more critical update, but it looks to be solid and improves things in certain areas.
No idea about the dangerous-file message. The DLL files are shipped with the distribution and haven't changed since December, and the binary is built by me.
karl
My test results: 20.3 for Linux, Windows, and AzuraCast (Ansible) runs fine, though there is a problem with the Icecast Directory; the Linux/AzuraCast Ansible versions run normally after updating from master. On CentovaCast, 20.3 crashes and the station goes offline after 4-6 hours; I don't know what the problem is. Version 20.1 is normal, but versions 20.2 and 20.3 crash and the station suddenly goes offline. The kh17 version is still the best for CentovaCast.
Unfortunately, it is the same for us. kh20 sometimes crashes and I can't reproduce it. We use MSCP Pro, where the Icecast2 feature set is fully utilized, including OggFLAC support. Unfortunately, that makes bugs more likely to appear.
Can I get a sitrep on how things stand with 20.5? Windows binaries are uploaded as well.
karl
I have now installed 20.5 and will monitor this trial version on Linux, Windows, AzuraCast, and Centova. Thanks Karl.
After a 3-day trial period, Linux + Windows (icecast-2.4.0-kh20.5) run normally without problems. Testing on Centova + AzuraCast Ansible (icecast-kh master) also runs normally without problems. Thank you @karlheyes, I will continue to monitor.
kh21 is up.
karl
@karlheyes after a 2-week trial of kh21, it turned out the relay problem is present again; the incident repeats just like in the kh20 version. After the source server relay drops and comes back several times, the server stops relaying and Icecast goes down. I experienced the same thing in trials on AzuraCast and Centova. Going back to kh17 returns things to normal.
@karlheyes Hi Karl, we are also seeing similar behavior during fallback on the relay master server. It seems the Icecast service is running only partially (logs are generated, but the web services, such as the admin interface and connections over HTTP, are not available; it gets stuck and the CPU load goes to 100%). An Icecast service reload doesn't help and we need to restart the service to bring it up again.
Many thanks for your support.
@onur58 what version are you using? Have you tried version 21.2? I tested on one VPS and 21.2 still runs normally with regard to CPU and RAM.
@gunsar We are using the latest official release, 2.4.0-kh21.0.
@onur58 maybe you can just try updating to version 21.2 or 21.4.
@gunsar is the crash behavior fixed in 21.2 or 21.4? Your last comment mentioned a rollback to kh17 :-)
I'll be cutting a 21.5 shortly with a reload fix from 21.4; the rest relates to Windows test runs.
karl
> @gunsar the crash behavior is fixed in 21.2 or 21.4.? your last comment was rollback to kh17 :-)
@onur58
That's right: when I was experimenting it was still the kh21.0 version, and I went back to kh17 because there were lots of problems. As soon as I saw that @karlheyes had made improvements in kh21.2, I tried it, and it has been running normally until now on Icecast Linux and Centova panels. For AzuraCast I tried kh21.3 and it also still runs normally. Now that kh21.5 is out, I will try the Windows version.
@karlheyes many thanks for the quick fix. I will deploy 21.5 tonight @gunsar thank you also for your support :-)
@karlheyes Hi Karl, I deployed 21.5 last week. Over the weekend we had a network issue on the source, so we could not fall back on the relay master server (both sources were unreachable from the relay master). In this scenario the edge Icecast servers try to connect to the relay master, and for a few minutes the edge Icecast servers are unreachable (Icecast web service socket timeout). When I checked the service, there had been no reload or restart of the Icecast service, and the load and CPU usage were not high.
I saw that in the meantime you released kh21.6; what do you mean by "expand on the relay switchover failure case handling"?
many thanks for your support.
The relay switchover code applies when you define multiple hosts in a relay, eg
```xml
<relay local-mount="/stream" on-demand="no">
    <host priority="1" ip="127.0.0.1" port="12000" mount="/internal" />
    <host priority="2" ip="192.168.1.10" port="7000" mount="/stream" />
</relay>
```
Switchover occurs when the feed switches because the stream terminates or a higher-priority stream comes back online. Odd situations can be difficult to resolve; in this case it was when the higher-priority host was available but suffered a problem such as a timeout, or simply did not last long enough, like a short file. While the higher-priority streams are rechecked every few seconds, I made it so that such a failure keeps that host out for 10 minutes. The next step is to make that configurable in the XML. This is obviously separate from any fallback handling, but works in a way not too dissimilar to a fallback overall.
Your description of the problem sounds more like a lock imbalance being triggered, but there is not much to go on regarding the trigger event. There was a crash bug if auth was used, but that does not sound like what you experienced. A lockup case might be identified if you grab a core file so backtraces can be acquired (eg gcore).
karl
> The relay switchover is code when you define multiple hosts in a relay eg
> `<relay local-mount="/stream" on-demand="no"> <host priority="1" ip="127.0.0.1" port="12000" mount="/internal" /> <host priority="2" ip="192.168.1.10" port="7000" mount="/stream" /> </relay>`
This is very good, because I also tried relaying locally (on one VPS) to make a multi-Icecast setup (Icecast on Linux + AzuraCast in Docker, with the local Linux Icecast relaying from the AzuraCast Docker). Now I will try again with kh21.6. Thanks Karl.
@karlheyes @gunsar
Thanks guys for the support. Our relay settings look like:

```xml
<relay>
    <server>10.1.1.1</server>
    <port>80</port>
    <mount>/radio1/mp3_128</mount>
    <local-mount>/s/radio1/mp3_128</local-mount>
    <retry-delay>10</retry-delay>
    <on-demand>0</on-demand>
</relay>
<relay>
    <server>10.1.1.2</server>
    <port>80</port>
    <mount>/radio1/mp3_128</mount>
    <local-mount>/m/radio/mp3_128</local-mount>
    <retry-delay>10</retry-delay>
    <on-demand>0</on-demand>
</relay>
<mount>
    <charset>ISO8859-1</charset>
    <mount-name>/m/radio1/mp3_128</mount-name>
    <fallback-mount>/s/radio1/mp3_128</fallback-mount>
    <fallback-override>1</fallback-override>
    <max-listener-duration>68400</max-listener-duration>
</mount>
<!-- endmount -->
```
I am not so familiar with gcore and gdb but I got the following output:
It seems there were 5 threads, but no idea why...
What I also identified is that CPU usage with kh21.5 is approx. 20% higher than with kh13.
Example with same hardware but different kh versions:
Many thanks for your feedback in advance.
The command is `gdb /usr/bin/icecast core.xxxx`, then in gdb `thread apply all bt`. There will be multiple threads: you have at least 2 workers, plus a few other threads depending on the exact build, the configuration, and the state at the time (eg auth threads are created/destroyed on the fly). A clearer build would be one made with `make debug`; it helps with the code references, as optimization can mix things up. A lockup would show up in such traces.
A lot of changes between kh13 and 21.5. unsure on big changes for the cpu on an IO bound app but some aspects can be heavier like xsl, or ssl. Might be good to check. Could be log related, maybe something is just heavy on a busy loop type of thing and needs adjusting.
You could do an `strace -tt -o output -ff -p $(pidof icecast)`,
Ctrl-C after, say, 30 seconds, and send the resulting output files to me.
karl.
Thanks for the strace. I think I have an idea of what is going on there and will look into it.
karl.
Hi Karl, many thanks for your support :-)
I've committed a limiter change to the master tree for the client handling, to prevent excessive processing by some client handlers, which will be the cause of the CPU issue. It may need other changes as I haven't stressed it enough yet, but it should be the bulk of it.
I'll be cutting another pre-release shortly, once I get some feedback on something else, so another update will be out in the next day or so. You can try the master tree if you want something right now.
karl
Hi Karl,
That sounds good to me. I will wait for your new pre-release. Many thanks for your support.
Reproduction: