karlheyes / icecast-kh

KH branch of icecast
GNU General Public License v2.0

UTF8 data probably incorrect charcter set #411

Closed onur58 closed 1 year ago

onur58 commented 1 year ago

Hi

We are facing charset issues with 2.4.0 KH20 (v.19).

It doesn't display umlauts such as "ä", "ö", "ü", or French accented characters:

WARN stats/stats_set_conv seen non-UTF8 data, probably incorrect charcter set (title, Radio Musikwelle - Die schnsten Melodien fr Sie

Wrong title: title, Radio Musikwelle - Die schnsten Melodien fr Sie

Correct title should look like: title, Radio Musikwelle - Die schönsten Melodien für Sie

I already configured the charset in the mount settings, as in this example, but it doesn't help:

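    <mount>
        <mount-name>/r2/radio2/mp3_128</mount-name>
        <fallback-mount>/r1/radio2/mp3_128</fallback-mount>
        <fallback-override>1</fallback-override>
        <charset>UTF-8</charset>
    </mount>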

Do you have any idea?

Many thanks for your support :-)
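
The symptom above can be reproduced outside Icecast. The following is a minimal Python sketch, not Icecast's own code, and it assumes the title arrives as ISO-8859-1 (Latin-1) bytes while the receiver expects UTF-8. The helper name `is_valid_utf8` is made up for illustration.

```python
# Illustrative only: this is not Icecast's code, just a reproduction of the symptom.
title = "Radio Musikwelle - Die schönsten Melodien für Sie"

# Assumption: the sender encodes the title as ISO-8859-1 (Latin-1).
raw = title.encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The Latin-1 bytes 0xF6 (ö) and 0xFC (ü) are not valid UTF-8 on their own,
# so a receiver that expects UTF-8 rejects the title.
print(is_valid_utf8(raw))

# Dropping the invalid bytes produces the damaged title seen in the log.
stripped = raw.decode("utf-8", errors="ignore")
print(stripped)

# Decoding with the charset the sender actually used restores the title.
repaired = raw.decode("latin-1")
print(repaired)
```

If the bytes are decoded with the charset the sender really used, the umlauts survive, which is what the `<charset>` mount setting is meant to achieve.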

karlheyes commented 1 year ago

do you know what charset the incoming title is in? the default is to assume utf8, but if you are providing something different and there is no recognised way of signalling that it is different, then you can get this type of problem.

determine whether it is a relay or a source, and what the content is there. Obviously utf8 can represent any character, but historically many clients have assumed something like latin1

karl.

onur58 commented 1 year ago

Hi Karl

Many thanks for your feedback. It's a relay server which connects to a source Icecast server in a different location. We have the same setup and config with Icecast 2.4.0 KH10, and it works fine without adding a charset in the config for MP3 and AAC streams, so this is really strange to us. Here is an example from the config file:

```xml
<relay>
    <server>_IP of Source Server_</server>
    <port>8000</port>
    <mount>/rr/mp3_128</mount>
    <local-mount>/r1/rr/mp3_128</local-mount>
    <retry-delay>10</retry-delay>
</relay>
<relay>
    <server>_IP of Source Server_</server>
    <port>8000</port>
    <mount>/rr/mp3_128</mount>
    <local-mount>/r2/rr/mp3_128</local-mount>
    <retry-delay>10</retry-delay>
</relay>
<mount>
    <mount-name>/r2/rr/mp3_128</mount-name>
    <fallback-mount>/r1/rr/mp3_128</fallback-mount>
    <fallback-override>1</fallback-override>
    <!-- Tested with <charset>ISO8859-1</charset> and <charset>UTF-8</charset>: got the same result -->
</mount>
```

Should we modify on the source server or somewhere else in the config file?

Many thanks for your support.

karlheyes commented 1 year ago

assuming kh10 was fine then use a charset of ISO8859-1 as that was the default at that time. The metadata was never changed via the icy block so that would be the same. What platform is this on?

It may also be possible that the source client is sending a charset setting in the metadata url request which overrides the default mount setting.

karl
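
For reference, the "charset setting in the metadata url request" mentioned above is the `charset=` query parameter that the Icecast documentation describes for the `/admin/metadata` update URL. Below is a hedged Python sketch of how a source client might build such a URL. The host, port, and function name are placeholders, and whether a given client sends this parameter is something to check on the source side. A real request also needs the source or admin credentials, which are omitted here.

```python
from urllib.parse import urlencode


def metadata_update_url(host: str, port: int, mount: str, song: str, charset: str) -> str:
    """Build a metadata update URL that declares the charset of `song`.

    Hypothetical helper: `charset` is used both as the label sent to the
    server and as the codec used to percent-encode the title bytes.
    """
    query = urlencode(
        {"mount": mount, "mode": "updinfo", "song": song, "charset": charset},
        encoding=charset,
    )
    return f"http://{host}:{port}/admin/metadata?{query}"


# Placeholder values for illustration only.
url = metadata_update_url(
    "localhost", 8000, "/r1/rr/mp3_128", "Die schönsten Melodien", "ISO8859-1"
)
print(url)
```

With `ISO8859-1`, the "ö" is sent as the single byte `%F6`. If the URL declared UTF-8 instead, the same character would be sent as `%C3%B6`.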

onur58 commented 1 year ago

Hi Karl, sorry for my late response (happy Easter! :-)). If we assume that the default charset in previous versions such as kh10 was ISO8859-1, then the old behaviour should be reproducible. I extended the mount settings with <charset>ISO8859-1</charset>, but the result was the same.

Our Setup which works fine so far:

(1) MUXIP Signal Encoder for AAC and MP3 (live streams) --->
(2) Source Master with local mounts like stream1.aac and stream1.mp3 (Icecast 2.4.4, not kh, on Linux) --->
(3) Master relay Server with relay mounts from Source Master (2.4.0 kh10 on Linux) --->
(4) 10x Icecast Edge Server, which we named "LSA CDN", relaying from Master relay Server (2.4 kh10 on Linux)

What I have done so far is update points (3) and (4) to version 2.4.0 KH20, with the charset set to ISO8859-1 and also to UTF-8, but the result was the same: "WARN stats/stats_set_conv seen non-UTF8 data, probably incorrect charcter set". I got this message with the ISO8859-1 charset as well.

Therefore I assume that the charset settings are not being applied on points (3) and (4). What would you do next? :-) Or where can I verify the charset of the incoming ICY metadata?

Many thanks again for your great support Karl.

BR, Onur
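
One way to check the incoming ICY metadata charset is to look at the raw bytes of a metadata block. The Python sketch below assumes the usual ICY layout: one length byte (value times 16), followed by NUL-padded text such as `StreamTitle='...';`. All function names are made up for illustration, and `make_block` only builds sample data. In a real test the block would be read from the stream after requesting it with the `Icy-MetaData: 1` header and skipping `icy-metaint` bytes of audio.

```python
def parse_icy_title(block: bytes) -> bytes:
    """Return the raw StreamTitle bytes from one ICY metadata block."""
    length = block[0] * 16                      # first byte = payload length / 16
    payload = block[1:1 + length].rstrip(b"\x00")
    prefix = b"StreamTitle='"
    start = payload.find(prefix)
    if start == -1:
        return b""
    start += len(prefix)
    end = payload.find(b"';", start)
    return payload[start:end] if end != -1 else payload[start:]


def guess_charset(raw: bytes) -> str:
    """Rough guess: ASCII, valid UTF-8, or something else such as Latin-1."""
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (possibly latin-1)"


def make_block(title: bytes) -> bytes:
    """Build a sample metadata block for testing (not part of any real client)."""
    payload = b"StreamTitle='" + title + b"';"
    units = -(-len(payload) // 16)              # round up to a multiple of 16
    return bytes([units]) + payload.ljust(units * 16, b"\x00")


latin1_block = make_block("Die schönsten Melodien".encode("latin-1"))
utf8_block = make_block("Die schönsten Melodien".encode("utf-8"))

print(guess_charset(parse_icy_title(latin1_block)))
print(guess_charset(parse_icy_title(utf8_block)))
```

A title that fails the UTF-8 check here is what triggers the "seen non-UTF8 data" warning on a server that assumes UTF-8.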

karlheyes commented 1 year ago

just to get some clarification. At which point do you insert metadata and how is that configured wrt charset?

I suspect if latin1 is being injected at 2.4.4 and set to expect latin1 then latin1 is coming via icy to point 3. I'll have to check what 2.4.4 does exactly in such cases. kh20 is probably not expecting that from the icy block.

karl

onur58 commented 1 year ago

Hi Karl,

The Source Master, point (2), uses latin1 by default in 2.4.4 (not KH). That would mean that our Master Relay (3) and Icecast Edges (4) assume UTF-8 by default in 2.4.0 KH20 and cannot handle it. So my main question is: where and how can I change the default charset for our Master Relay (3) and Icecast Edges (4) from UTF-8 to latin1? I found this comment from you, https://github.com/karlheyes/icecast-kh/issues/249#issuecomment-614984263, but it's not clearly defined (sorry, I am not a developer) :-)

I tested with <charset>ISO8859-1</charset> in mount section of Master Relay (3) and Icecast Edges (4). I will also share the icecast.xml config of all nodes here:

Icecast Source Master (Point 2)

1000 150 5 524288 30 15 10 1 65535 xyz xyz xyz xyz hostname of the Source Master Icecast xyz 8000 127.0.0.1 8000 IP of the Source Master Icecast /stream1 4 10.153.5.244 8000 /stream1_Room1 /stream1/mp3_128

Icecast Master Relay (Point 3)

xyz streaming.xyz servicedesk@xyz 80 xyz xyz xyz 3700 1000 10 10 76459 0 xyz xyz /var/log/icecast /usr/share/icecast/web /usr/share/icecast/admin /var/run/icecast.pid access.log error.log 3 10240 IP of primary source Server 8000 /stream1/mp3_128 /r1/stream1/mp3_128 10 IP of secondary source Server 8000 /stream1/mp3_128 /r2/stream1/mp3_128 10 /r2/stream1/mp3_128 /r1/stream1/mp3_128 1 ISO8859-1 --> not working

Icecast Edge Server (Point 4)

xyz streaming.xyz servicedesk@xyz 80 443 1 xyz xyz xyz 8000 1000 5 5 76459 0 xyz xyz /var/log/icecast /usr/share/icecast/web /usr/share/icecast/admin /var/run/icecast.pid /etc/pki/tls/certs/xyz.pem access.log error.log 3 10240 Ip of primary Icecast Master Relay 80 /r1/stream1/mp3_128 /s/stream1/mp3_128 10 0 Ip of secondary Icecast Master Relay 80 /r2/stream1/mp3_128 /m/stream1/mp3_128 10 0 /m/stream1/mp3_128 /s/stream1/mp3_128 1 68400 ISO8859-1 --> not working

karlheyes commented 1 year ago

do you have a stream I can relay locally to check against?

karl

onur58 commented 1 year ago

Hi Karl, you should receive an e-mail with the test streams.

Many thanks for your support.

BR, Onur

karlheyes commented 1 year ago

ok, that looks to be fixed now. I've committed that fix into the master tree. Broke in kh17 from the look of it, but was a minor fix

onur58 commented 1 year ago

Many thanks for the minor fix, Karl, but the problem is the same for the AAC stream. Will your change also fix the charset for AAC streams, and will you create a new version like kh21?

karlheyes commented 1 year ago

aac and mp3 are handled in the same routine, so the charset handling issue would apply in the same way. I'm waiting on feedback from something else, and if that is ok then kh21 will be released

karl

onur58 commented 1 year ago

Hi Karl, that sounds really good :-) Many thanks again for your support.

karlheyes commented 1 year ago

kh21 is up

karl.

onur58 commented 1 year ago

Hi Karl,

Many thanks for fixing this issue. It works with latin1 charset on the Icecast Edge mounts.

BR, Onur