Closed FizzyTea closed 1 year ago
Further investigation reveals some curious results.
Running the command tv_grab_uk_tvguide --list-channels > channels.xml
results in an XML file with the problematic characters correctly displayed, e.g.
<channel id="1305.tvguide.co.uk">
<display-name lang="en">RTÉ One</display-name>
<icon src="https://cdn.tvguide.co.uk/channel_logos/60x35/1305.png" />
<url>https://www.tvguide.co.uk/channellistings.asp?ch=1305</url>
</channel>
<channel id="1306.tvguide.co.uk">
<display-name lang="en">RTÉ One +1</display-name>
<icon src="https://cdn.tvguide.co.uk/channel_logos/60x35/1306.png" />
<url>https://www.tvguide.co.uk/channellistings.asp?ch=1306</url>
</channel>
<channel id="719.tvguide.co.uk">
<display-name lang="en">RTÉ One +1</display-name>
<icon src="https://cdn.tvguide.co.uk/channel_logos/60x35/719.png" />
<url>https://www.tvguide.co.uk/channellistings.asp?ch=719</url>
</channel>
<channel id="1307.tvguide.co.uk">
<display-name lang="en">RTÉ One HD</display-name>
<icon src="https://cdn.tvguide.co.uk/channel_logos/60x35/1307.png" />
<url>https://www.tvguide.co.uk/channellistings.asp?ch=1307</url>
</channel>
<channel id="900.tvguide.co.uk">
<display-name lang="en">RTÉ One HD</display-name>
<icon src="https://cdn.tvguide.co.uk/channel_logos/60x35/900.png" />
<url>https://www.tvguide.co.uk/channellistings.asp?ch=900</url>
</channel>
However, after removing the cache for the sake of caution and moving the previously generated config files aside, running tv_grab_uk_tvguide --configure
results in a configuration file with malformed channel names, e.g.
channel!1519 # RT� 2 +1
channel!716 # RT� Jr
channel!342 # RT� One
channel=1305 # RT� One
channel!1306 # RT� One +1
channel!719 # RT� One +1
channel=1355 # RT� One HD
channel=900 # RT� One HD
channel=1307 # RT� One HD
channel!1236 # RT� Radio 1 FM
channel!1237 # RT� Raidi� na Gaeltachta
channel!1235 # RT� lyric fm
channel=363 # RT�2
channel=718 # RT�2 HD
And of course upon running the grabber the results are similarly problematic e.g.
<channel id="1305.tvguide.co.uk">
<display-name lang="en">RT� One</display-name>
</channel>
<channel id="1355.tvguide.co.uk">
<display-name lang="en">RT� One HD</display-name>
</channel>
<channel id="342.tvguide.co.uk">
<display-name lang="en">RT� One</display-name>
</channel>
<channel id="363.tvguide.co.uk">
<display-name lang="en">RT�2</display-name>
</channel>
<channel id="718.tvguide.co.uk">
<display-name lang="en">RT�2 HD</display-name>
</channel>
<channel id="900.tvguide.co.uk">
<display-name lang="en">RT� One HD</display-name>
</channel>
I suspect the problem may well be at my end but I am at a loss to solve this issue so any help is much appreciated.
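One way to narrow this down is to check whether the generated file actually contains bytes that are not valid UTF-8, rather than relying on how the terminal renders them. The following is a minimal sketch, not part of XMLTV: the helper name invalid_utf8_lines and the sample data are made up for illustration, and it only uses the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: return the 1-based numbers of the lines whose
# bytes are not valid UTF-8.
sub invalid_utf8_lines {
    my ($bytes) = @_;
    my @bad;
    my $n = 0;
    for my $line (split /\n/, $bytes) {
        $n++;
        # FB_CROAK makes decode() die on malformed input;
        # LEAVE_SRC keeps $line unmodified.
        my $ok = eval { decode('UTF-8', $line, FB_CROAK | LEAVE_SRC); 1 };
        push @bad, $n unless $ok;
    }
    return @bad;
}

# Illustrative input: line 1 holds E-acute as UTF-8 (0xC3 0x89),
# line 2 holds it as a lone Latin-1 byte (0xC9).
my $sample = "<display-name>RT\xC3\x89 One</display-name>\n"
           . "<display-name>RT\xC9 One</display-name>\n";

print join(',', invalid_utf8_lines($sample)), "\n";
```

To check a real file, read it through the :raw layer so that Perl hands over the bytes untouched, then pass the contents to the helper.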
Thanks for the detailed report.
Before I commit a change to git do you want to check it out for me, please?
Find your tv_grab_uk_tvguide on your RPi and change line 711 from
$channels->{$channel_id} = { 'id'=> $xmlchannel_id , 'display-name' => [[$channelname, 'en']] };
to
$channels->{$channel_id} = { 'id'=> $xmlchannel_id , 'display-name' => [[ encode('utf-8', $channelname), 'en' ]] };
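For anyone wondering why wrapping the name in encode() could make a difference, here is a minimal sketch. It is not taken from the grabber: the variable names and the hard-coded channel name are illustrative, and it assumes the scraped name is held as a Perl character string. Under that assumption, one plausible explanation for the symptom is that É gets written as a single Latin-1 byte, which is not valid UTF-8 on its own.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical channel name held as a Perl character string;
# "\x{C9}" is U+00C9, LATIN CAPITAL LETTER E WITH ACUTE.
my $name = "RT\x{C9} One";

# As Latin-1, U+00C9 is the single byte 0xC9. A UTF-8 reader cannot
# decode that byte by itself and typically shows a replacement character.
my $latin1 = encode('iso-8859-1', $name);

# As UTF-8, the same character becomes the two bytes 0xC3 0x89.
my $utf8 = encode('utf-8', $name);

printf "characters:    %d\n", length $name;
printf "latin-1 bytes: %s\n", unpack('H*', $latin1);
printf "utf-8 bytes:   %s\n", unpack('H*', $utf8);
```

The two hex dumps differ only at the É: c9 in the first, c389 in the second.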
p.s. locale en_GB.UTF-8 should be fine
p.p.s. the channelname may display wrong in the config file, or not: it depends on your system. I guess I should fix that. I never tested this with Irish channels, since this is (notionally) a "UK" grabber ;-)
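On the config file point: a common way to make written text independent of the system locale is to put an explicit UTF-8 layer on the output filehandle. The sketch below only illustrates that general technique and is not how the grabber writes its config; the temporary file and the channel entry are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical channel name as a Perl character string (U+00C9 is E-acute).
my $name = "RT\x{C9} One";

# A temporary file stands in for the real config file.
my ($out, $conf_file) = tempfile(UNLINK => 1);

# The :encoding(UTF-8) layer encodes character strings as they are
# written, whatever the terminal or locale settings are.
binmode($out, ':encoding(UTF-8)');
print {$out} "channel=1305 # $name\n";
close($out) or die "Cannot close $conf_file: $!";

# Read the raw bytes back to see what actually landed on disk.
open(my $in, '<:raw', $conf_file) or die "Cannot read $conf_file: $!";
my $bytes = do { local $/; <$in> };
close($in);

print unpack('H*', $bytes), "\n";
```

With the layer in place the É is stored as the byte pair c3 89, so any UTF-8 aware editor or terminal should display it correctly.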
That change has fixed the described issues. Thanks very much.
p.s. the issue also affects one or two (at least one) BBC Wales channels.
p.p.s. I have some further validation issues. I do not think they are connected to this issue, though, so I should probably open a separate thread if I can't resolve them.
XMLTV Version?
XMLTV module version 1.1.2
XMLTV Component?
tv_grab_uk_tvguide version 1.1.2
Perl Version
Perl v5.28.1
Operating System
Raspbian 10 Buster
What happened?
Grabber appears to run successfully but the xml file does not validate. Irish channel names such as RTÉ One and RTÉ2 are not correctly displayed.
On running the grabber with --configure I notice a problem with such channel names. See attached screenshot. Upon inspecting the xml file from a seemingly successful grab I notice a similar problem, though the channel names are displayed differently to before. See attached screenshot.
What did you expect to happen?
I expect the channel names to be correctly displayed in the console and the xml.
Did you see any warnings/errors?
I get the following error running
tv_validate_file listings.xml
What steps are needed to reproduce this issue?
Any other information?
I wonder if this is related to the encoding system on my OS or in my console? I ssh into my Raspberry Pi from Ubuntu 20.04 (XFCE). Initially my rpi locale was set to en_GB.UTF-8. After encountering issues I changed my locale to en_IE.UTF-8 and went through the configuration process again (after removing the cached files from previous runs) but the problems remain.
…