PredatH0r / ChanSort

TV channel list editor for Samsung, LG, Sony, Hisense, Panasonic, Philips, Sharp, Toshiba and MANY more.

Polish characters not shown correct - Philips 43PFS5505/12 #300

Closed ksz16 closed 1 year ago

ksz16 commented 2 years ago

After editing the channel list from a cable TV provider (UPC) and importing it into the Philips 43PFS5505/12 TV set, the channel names with Polish characters are displayed incorrectly, as is the information on the currently playing channel and show. A second problem is that after some time (sometimes a week, sometimes a few weeks) the imported channel list returns to the operator's default order, while the incorrect display of Polish characters persists. This may be important information: I'm not using a decoder, but a CI+ module. The path to the file I edited: Clone\PhilipsChannelMaps\ChannelMap_100\ChannelList.bin. I tried versions 2021-10-24, 2022-04-12 and 2022-04-19.

I would appreciate it if you could take a look at the issue. I've been using ChanSort for several years successfully (on Samsung TVs, for example). It is really great software that saves a ton of time. It would be great if I could also use it correctly on the Philips 43PFS5505. I am not sure what kind of additional information is needed, but of course I am able to provide it.

Edit: forgot to add - I use a reference list in .txt format (EOL conversion Windows CR LF, encoding UTF-8-BOM).

PredatH0r commented 2 years ago

I have heard a similar problem from another Polish user who owns a Philips 65OLED856 (channel list version 115). The solution he found was to set the country to "Czech Republic" instead of "Poland", which then enables all the Polish characters too.

This seems to be a bug in the TV's firmware, which does not use the correct default code page when a name in the DVB data stream does not contain an explicit specifier for the character encoding and when the TV's country is set to Poland. ChanSort does not modify the "ChannelName" data in the file, unless the name was explicitly changed in the UI, so this is clearly a firmware issue. UPC could also fix this by adding encoding specifiers to the DVB channel names.

I could try to add such a specifier to the channel names when writing the file, but I have no Philips TV to test this with, so I can't tell whether that fixes the problem. If you are willing to test it, I can prepare a special ChanSort version for you.

EDIT: I just reviewed the conversation and files I received about the broken Polish characters. The TV already exported bad data, so there is nothing I can do on the ChanSort end to reconstruct invalid names.

ksz16 commented 2 years ago

Thank you very much for your response and interest in the issue. I forgot to add that after exporting the data from the TV, when I try to import the list into ChanSort, I get an error message (but I have the option to ignore it and go on to edit). I thought it had something to do with the TV's firmware, so I updated it to the latest version. Unfortunately, the message kept appearing. What causes the TV to export bad data? Is it related to the TV's firmware, or does it have more to do with the specific settings of my provider, UPC?

PredatH0r commented 2 years ago

I do not have any insight into the code of the Philips firmware, so I can only make deductions based on what I hear from users and the data I see in the exported channel lists, and I might be wrong about some of them. I also don't know the raw channel name data that UPC is transmitting in the DVB data stream. It is possible that UPC sends bad raw DVB data for channel names, or that UPC does not send encoding information and the TV makes a wrong guess, or that they do send character encoding information and the TV converts it incorrectly. It's also unclear whether the characters already get garbled when the TV receives the channel names as part of the DVB data, or later on when it exports the names to USB. On top of that, some TVs (not sure about Philips) use multiple data sources for channel names, like the DVB data or HTTP data for EPG information.

On pages 101-103 of https://www.etsi.org/deliver/etsi_en/300400_300499/300468/01.11.01_60/en_300468v011101p.pdf you can see possible control sequences inside DVB channel names, including the encoding specifier at the beginning of the name.
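To make the idea of an encoding specifier concrete, here is a minimal Python sketch of how a reader of raw DVB names might inspect the first byte. The byte values in the table are my reading of Annex A of that document and should be checked against the spec before relying on them; the function name `split_dvb_name` is hypothetical and is not part of ChanSort.

```python
# Assumption: selector bytes as I understand Annex A of ETSI EN 300 468.
# Verify against the specification; this is an illustration only.
SELECTOR_TABLE = {
    0x01: "iso8859_5", 0x02: "iso8859_6", 0x03: "iso8859_7",
    0x04: "iso8859_8", 0x05: "iso8859_9", 0x06: "iso8859_10",
    0x07: "iso8859_11", 0x09: "iso8859_13", 0x0A: "iso8859_14",
    0x0B: "iso8859_15", 0x11: "utf_16_be", 0x15: "utf_8",
}


def split_dvb_name(raw: bytes) -> tuple[str, bytes]:
    """Return (encoding label, remaining text bytes) for a raw DVB name."""
    if not raw:
        return ("empty", b"")
    first = raw[0]
    if first >= 0x20:
        # No specifier at all: the receiver has to fall back to the default
        # table, which is based on ISO 6937 (Python has no built-in codec).
        return ("iso6937 (default table)", raw)
    if first == 0x10 and len(raw) >= 3:
        # Three-byte form: 0x10 0x00 0xNN selects ISO 8859-NN.
        return (f"iso8859_{raw[2]}", raw[3:])
    if first in SELECTOR_TABLE:
        return (SELECTOR_TABLE[first], raw[1:])
    return ("reserved/unknown", raw[1:])


# A made-up name that carries an explicit ISO 8859-2 specifier.
encoding, payload = split_dvb_name(b"\x10\x00\x02Zam\xf3w")
print(encoding, payload.decode(encoding))
```

The interesting branch for this issue is the first one: a name without any specifier leaves the choice of code page to the TV.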

Some TVs export the raw DVB names as described in that document. Other TVs export already decoded and possibly re-encoded names as UTF-8, UTF-16 or, as in the case of Philips, as hex-encoded UTF-16, and unfortunately in some cases the exported data is incorrect.

An example from a Polish UPC Philips .xml channel list is this: ChannelName="0x5A 0x00 0x61 0x00 0x6D 0x00 0xC2 0x00 0x6F 0x00 0x77 0x00 0x20 0x00 0x75 0x00 0x73 0x00 0x59 0x01 0x75 0x00 0x67 0x00 0x69 0x00 0x21 0x00 … where the 0xC2 0x00 in the middle of the name is the Unicode value 194 for "Â" and 0x59 0x01 is Unicode 345 for "ř". The whole name decodes to "ZamÂow usřugi!", but the correct name is "Zamów usługi!", which I found out by looking up that channel's service ID (SID) on the internet.
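For anyone who wants to inspect such names themselves, the following Python sketch decodes the hex-encoded UTF-16 (little-endian) bytes quoted above. It only uses the bytes shown here; the real attribute continues after the "…". The function name `decode_philips_channel_name` is made up for this illustration and is not ChanSort's actual code.

```python
def decode_philips_channel_name(attr: str) -> str:
    """Decode a space-separated list of 0xNN byte tokens as UTF-16 LE."""
    raw = bytes(int(token, 16) for token in attr.split())
    return raw.decode("utf-16-le")


# Only the part of the ChannelName attribute quoted in this comment.
sample = (
    "0x5A 0x00 0x61 0x00 0x6D 0x00 0xC2 0x00 0x6F 0x00 0x77 0x00 0x20 0x00 "
    "0x75 0x00 0x73 0x00 0x59 0x01 0x75 0x00 0x67 0x00 0x69 0x00 0x21 0x00"
)

name = decode_philips_channel_name(sample)
# Show each character with its code point to spot the wrong ones.
for ch in name:
    print(f"U+{ord(ch):04X} {ch!r}")
```

Listing the code points this way makes U+00C2 and U+0159 stand out among the otherwise plain ASCII letters.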

I did find the same garbled characters on this page: https://forum.mediaswiat.pl/viewtopic.php?p=127311 Unfortunately I can't read that thread. Someone in the beginning mentions LG, so I am wondering if this is a general UPC Poland issue affecting all TV brands, or if it is something Philips specific.

Anyhow... I was told by another Polish Philips owner that setting the TV to country/language "Czech" and then performing a channel search resulted in correct Polish characters in the channel names for him. The rest is unfortunately a lot of guesswork.
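One guess that at least fits the "Zamów usługi!" example: the name may be transmitted in the DVB default table (based on ISO 6937, where an accent byte precedes the base letter) and then be read as if it were ISO 8859-2. The Python sketch below tests that hypothesis on the one garbled name. Python has no ISO 6937 codec, so the two mappings (0xC2 as a non-spacing acute accent, 0xF8 as "ł") are hard-coded assumptions covering only this string, and a single matching name does not prove what the firmware actually does.

```python
import unicodedata

# Assumption: tiny hand-written subset of ISO 6937, just enough for this name.
ISO6937_ACCENTS = {0xC2: "\u0301"}   # non-spacing acute, sent BEFORE the letter
ISO6937_LETTERS = {0xF8: "\u0142"}   # "ł"


def decode_iso6937_subset(raw: bytes) -> str:
    """Decode ASCII plus the two ISO 6937 bytes needed for this example."""
    out = []
    pending_accent = ""
    for byte in raw:
        if byte in ISO6937_ACCENTS:
            pending_accent = ISO6937_ACCENTS[byte]
        elif byte in ISO6937_LETTERS:
            out.append(ISO6937_LETTERS[byte])
            pending_accent = ""
        elif byte < 0x80:
            # Compose base letter and accent into one character (o + acute = ó).
            out.append(unicodedata.normalize("NFC", chr(byte) + pending_accent))
            pending_accent = ""
        else:
            raise ValueError(f"byte 0x{byte:02X} not covered by this sketch")
    return "".join(out)


garbled = "Zam\u00c2ow us\u0159ugi!"          # "ZamÂow usřugi!" as exported
raw_guess = garbled.encode("iso8859_2")       # undo the suspected wrong decoding
print(decode_iso6937_subset(raw_guess))
```

If this reading is right, the broken names would be recoverable in principle, but only under the assumption that every affected character went through the same wrong code page.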

ksz16 commented 2 years ago

This could be on the UPC side, because I also have another TV (Samsung) with a similar problem, and there the solution is to select the Czech Republic as the country. But this Samsung model is not officially supported for use with CI+, which you can read about in the UPC documentation. On the other hand, the Philips 43PFS5505 is marked as working correctly in UPC's list of supported models. Anyway, the initial setup of the TV itself (after restoring factory settings) works correctly: both channel names and descriptions / information are displayed correctly with Polish characters. The problem begins only after exporting the data to a USB stick, editing it with ChanSort and re-importing it into the TV set. I will try setting the country to the Czech Republic, because I admit that while I knew this method and used it with the Samsung, I have not yet tried it with the Philips. However, if this trick does not help, would you be so kind as to take a look at the data exported from the TV? I would archive and upload it if that could help, of course.

PredatH0r commented 2 years ago

Thanks for the feedback. ChanSort does not make any modifications to the channel names unless you explicitly change the name in ChanSort's UI. The code is written so that it leaves the original ChannelName="..." XML attribute untouched when writing the file.

You can try to export the list on your TV and immediately import it back, without even opening it in ChanSort. I assume that will also change the characters. It would also mean that there is nothing I can do in ChanSort to "fix" the broken exported data. (And I assume it breaks during the export, because the Unicode values in the exported file are incorrect.)

ksz16 commented 2 years ago

Thank you very much for all your advice and suggestions. I will do tests at my leisure (maybe at the weekend) and share the results.

ksz16 commented 1 year ago

Sorry for the delay in responding. I finally found some time to run tests. Below are their results.

  1. If you export the channel list and import it again without any editing, the characters are displayed incorrectly. So I confirm what you wrote previously - the problem is not related to ChanSort in any way.
  2. A workaround is to select Czech Republic as the country during the TV installation process. ChanSort displays an error message when importing the list (you can ignore it), but after editing and importing the channel list to the TV the characters are displayed correctly.

I think the issue can be closed. I don't know if it can be marked as solved, but a workaround for the problem exists (I hope this will help other Polish users whose provider is UPC). Besides, the problem is not related to ChanSort itself, but has to do with the TV's firmware or with the settings of UPC (I would rather bet on UPC, because a similar problem occurs on a Samsung TV with the same provider).

Thank you very much for all your help in solving the problem and for your work on this very useful application.

ksz16 commented 1 year ago

I have one more question. Why does this message appear when importing a list into ChanSort?

[Screenshot: ss_20220718_203854]

Surely the problem is not a bad USB stick (I tried two different ones, a SanDisk 16GB and a SanDisk 32GB, both correctly formatted using GParted with an msdos partition table and FAT32).

Also, what does the message "Disk /dev/mtdblock* doesn't contain a valid partition table" mean in the temp.log file that is created on the flash drive when exporting the channel list to USB? I attach the entire file.

[Attachment: temp.log]

PredatH0r commented 1 year ago

Checksums are more or less "short" numbers that are calculated over the actual (possibly large) data set and stored along with the data. This is typically done so that any corruption in the file can be detected, by calculating the checksum again after reading the data and comparing it with the stored checksum.
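As a generic illustration of that principle, the Python sketch below appends a CRC-32 to some data and verifies it again after reading. CRC-32 from Python's `zlib` is only a stand-in: I am not claiming it is the algorithm Philips uses, and the helper names and the 4-byte trailer layout are invented for this example.

```python
import zlib


def append_checksum(payload: bytes) -> bytes:
    """Store a 4-byte CRC-32 of the payload right after the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def verify_checksum(blob: bytes) -> bool:
    """Recalculate the CRC-32 and compare it with the stored value."""
    payload, stored = blob[:-4], int.from_bytes(blob[-4:], "little")
    return zlib.crc32(payload) == stored


blob = append_checksum(b"channel list data")
print(verify_checksum(blob))          # checksum matches the data

# Flip one bit in the payload to simulate corruption.
corrupted = bytes([blob[0] ^ 0x01]) + blob[1:]
print(verify_checksum(corrupted))     # mismatch is detected
```

A reader like ChanSort can only tell that the stored and recalculated values differ; it cannot tell whether the data or the stored checksum is the wrong one.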

Normally a bad checksum would indicate some sort of data corruption. In the case of Philips, it quite often happens that the TV seems to already store a wrong checksum when exporting a list. I can't tell if a wrong checksum is due to actual data corruption or because the TV decided to write a wrong checksum. Most Philips lists have correct checksums. But I have seen a couple with wrong ones. I can only guess that it is related to firmware updates without factory reset / new channel search afterwards. AFAIK after a new search the checksums will be correct again.

I don't know anything about that log file. The TV internally runs a Linux operating system, and that OS reported a possible issue it found. Maybe that message referred to your USB device or to some internal device of your TV.