Plug-in Does not like special characters

iamwally commented 2 years ago

Special characters do not play well with the plug-in. If the title or artist fields of the MP3 tag contain special characters, you will not get any RDS output.

ShadowLight8 commented 2 years ago

While I'm not 100% happy with this fix, it does get things to more or less work if special characters (unicode) are present. The limitation is that non-ASCII characters will show up as ? in the RDS data. Going to close this for now, but feel free to reopen if a better solution is critically needed or it's still broken.

ShadowLight8 commented 2 years ago

Switched to excluding the special characters. Replacement led to different, wrong characters appearing on RDS displays.
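The two approaches mentioned so far (replacing with ? versus excluding) can be illustrated with Python's standard codec error handlers. This is only a sketch in Python 3, not the plugin's actual code; the function names are hypothetical.

```python
def ascii_replace(text):
    """Swap every non-ASCII character for '?' (the first approach)."""
    return text.encode("ascii", errors="replace").decode("ascii")


def ascii_exclude(text):
    """Drop every non-ASCII character entirely (the second approach)."""
    return text.encode("ascii", errors="ignore").decode("ascii")


title = "Michael Bubl\u00e9"   # "Michael Bublé"
print(ascii_replace(title))    # Michael Bubl?
print(ascii_exclude(title))    # Michael Bubl
```

Neither handler keeps the base letter, which is the gap the later suggestions in this thread try to close.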

ShadowLight8 commented 2 years ago

@iamwally Give this version a try and see if you get reasonable characters on RDS now.

ShadowLight8 commented 2 years ago

I'll revisit this more with FPP 6.0 as the default install of python is now python 3. Python 3 handles unicode strings differently than Python 2 did.

TomasRa1 commented 1 year ago

Please add the ICONV function to your code and apply it to the RDS string.

PHP simple example:

<?php
$return = iconv( 'UTF-8', 'ASCII//TRANSLIT', 'áéíóúüþëé');

Python equivalent source code:

import re, unicodedata
def asciize(str):
    str = unicode(str, 'UTF-8')
    str = unicodedata.normalize('NFKD', str.lower()).encode('ascii', 'ignore')
    str = re.sub('[^a-z0-9.]', ' ', str)
    str = re.sub('\s\s+', ' ', str)
    return str

Online test of ICONV function (in PHP): https://onlinephp.io/iconv

Thank you very much for the update. Best regards, Tomas Ransdorf, Czech Republic

ShadowLight8 commented 1 year ago

Thanks! I'll work on getting this added in

TomasRa1 commented 1 year ago

I am sorry, I copied the previous code for converting characters to ASCII from a website without testing it. I have tested it now and unfortunately it does not work. It seems to have worked under Python 2 but not under Python 3. Please accept my apology. I have now written new code myself, a Python function "AsciiConvert()", and tested it successfully. It is localized for Central Europe; you can change the characters in the dictionary ('replace') as needed for the region where it will be used.

def AsciiConvert(text):
  replace = {'ě':'e','š':'s','č':'c','ř':'r','ž':'z','ý':'y','á':'a','í':'i','é':'e','ú':'u','ů':'u','Š':'S','Č':'C','Ř':'R','Ž':'Z','Ě':'E','Ý':'Y','Á':'A','É':'E','Ú':'U','Ů':'U','ň':'n','Ň':'N','Ä':'A','ä':'a','Ö':'O','ö':'o','Ü':'U','ü':'u','Ë':'E','È':'E','è':'e','ê':'e','ù':'u','ñ':'n','ì':'i','ç':'c','ô':'o','ś':'s','ź':'z'}
  replaced_chars = [replace.get(char, char) for char in text]
  output = ''.join(replaced_chars)
  return output

You can call this function to test it:

input_txt = 'ěščřýáíéé=´)ůůúú'
output_txt = AsciiConvert(input_txt)
print("Converted text:", output_txt)
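The same per-character lookup can also be written with Python's built-in `str.maketrans` and `str.translate`. The sketch below assumes Python 3 and uses a deliberately shortened, hypothetical table; it is an alternative spelling of the idea above, not code from the plugin.

```python
# Hypothetical, abbreviated mapping; extend it for the target region.
REPLACEMENTS = str.maketrans({
    "ě": "e", "š": "s", "č": "c", "ř": "r", "ž": "z",
    "ý": "y", "á": "a", "í": "i", "é": "e", "ů": "u",
})


def ascii_convert_translate(text):
    """Replace each mapped character; unmapped characters pass through unchanged."""
    return text.translate(REPLACEMENTS)


print("Converted text:", ascii_convert_translate("ěščř"))
```

As with the dictionary version, any character missing from the table is left as it is, so the table has to be maintained by hand.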

Otherwise, I am continuing to work on your project by:

increasing the reliability of operation when an I/O error occurs on the I2C bus (unfortunately due to the RPi I2C bug)
changing the PWM output to pin 11 (BCM17) so that a HiFiBerry DAC or the on-board audio output can be used
and of course adapting this ASCII conversion to the RDS strings

THANK YOU VERY MUCH FOR YOUR GREAT AND VERY NICE WORK, for your time and effort, which you put to this project!

ShadowLight8 commented 1 year ago

@TomasRa1 This is great. I'll try to get this pulled into the project soon. I'd love to know more about the other changes you've made. Do you have a public fork or repo, or would you be willing to open a pull request here?

TomasRa1 commented 1 year ago

Hello, my changes are still in development and testing. PWM2 on pin 11 now sets the duty cycle automatically from the web value without needing a restart (and logs the duty cycle value that was set). This is fully working and tested. For I2C writes and reads, I added a TX parameter update when a write/read fails, and a TX restart when write/read communication stops completely (so that transmission tries to continue in all cases instead of the software stopping). This has also been tested successfully. I now plan to test a HiFiBerry DAC instead of an external USB DAC. The flat cable between the RPi and the TX module needs a ferrite around it for more reliable I2C communication; I would also like to test shielded cable for the CLK and SDA wires.

I found that it is not easy for me to find the best place in the Python code to change the RDS strings with my AsciiConvert function. Maybe you can solve this better and much faster, because it is long reverse engineering for me :-). When I have finished all the tests, I would like to share the result (all updated files) with others. Please advise me how to do that, as I do not know yet. I can send the updated files to your email if you are interested. Regards, Tomas

ShadowLight8 commented 1 year ago

Looking into this, it seems one issue might be in FPP. When it gets tags from the media files, it doesn't preserve unicode values. I've made a small change that seems to resolve that. Now I'm working on the plugin side to normalize the unicode to something that RDS can use.

TomasRa1 commented 1 year ago

A radio RDS decoder can display only the 128 basic ASCII characters. FPP (and Python) natively support a very large number of characters as UTF-8 (and not Unicode). In any case, for tag characters to display correctly on a basic RDS receiver, all characters must be encoded as basic 128-character ASCII before transmission via RDS. If any non-ASCII code appears in the RDS transmission, that character is not displayed correctly on the receiver. We must therefore check every character before RDS transmission and replace any character outside the 128 basic ASCII characters with a basic ASCII character, so that it is displayed correctly on the receiver.
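The check described above can be sketched in a few lines of Python 3. It assumes, as the comment does, that only the 128 basic ASCII characters are safe to transmit; the function names are hypothetical.

```python
def non_ascii_chars(text):
    """Return the characters in text that fall outside 7-bit ASCII."""
    return [ch for ch in text if ord(ch) > 127]


def is_rds_safe(text):
    """True when every character is basic ASCII (str.isascii needs Python 3.7+)."""
    return text.isascii()


tag = "Michael Bubl\u00e9"   # "Michael Bublé"
print(is_rds_safe(tag), non_ascii_chars(tag))
```

A check like this only finds the offending characters; what to replace them with is a separate decision.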

ShadowLight8 commented 1 year ago

@TomasRa1 I think I got it working by using a few tweaks on your original suggestion (unicodedata.normalize was the key). I need to do some more testing before submitting the changes needed in FPP 7.2 directly, but here are the steps to get it working:

SSH in and make this change to /opt/fpp/src/mediadetails.cpp: change lines 103-104 to

    title = tag->title().toCString(true);
    artist = tag->artist().toCString(true);

Adding true tells toCString to keep Unicode characters as is. Then, from /opt/fpp/src, run make. That will rebuild FPP with the change; then just restart FPPD from the web interface.

The changes needed to callbacks.py are in https://github.com/ShadowLight8/Dynamic_RDS/pull/18/commits/961e49d02c042b30bef9f5a58afc3e11e8210cd7

I created a few test mp3 and ogg files with tags like:
(Title) ěščřžýáíéúůŠČŘŽĚÝÁÉÚŮňŇÄäÖöÜüËÈèêùñìçôśź
(Artist) 1À 2Á 3Â 4Ã 5Ä 6Å 7Ā 8Ă 9Ą 10Ǎ 11Ǟ 12Ǡ 13Ȁ 14Ȃ

Which get converted to:
(Title) escrzyaieuuSCRZEYAEUUnNAaOoUuEEeeunicosz
(Artist) 1A 2A 3A 4A 5A 6A 7A 8A 9A 10A 11A 12A 13A 14A
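For readers who want to try the idea without opening the commit, here is a minimal Python 3 sketch of normalizing with `unicodedata.normalize` and then dropping whatever is not ASCII. It is an illustration of the technique, not necessarily the exact code in the linked commit; the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Reduce accented letters to their ASCII base letters."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with errors="ignore" then drops the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_diacritics("Michael Bubl\u00e9"))  # Michael Buble
```

Characters that have no decomposition into an ASCII base letter are simply dropped by this approach.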

TomasRa1 commented 1 year ago

You are brilliant for getting the unicode library to run; it is the best solution. Thank you very much! I will also test your ingenious solution over the next few days. I will confirm the result soon, I hope by the end of this week.

ShadowLight8 commented 1 year ago

Created https://github.com/FalconChristmas/fpp/pull/1623 for the FPP change

TomasRa1 commented 1 year ago

Hello Nick, thank you very much for your activities.

I first tried your new solution on FPP 6.3. All the updates you described were applied successfully. In the modified callbacks.py: when I comment out (#) the two new lines with "unicodedata.normalize", the Dynamic RDS plugin works; when I uncomment these two new lines (the spaces are correct), the Dynamic RDS plugin does not start. I then reimaged the SD card to FPP 7.2, unfortunately with the same result.

It seems that the import of the unicodedata library succeeds and the problem is in the line that uses it. Please tell me which logs I can check or send to you. If the plugin does not start, Dynamic_RDS_Engine.log is not created or updated. The problem could also be that something is configured differently between your FPP and mine. Best regards, Tomas

ShadowLight8 commented 1 year ago

Please post the last 50-100 lines of Dynamic_RDS_callbacks.log here. You can also try running callbacks.py from the SSH prompt to see if any errors pop up. Here's one of the things I run from the command line to test:

./callbacks.py --type media --data '{"Media":"Holly Jolly Christmas - Buble - Unicode Tag.ogg","Sequence":"","artist":"Michael Bubl\u00e9","bitrate":"431","channels":"2","length":"121","sampleRate":"44100","title":"Holly Jolly Christmas","track":"3","type":"media"}'

You might have to put sudo in front
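If typing the JSON by hand is error-prone, a test command like the one above can be generated with Python's standard library. This sketch uses only a reduced set of the fields shown above; whether callbacks.py needs the remaining fields has not been verified here, and the helper name is hypothetical.

```python
import json
import shlex


def build_media_command(title, artist, media):
    """Return a shell command line for a manual callbacks.py test."""
    payload = {
        "Media": media,
        "Sequence": "",
        "artist": artist,
        "title": title,
        "type": "media",
    }
    # json.dumps escapes non-ASCII characters as \uXXXX by default.
    data = json.dumps(payload)
    # shlex.quote wraps the JSON so the shell passes it as one argument.
    return "./callbacks.py --type media --data " + shlex.quote(data)


print(build_media_command("Holly Jolly Christmas", "Michael Bubl\u00e9", "test.ogg"))
```

The printed line can be pasted at the SSH prompt, with sudo in front if needed.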

ShadowLight8 commented 1 year ago

The unicode changes to FPP were merged into https://github.com/FalconChristmas/fpp/tree/master, so once the next version of FPP is released, then I can update this plugin so you can just download it at that point 👍

TomasRa1 commented 1 year ago

Hello Nick, thank you very much for your kind help! Thanks to you I found my (stupid) classic Python problem: TabError: inconsistent use of tabs and spaces in indentation. At first I modified callbacks.py manually and used spaces, as I was used to from modifying Dynamic_RDS_Engine.py. Now I have replaced those spaces with 2 tabs before each of the two new lines with "unicodedata.normalize", and callbacks.py and the whole plugin run properly. I'm just not perfect...

When I run a test with sudo python callbacks.py --type media --data 'etc., the translation to ASCII characters in Dynamic_RDS_callbacks.log works perfectly. But in Dynamic_RDS_Engine.log, "Updated PS and RT Fragments" are still reported (correctly) in the original UTF-8 encoding (with diacritical marks), and unfortunately these strings are also sent to the QN8066 without ASCII translation.

I do not understand where the problem could be, because the string flow should be: FPP -> callbacks.py / media_xxxx -> fifo.write -> Dynamic_RDS_Engine.py / rdsValues = {'{T}': '', '{A}': '', '{N}': '', '{L}': '', '{C}': ''}

It is strange to me that the line at the end of callbacks.py is fifo.write('T' + media_title + '\n'), because in Dynamic_RDS_Engine.py 'T' effectively ends up as 'media_title': when I configure the RT Style Text on the Dynamic_RDS.php page as "xyz {T}", the transmitter sends RT: xyz 'media_title'. But it seems that this is not the 'media_title' from the end of callbacks.py. Please accept my apology if my ideas are not correct :-)

Best regards, Tomas

I forgot to mention that I tested the Dynamic RDS plugin with the After Hours Music plugin connected to an internet stream.
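For anyone who hits the same TabError after hand-editing a file, a rough check like the following can show which lines are indented with tabs and which with spaces. It is only a sketch with a hypothetical function name; Python also ships a `tabnanny` module (`python3 -m tabnanny callbacks.py`) for the same purpose.

```python
def indentation_report(source):
    """Group 1-based line numbers by the kind of leading whitespace they use."""
    report = {"tabs": [], "spaces": [], "mixed": []}
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        indent = line[: len(line) - len(stripped)]
        if not indent or not stripped:
            continue  # skip unindented and blank lines
        if set(indent) == {"\t"}:
            report["tabs"].append(number)
        elif set(indent) == {" "}:
            report["spaces"].append(number)
        else:
            report["mixed"].append(number)
    return report


sample = "def f():\n\tx = 1\n        y = 2\n\treturn x\n"
print(indentation_report(sample))  # {'tabs': [2, 4], 'spaces': [3], 'mixed': []}
```

If both the tabs and spaces lists are non-empty, newly added lines probably need to be re-indented to match the rest of the file.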

ShadowLight8 commented 1 year ago

Now that my Halloween show is done, I'll set up my FM RDS transmitter and see what the Engine is doing.

TomasRa1 commented 1 year ago

Please, check the Dynamic_RDS_Engine.log

In Dynamic_RDS_Engine.log I see tags with diacritical marks (original UTF-8 encoding = original encoding of tags) and not ASCII corrected tags, which I see in Dynamic_RDS_callbacks.log

The tags that I see in Dynamic_RDS_Engine.log (unfortunately including non-ASCII characters) are also physically transmitted on my setup.

Thank you for running the same test. If your result is the same, please try to fix it.

Thank you for everything.

ShadowLight8 commented 1 year ago

Ahh - After Hours Music plugin provides the title directly in the Engine, so it needed a similar update to what was done in callbacks.py.

Take a look at the few changes in https://github.com/ShadowLight8/Dynamic_RDS/commit/8a552f6887e836ed3c1ac56ea447ca5d1ef790f2

ShadowLight8 commented 1 year ago

During testing with a German station, I saw "Weiße". While the Eszett, "ß", could be changed to "ss", this is a different problem to solve. I'm not sure how I would build up a list of substitutions OR if the same symbol could always be replaced the same way across other languages.

For purposes of this issue (at least for now), I'm going to focus on the letters with diacritical marks since they can be normalized to be included in the RDS output.

TomasRa1 commented 1 year ago

Hello Nick, it is wonderful that you found a solution for the MPC strings, thank you! In German, "ß" is an accented "s". From my point of view, translating it to "ss" is correct. But if you would like to translate it to a single "s", you can try the "re.sub" function:


import re

# Collapse a doubled "ss" into a single "s"
text = 'ssysstem'
text = re.sub('ss', 's', text)
print(text)

# or replace the original character directly:
text = 'ßyßtem'
text = re.sub('ß', 's', text)
print(text)
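One way to combine an explicit substitution such as ß -> ss with the diacritic normalization discussed earlier is to apply a small lookup table first and normalize afterwards. The table below is hypothetical and deliberately tiny; as noted above, it is not clear that one substitution suits every language.

```python
import unicodedata

# Hypothetical, deliberately tiny table; other languages may need other choices.
SUBSTITUTIONS = {"\u00df": "ss"}  # ß


def to_ascii_with_substitutions(text):
    """Apply explicit substitutions, then reduce accented letters to ASCII."""
    for original, replacement in SUBSTITUTIONS.items():
        text = text.replace(original, replacement)
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_with_substitutions("Wei\u00dfe"))  # Weisse
```

The substitution has to run before the ASCII step, because ß has no decomposition and would otherwise be dropped.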

ShadowLight8 commented 1 year ago

After doing some Internet Research, I think I'll hold off on other substitutions, like the Eszett, for now. I've started looking at https://pypi.org/project/Unidecode/ as an option. They also call out some challenges with these types of swaps.

I've done some refactoring and have the unicode normalization in one place now, just before the text gets processed into chunks for being sent as RDS. This removed the changes from callbacks.py. Take a look at https://github.com/ShadowLight8/Dynamic_RDS/commit/7e4794acbf3a56cbd3ae66d2b1c4ac5775b369ec

I'm going to do some more testing over the weekend.
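The benefit of normalizing in a single place, right before chunking, is that every source of text (callbacks.py or another plugin) goes through the same step. The sketch below illustrates that ordering only; the fixed-width split and the function names are assumptions, and the plugin's real chunking logic is in the linked commit.

```python
import unicodedata

PS_WIDTH = 8  # an RDS Program Service name shows 8 characters at a time


def to_rds_ascii(text):
    """Reduce accented letters to ASCII and drop anything else non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def ps_fragments(text, width=PS_WIDTH):
    """Normalize first, then cut into fixed-width, space-padded fragments."""
    clean = to_rds_ascii(text)
    return [clean[i:i + width].ljust(width) for i in range(0, len(clean), width)]


print(ps_fragments("Michael Bubl\u00e9"))  # ['Michael ', 'Buble   ']
```

Normalizing before the split also keeps fragment lengths predictable, since dropped characters can no longer shorten a chunk after it has been cut.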

ShadowLight8 commented 1 year ago

FPP 7.3 came out a few days ago. I've merged these and a bunch of other changes into main, so you should be able to update to FPP 7.3, then update Dynamic_RDS and be good to go. I'm going to close this Issue, but feel free to comment here or on others or open new issues.
