arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

feature request for adding custom replacement characters #150

Closed: kkaiser closed this issue 5 years ago

kkaiser commented 5 years ago

In stdnum.util._clean_chars you use a _char_map to replace Unicode characters. Could you make _char_map an optional argument of _clean_chars so that I can add more symbols? In my case I would like to map apostrophe-like characters:

    'APOSTROPHE,GRAVE ACCENT,ACUTE ACCENT,MODIFIER LETTER RIGHT HALF RING,'
    'MODIFIER LETTER LEFT HALF RING,MODIFIER LETTER PRIME,'
    'MODIFIER LETTER TURNED COMMA,MODIFIER LETTER APOSTROPHE,'
    'MODIFIER LETTER VERTICAL LINE,COMBINING GRAVE ACCENT,'
    'COMBINING ACUTE ACCENT,COMBINING TURNED COMMA ABOVE,'
    'COMBINING COMMA ABOVE,ARMENIAN APOSTROPHE,'
    'SINGLE HIGH-REVERSED-9 QUOTATION MARK,LEFT SINGLE QUOTATION MARK,'
    'RIGHT SINGLE QUOTATION MARK':
        "'",

If you think the mapping is useful you could also just add it yourself, but others who deal with this kind of data may want to add their own characters too.
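To make the request concrete, here is a minimal sketch of what an optional extra map could look like. It is not python-stdnum's code: `NAME_MAP`, `build_table` and `clean_chars` are hypothetical names, and the only assumption carried over from the snippet above is that keys are comma-separated Unicode character names and values are the replacement text.

```python
import unicodedata

# Hypothetical name-keyed map in the style quoted above: each key is a
# comma-separated list of Unicode character names, each value the replacement.
NAME_MAP = {
    'APOSTROPHE,GRAVE ACCENT,ACUTE ACCENT,MODIFIER LETTER PRIME,'
    'LEFT SINGLE QUOTATION MARK,RIGHT SINGLE QUOTATION MARK': "'",
    'HYPHEN-MINUS,EN DASH,EM DASH,MINUS SIGN': '-',
}


def build_table(name_map):
    """Expand a name-keyed map into a table usable with str.translate()."""
    table = {}
    for names, replacement in name_map.items():
        for name in names.split(','):
            # unicodedata.lookup() raises KeyError for an unknown name.
            table[ord(unicodedata.lookup(name))] = replacement
    return table


def clean_chars(text, extra_map=None):
    """Replace mapped characters; extra_map adds to or overrides NAME_MAP."""
    merged = dict(NAME_MAP)
    merged.update(extra_map or {})
    return text.translate(build_table(merged))
```

With this shape a caller could pass, say, `{'MIDDLE DOT': '.'}` as `extra_map` without touching the built-in map.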

arthurdejong commented 5 years ago

The _clean_chars() function is part of the internal API of the util module and is not really meant for outside use, except through the clean() function (which in turn is part of the internal API of python-stdnum).

There are other libraries that handle general Unicode-to-ASCII conversion, such as Unidecode.
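For reference, Unidecode exposes a `unidecode()` function for this. Where a third-party dependency is not an option, a rough standard-library stand-in is NFKD normalization followed by dropping whatever is still non-ASCII. The sketch below uses that swapped-in technique, not Unidecode itself, and `rough_ascii` is a hypothetical helper name. It strips accents and folds fullwidth digits, but it simply drops characters that have no decomposition, such as typographic apostrophes.

```python
import unicodedata


def rough_ascii(text):
    """Approximate ASCII folding using NFKD; not equivalent to Unidecode."""
    decomposed = unicodedata.normalize('NFKD', text)
    # Characters with no ASCII decomposition are silently dropped here.
    return decomposed.encode('ascii', 'ignore').decode('ascii')
```

This difference is why an explicit character map, or a dedicated transliteration library, is still needed for apostrophe-like characters.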

I did, however, add your list above in ad96b15.