bodograumann / python-iconv

Python 3 wrapper for iconv and usage as codecs
GNU General Public License v3.0

ASCII//TRANSLIT not found with python 3.9 #4

Open bodograumann opened 3 years ago

bodograumann commented 3 years ago

In Python 3.9, the ASCII//TRANSLIT codec can no longer be loaded:

> python -m unittest
.....EE
======================================================================
ERROR: test_incremental_encode (test_iconvcodec.TestIconvcodecModule)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bodo/Libraries/python-iconv/test_iconvcodec.py", line 21, in test_incremental_encode
    encoder = codecs.getincrementalencoder("ASCII//TRANSLIT")()
  File "/usr/lib/python3.9/codecs.py", line 986, in getincrementalencoder
    encoder = lookup(encoding).incrementalencoder
LookupError: unknown encoding: ASCII//TRANSLIT

======================================================================
ERROR: test_transliterate (test_iconvcodec.TestIconvcodecModule)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bodo/Libraries/python-iconv/test_iconvcodec.py", line 17, in test_transliterate
    bytestring = string.encode("ASCII//TRANSLIT")
LookupError: unknown encoding: ASCII//TRANSLIT

----------------------------------------------------------------------
Ran 7 tests in 0.003s

FAILED (errors=2)

bodograumann commented 3 years ago

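This is caused by a change in Python's codec handling between versions 3.8 and 3.9: codec names are now normalized before the registered search functions see them, so ASCII//TRANSLIT is looked up as ascii_translit. Cf. https://bugs.python.org/issue37751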
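
To illustrate the normalization, here is a minimal sketch using only the standard library. It assumes that `encodings.normalize_encoding` applied to the lowercased name matches what `codecs.lookup` hands to search functions on Python 3.9+, which is how the change is described in the linked report.

```python
import encodings

name = "ASCII//TRANSLIT"

# Each run of non-alphanumeric characters (except '.') collapses into a single '_'.
# codecs.lookup() additionally lowercases the name, so we lowercase it first.
normalized = encodings.normalize_encoding(name.lower())

print(normalized)  # ascii_translit
```

A search function that still compares against the original spelling will therefore never match, which would explain the LookupError above.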

bodograumann commented 3 years ago

Submitted a new bug report: https://bugs.python.org/issue44723

bodograumann commented 2 years ago

Another related report: https://bugs.python.org/issue46508

Javrd commented 3 months ago

I use this workaround to get it working, in case anyone is interested:

import codecs
import iconvcodec
import encodings

# Workaround for https://github.com/bodograumann/python-iconv/issues/4:
# charset names are normalized before the search function is called.
def normalized_alias(charset):
    # Also works for other codecs whose names contain characters
    # that get normalized, such as '-' or ':'.
    codec = 'ASCII//TRANSLIT'
    if charset == encodings.normalize_encoding(codec.lower()):
        return iconvcodec.lookup(codec)
    return None

codecs.register(normalized_alias)
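
If more than one such codec is needed, the same idea could be generalized roughly as follows. This is only a sketch: `make_alias_search` is a hypothetical helper name, not part of python-iconv, and the `lookup` argument is assumed to behave like `iconvcodec.lookup` in the workaround above (returning a codec info object or None).

```python
import encodings


def make_alias_search(names, lookup):
    """Build a codec search function for names that Python 3.9+ normalizes.

    names:  original codec names, e.g. "ASCII//TRANSLIT"
    lookup: callable mapping an original name to a codec info object or None
            (assumed here to be iconvcodec.lookup)
    """
    # Map each normalized spelling back to the spelling iconv expects.
    aliases = {encodings.normalize_encoding(n.lower()): n for n in names}

    def search(charset):
        original = aliases.get(charset)
        # Returning None tells the codec registry to try the next search function.
        return lookup(original) if original is not None else None

    return search


# Usage, assuming iconvcodec is installed:
# import codecs
# import iconvcodec
# codecs.register(make_alias_search(["ASCII//TRANSLIT"], iconvcodec.lookup))
```

Passing `lookup` in as a parameter keeps the helper independent of iconvcodec, so it can be tried out with a stand-in function first.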