P1sec / pycrate

A Python library to ease the development of encoders and decoders for various protocols and file formats; contains ASN.1 and CSN.1 compilers.
GNU Lesser General Public License v2.1

Use utf-8 encoding for ASN.1 files #207

Closed mbrehler closed 2 years ago

mbrehler commented 2 years ago

Set the file encoding explicitly to UTF-8 when reading ASN.1 files, to fix compile failures on Windows. See https://docs.python.org/3/using/windows.html#utf-8-mode : Windows uses a legacy encoding by default, so UTF-8 encoded files can contain bytes that it cannot decode.

Change-Id: I097ce3d6836f0ec4fcfe2a37082ecc675f23b0e3
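A minimal sketch of the kind of change described above, not the actual patch: `read_spec` is a hypothetical helper name, and the sample file content is made up for the demo. It relies only on `io.open`, which accepts an `encoding` argument on both Python 2 and Python 3, so the file is decoded as UTF-8 whatever the platform's default encoding is.

```python
import io
import os
import tempfile


def read_spec(path):
    """Read an ASN.1 source file as text, always decoding it as UTF-8.

    Hypothetical helper for illustration; not part of pycrate.
    """
    # Without an explicit encoding, Python 3's open() falls back to the
    # locale encoding, which is a legacy code page such as cp1252 on Windows.
    with io.open(path, encoding='utf-8') as fd:
        return fd.read()


# Demo: write a small UTF-8 file containing a non-ASCII character,
# then read it back through the helper.
sample = u'-- comment with a curly quote \u201d\nFoo ::= INTEGER\n'
handle, tmp_path = tempfile.mkstemp(suffix='.asn')
os.close(handle)
with io.open(tmp_path, 'w', encoding='utf-8') as out:
    out.write(sample)

text = read_spec(tmp_path)
os.remove(tmp_path)
```

Passing the encoding at every call site keeps the behaviour independent of environment settings such as Python's UTF-8 mode.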

mitshell commented 2 years ago

If we want a compile pass that generates the same source as the current one, we need to use Python 2: python2 -m pycrate_asn1c.asnproc should do the job. As pycrate has been made to support both Python 2 and 3, the ASN.1 modules require all their strings to be explicit unicode ones for proper Python 2 support (Python 3 does not need this, as all strings are unicode by default and byte strings require their own explicit notation). I know it is more and more questionable to keep supporting Python 2 in 2022, but I guess I am quite conservative here! I can pull this PR and regenerate the ASN.1 modules on my side before merging, if you want.
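To illustrate the string-literal point, here is a small sketch of how the three literal forms behave. The module name used as the string value is taken from the log in this thread; the variable names are made up for the example, and this is not code from the generated modules.

```python
import sys

# u'...' is a text (unicode) string on both Python 2 and Python 3.
explicit_text = u'PKIX1Explicit88'
# A bare '...' is a byte string on Python 2 but a text string on Python 3.
plain_literal = 'PKIX1Explicit88'
# b'...' is a byte string on both versions.
byte_literal = b'PKIX1Explicit88'

# type(u'') is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u'')
explicit_is_text = isinstance(explicit_text, text_type)  # always True
plain_is_text = isinstance(plain_literal, text_type)     # True only on Python 3
```

This is why source generated with explicit `u'...'` literals behaves the same under both interpreters, while bare literals change type between them.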

mbrehler commented 2 years ago

Sorry, I didn't mean to include 03f272e (the files changed on Windows) in the pull request. That the compiler's output may differ across OSes and Python versions is a problem for another day (I suspect that from 3.6 on it's all OK due to the dict ordering changes, but that's just a hypothesis).

I meant to only ask for 679fe24, which at least makes the files you ship compile on Windows (the detailed problem on mainline is shown below). I'll see that I modify the pull request to reflect this.

As for Python 2.x: I think you should drop it. Support stopped in 2020 (cf. https://www.python.org/doc/sunset-python-2/), so it is time to move on, but clearly others may see this differently. I'd expect all of the above to hold for 2.7 but haven't verified.

It would also be good to add Python 3.10 (that's what I use on Windows) to the test environment.

Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycrate.pycrate_asn1c.asnproc
>>> pycrate.pycrate_asn1c.asnproc.generate_all()
[...skip lines...]
[proc] ASN.1 modules processed: ['PKIX1Explicit88', 'PKIX1Implicit88', 'PKIXAttributeCertificate', 'AttributeCertificateVersion1', 'CryptographicMessageSyntax2004', 'CMS-AuthEnvelopedData-2007']
[proc] ASN.1 objects compiled: 219 types, 0 sets, 152 values
[proc] done
[GEN] CAP
Traceback (most recent call last):
  File "C:\Users\mbrehler\Data\work\5G\pycrate-issues\pycrate\pycrate_asn1c\asnproc.py", line 131, in get_spec_files
    spec_texts.append( fd.read() )
  File "c:\Program Files\Python\3.10.4\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 12308: character maps to <undefined>
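The failure in the traceback can be reproduced without pycrate. The sketch below assumes the offending character is a right double quotation mark (U+201D), whose UTF-8 encoding ends in byte 0x9d; the thread does not say which character the file actually contains, so this is only one possible candidate. Python's cp1252 codec has no mapping for byte 0x9d, while the UTF-8 codec decodes the same bytes without trouble.

```python
# Assumed example character: U+201D encodes to b'\xe2\x80\x9d' in UTF-8.
raw = u'\u201d'.encode('utf-8')

# Decoding as cp1252 (a common default on Windows) fails on byte 0x9d.
try:
    raw.decode('cp1252')
    cp1252_error = None
except UnicodeDecodeError as err:
    cp1252_error = err

# Decoding the same bytes as UTF-8 succeeds.
decoded = raw.decode('utf-8')
```

The byte offset reported in the exception (`cp1252_error.start`) points at the undecodable byte, which is how the position 12308 in the traceback above can be used to locate the character in the ASN.1 file.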