srcML / pylibsrcml

Python bindings for libsrcml
GNU General Public License v3.0

Unable to change source encoding to UTF-8 via `srcml.set_src_encoding()` #6

Open z33kz33k opened 1 month ago

z33kz33k commented 1 month ago

Calling `srcml.set_src_encoding("UTF-8")` raises a `srcMLException`:

>>> srcml.set_src_encoding("UTF-8")
Traceback (most recent call last):
  File "C:\Users\{user}\AppData\Local\Programs\Python\Python37\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "C:\{some_path}\.temp\venv\lib\site-packages\pylibsrcml\globals.py", line 299, in set_src_encoding
    check_return(libsrcml.srcml_set_src_encoding(str.encode(encoding)))
  File "C:\{some_path}\.temp\venv\lib\site-packages\pylibsrcml\exception.py", line 19, in check_return
    raise srcMLException("Recieved invalid return status: " + str(value))
pylibsrcml.exception.srcMLException: Recieved invalid return status: None

Trying to do the same via the CLI client works correctly (and the resulting XML has all characters properly encoded):

PS C:\{some_path}\srcml\test4> srcml --src-encoding="UTF-8" .\SIL_Can.c -o .\SIL_Can.c.xml
PS C:\{some_path}\srcml\test4>

As per this issue, srcML's default source encoding is ISO-8859-1. So if my sources contain a comment with an author's name that uses non-ASCII characters, I can't really use pylibsrcml and would probably need to drive the CLI client with subprocess... :/
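For reference, the subprocess fallback mentioned above could look roughly like the sketch below. It only reuses the `--src-encoding` and `-o` options from the CLI call shown earlier; the helper names `build_srcml_command` and `run_srcml` are made up for illustration, and it assumes the `srcml` executable is on `PATH`.

```python
import subprocess
from pathlib import Path


def build_srcml_command(source, output, encoding="UTF-8"):
    """Build the argument list mirroring the CLI call from this issue.

    Hypothetical helper: only the two options used above are passed.
    """
    return [
        "srcml",
        f"--src-encoding={encoding}",
        str(source),
        "-o",
        str(output),
    ]


def run_srcml(source, output, encoding="UTF-8"):
    """Run srcml; raises CalledProcessError on a non-zero exit status."""
    cmd = build_srcml_command(source, output, encoding)
    # Assumes the srcml executable can be found on PATH.
    subprocess.run(cmd, check=True)
    return Path(output)
```

Something like `run_srcml("SIL_Can.c", "SIL_Can.c.xml")` would then stand in for the PowerShell call. Passing the arguments as a list avoids shell quoting problems on Windows.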

I'm on Windows 10, Python 3.7, srcML version:

srcml 1.0.0
libsrcml 1.0.0
libarchive 3.3.2

and pylibsrcml 1.0.0

z33kz33k commented 1 month ago

Well, I've just read #5 and thought I'd try to:

with contextlib.suppress(pylibsrcml.exception.srcMLException):
    srcml.set_src_encoding("UTF-8")

Lo and behold, it works. So I guess somebody was a bit overzealous with all those checks :)
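One caveat with a blanket `contextlib.suppress` is that it would also hide a genuine failure. A slightly narrower variant is sketched below: it swallows the exception only when the message ends with the `None` status seen in the traceback and re-raises anything else. The helper name `call_tolerating_none_status` is hypothetical, and the idea that real errors carry a different status in the message is an assumption based on the `str(value)` in the traceback, not something confirmed against the pylibsrcml source.

```python
def call_tolerating_none_status(setter, value, exc_type):
    """Call setter(value), ignoring exc_type only for the 'None' status case.

    Hypothetical helper: the message suffix is taken from the traceback in
    this issue; any other message is assumed to be a real error.
    """
    try:
        setter(value)
    except exc_type as exc:
        if not str(exc).endswith("invalid return status: None"):
            raise
```

It would be called as `call_tolerating_none_status(srcml.set_src_encoding, "UTF-8", pylibsrcml.exception.srcMLException)`.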