joshua-auchincloss / hatch-cython

cython hooks for hatch
MIT License

bug: UnicodeDecodeError on non-utf locales #56

Open Puiching-Memory opened 5 months ago

Puiching-Memory commented 5 months ago

The compiler I use produces output containing non-English characters, which fails to decode as UTF-8.

PS D:\GitHub\pylibde265> hatch build     
──────────────────────────── sdist ────────────────────────────
dist\pylibde265-0.0.1.tar.gz
──────────────────────────── wheel ────────────────────────────
[cython]
pre-build artifacts
attempted to use .pxd file without .py file (./src/pylibde265/decode.pxd)
Building c/c++ extensions...
['./src/pylibde265/__init__.py', './src/pylibde265/decode.pxd', './src/pylibde265/decode.pyx']
attempted to use .pxd file without .py file (./src/pylibde265/decode.pxd)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main      
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatchling\__main__.py", line 6, in <module>
    sys.exit(hatchling())
             ^^^^^^^^^^^
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatchling\cli\__init__.py", line 26, in hatchling
    command(**kwargs)
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatchling\cli\build\__init__.py", line 82, in build_impl
    for artifact in builder.build(
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatchling\builders\plugin\interface.py", line 147, in build
    build_hook.initialize(version, build_data)
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatch_cython\plugin.py", line 323, in initialize
    self.build_ext()
  File "C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\Lib\site-packages\hatch_cython\plugin.py", line 303, in build_ext
    stdout = process.stdout.decode("utf-8")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 2543: invalid continuation byte

This seems to happen when the compiler's output is captured and then decoded.

When I changed the code here, the problem was alleviated, although I haven't tested the change carefully. In hatch_cython/plugin.py, line 303: stdout = process.stdout.decode("utf-8") -> stdout = process.stdout.decode("gbk")
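Hard-coding "gbk" would only help on systems like this one, so a more tolerant decode may be worth considering. The following is a minimal sketch, not the plugin's actual code: `decode_output` and its `preferred` parameter are hypothetical names. It tries strict UTF-8 first, then any encodings the caller passes in, then the locale's preferred encoding, and finally falls back to a lossy decode so the build does not crash on log output.

```python
import locale


def decode_output(raw: bytes, preferred: tuple[str, ...] = ()) -> str:
    """Decode captured compiler output without raising UnicodeDecodeError.

    Hypothetical helper for illustration; not part of hatch-cython.
    """
    # Order: strict UTF-8, caller-supplied encodings (e.g. "gbk"),
    # then whatever the current locale reports as its preferred encoding.
    candidates = ["utf-8", *preferred, locale.getpreferredencoding(False)]
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # Last resort: never fail the build over log text; undecodable bytes
    # are replaced with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = "正在生成代码".encode("gbk")
    print(decode_output(sample, preferred=("gbk",)))
```

With something like this, line 303 could become `stdout = decode_output(process.stdout)`, assuming `process.stdout` is still raw bytes at that point.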

joshua-auchincloss commented 5 months ago

Hey @Puiching-Memory, thanks for filing the bug report! Quick question for reproducibility: what compiler does your system use? This looks like a Simplified Chinese (GBK) compatibility issue. As-is, I don't think we can swap the locale out globally, so there will have to be some additional logic to detect and use the system localization (low effort, targeting 0.6.0rc1 for the fix).
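As a rough idea of what that detection logic might look like (a sketch under assumptions, not the planned implementation), the function below collects encodings worth trying. `candidate_encodings` is a hypothetical name. The "oem" and "mbcs" codecs exist only on Windows builds of Python, so they are added only there, and anything the interpreter cannot look up is skipped.

```python
import codecs
import locale
import sys


def candidate_encodings() -> list[str]:
    """Return encodings worth trying for subprocess output, most likely first.

    Hypothetical helper for illustration; not part of hatch-cython.
    """
    names = ["utf-8", locale.getpreferredencoding(False)]
    if sys.platform == "win32":
        # Windows-only codecs: "oem" maps to the OEM code page that console
        # tools often write in, "mbcs" to the ANSI code page.
        names += ["oem", "mbcs"]

    seen: set[str] = set()
    result: list[str] = []
    for name in names:
        try:
            canonical = codecs.lookup(name).name
        except LookupError:
            continue  # this interpreter does not know the codec
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


if __name__ == "__main__":
    print(candidate_encodings())
```

Whether the compiler actually writes in any of these code pages is a separate question, so a lossy fallback is probably still needed after trying them.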

Puiching-Memory commented 5 months ago

Yes, I am using Windows in Simplified Chinese. When I decode with gbk instead, the build produces the output below, which contains some Chinese characters.

PS D:\GitHub\pylibde265> & C:/Users/1138/.conda/envs/265/python.exe d:/GitHub/pylibde265/tools_build.py
────────────────────────── sdist ───────────────────────────
dist\pylibde265-0.0.1.tar.gz
────────────────────────── wheel ───────────────────────────
[cython]
pre-build artifacts
attempted to use .pxd file without .py file (./src/pylibde265/decode.pxd)
Building c/c++ extensions...
['./src/pylibde265/decode.pyx', './src/pylibde265/decode.pxd', './src/pylibde265/__init__.py']
attempted to use .pxd file without .py file (./src/pylibde265/decode.pxd)
running build_ext
building 'pylibde265.decode' extension
creating C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release
creating C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\src
creating C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\src\pylibde265
"D:\Visual Studio\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -I./src/pylibde265 -IC:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\include -ID:\anaconda\include -ID:\anaconda\Include "-ID:\Visual Studio\VC\Tools\MSVC\14.40.33807\include" "-ID:\Visual Studio\VC\Tools\MSVC\14.40.33807\ATLMFC\include" "-ID:\Visual Studio\VC\Auxiliary\VS\include" "-ID:\Windows Kits\10\include\10.0.22621.0\ucrt" "-ID:\Windows Kits\10\\include\10.0.22621.0\\um" "-ID:\Windows Kits\10\\include\10.0.22621.0\\shared" "-ID:\Windows Kits\10\\include\10.0.22621.0\\winrt" "-ID:\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" /Tc./src/pylibde265/decode.c /FoC:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265/decode.obj -O2
decode.c
./src/pylibde265/decode.c(7952): warning C4244: “=”: 从“Py_ssize_t”转换到“long”,可能丢失数据
creating C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\build\pylibde265
"D:\Visual Studio\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:/GitHub/pylibde265/src/pylibde265/lib /LIBPATH:C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\libs /LIBPATH:D:\anaconda\libs /LIBPATH:D:\anaconda /LIBPATH:C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\PCbuild\amd64 "/LIBPATH:D:\Visual Studio\VC\Tools\MSVC\14.40.33807\ATLMFC\lib\x64" "/LIBPATH:D:\Visual Studio\VC\Tools\MSVC\14.40.33807\lib\x64" "/LIBPATH:D:\Windows Kits\10\lib\10.0.22621.0\ucrt\x64" "/LIBPATH:D:\Windows Kits\10\\lib\10.0.22621.0\\um\x64" de265.lib /EXPORT:PyInit_decode C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265/decode.obj /OUT:C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\build\pylibde265\decode.cp312-win_amd64.pyd /IMPLIB:C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\decode.cp312-win_amd64.lib
  正在创建库 C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\decode.cp312-win_amd64.lib 和对象 C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\decode.cp312-win_amd64.exp
正在生成代码
已完成代码的生成
building 'pylibde265.__init__' extension
"D:\Visual Studio\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\include -ID:\anaconda\include -ID:\anaconda\Include "-ID:\Visual Studio\VC\Tools\MSVC\14.40.33807\include" "-ID:\Visual Studio\VC\Tools\MSVC\14.40.33807\ATLMFC\include" "-ID:\Visual Studio\VC\Auxiliary\VS\include" "-ID:\Windows Kits\10\include\10.0.22621.0\ucrt" "-ID:\Windows Kits\10\\include\10.0.22621.0\\um" "-ID:\Windows Kits\10\\include\10.0.22621.0\\shared" "-ID:\Windows Kits\10\\include\10.0.22621.0\\winrt" "-ID:\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" /Tc./src/pylibde265/__init__.c /FoC:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265/__init__.obj -O2
__init__.c
"D:\Visual Studio\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:/GitHub/pylibde265/src/pylibde265/lib /LIBPATH:C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\libs /LIBPATH:D:\anaconda\libs /LIBPATH:D:\anaconda /LIBPATH:C:\Users\1138\AppData\Local\hatch\env\virtual\pylibde265\yLEma8IH\pylibde265-build\PCbuild\amd64 "/LIBPATH:D:\Visual Studio\VC\Tools\MSVC\14.40.33807\ATLMFC\lib\x64" "/LIBPATH:D:\Visual Studio\VC\Tools\MSVC\14.40.33807\lib\x64" "/LIBPATH:D:\Windows Kits\10\lib\10.0.22621.0\ucrt\x64" "/LIBPATH:D:\Windows Kits\10\\lib\10.0.22621.0\\um\x64" de265.lib /EXPORT:PyInit___init__ C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265/__init__.obj /OUT:C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\build\pylibde265\__init__.cp312-win_amd64.pyd /IMPLIB:C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\__init__.cp312-win_amd64.lib
  正在创建库 C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\__init__.cp312-win_amd64.lib 和对象 C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\tmp\Release\./src/pylibde265\__init__.cp312-win_amd64.exp
正在生成代码
已完成代码的生成
copying C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\build\pylibde265\decode.cp312-win_amd64.pyd -> src\pylibde265      
copying C:\Users\1138\AppData\Local\Temp\tmpeckmqedz\build\pylibde265\__init__.cp312-win_amd64.pyd -> src\pylibde265    

Post-build artifacts
['./src/pylibde265\\lib\\', './src/pylibde265\\lib\\de265.exp', './src/pylibde265\\lib\\de265.lib', './src/pylibde265\\lib\\libde265.dll', './src/pylibde265\\lib\\libde265.pdb', './src/pylibde265\\libde265\\', './src/pylibde265\\libde265\\acceleration.h', './src/pylibde265\\libde265\\alloc_pool.cc', './src/pylibde265\\libde265\\alloc_pool.h', './src/pylibde265\\libde265\\arm', './src/pylibde265\\libde265\\arm\\arm.cc', './src/pylibde265\\libde265\\arm\\arm.h', './src/pylibde265\\libde265\\arm\\asm.S', './src/pylibde265\\libde265\\arm\\cpudetect.S', './src/pylibde265\\libde265\\arm\\hevcdsp_qpel_neon.S', './src/pylibde265\\libde265\\arm\\Makefile.am', './src/pylibde265\\libde265\\arm\\Makefile.in', './src/pylibde265\\libde265\\arm\\neon.S', './src/pylibde265\\libde265\\bitstream.cc', './src/pylibde265\\libde265\\bitstream.h', './src/pylibde265\\libde265\\cabac.cc', './src/pylibde265\\libde265\\cabac.h', './src/pylibde265\\libde265\\CMakeLists.txt', './src/pylibde265\\libde265\\configparam.cc', './src/pylibde265\\libde265\\configparam.h', './src/pylibde265\\libde265\\contextmodel.cc', './src/pylibde265\\libde265\\contextmodel.h', './src/pylibde265\\libde265\\COPYING', './src/pylibde265\\libde265\\de265-version.h', './src/pylibde265\\libde265\\de265-version.h.in', './src/pylibde265\\libde265\\de265.cc', './src/pylibde265\\libde265\\de265.h', './src/pylibde265\\libde265\\deblock.cc', './src/pylibde265\\libde265\\deblock.h', './src/pylibde265\\libde265\\decctx.cc', './src/pylibde265\\libde265\\decctx.h', './src/pylibde265\\libde265\\dpb.cc', './src/pylibde265\\libde265\\dpb.h', './src/pylibde265\\libde265\\en265.cc', './src/pylibde265\\libde265\\en265.h', './src/pylibde265\\libde265\\encoder', './src/pylibde265\\libde265\\encoder\\algo', './src/pylibde265\\libde265\\encoder\\algo\\algo.cc', './src/pylibde265\\libde265\\encoder\\algo\\algo.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-interpartmode.cc', 
'./src/pylibde265\\libde265\\encoder\\algo\\cb-interpartmode.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-intra-inter.cc', './src/pylibde265\\libde265\\encoder\\algo\\cb-intra-inter.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-intrapartmode.cc', './src/pylibde265\\libde265\\encoder\\algo\\cb-intrapartmode.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-mergeindex.cc', './src/pylibde265\\libde265\\encoder\\algo\\cb-mergeindex.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-skip.cc', './src/pylibde265\\libde265\\encoder\\algo\\cb-skip.h', './src/pylibde265\\libde265\\encoder\\algo\\cb-split.cc', './src/pylibde265\\libde265\\encoder\\algo\\cb-split.h', './src/pylibde265\\libde265\\encoder\\algo\\CMakeLists.txt', './src/pylibde265\\libde265\\encoder\\algo\\coding-options.cc', './src/pylibde265\\libde265\\encoder\\algo\\coding-options.h', './src/pylibde265\\libde265\\encoder\\algo\\ctb-qscale.cc', './src/pylibde265\\libde265\\encoder\\algo\\ctb-qscale.h', './src/pylibde265\\libde265\\encoder\\algo\\Makefile.am', './src/pylibde265\\libde265\\encoder\\algo\\Makefile.in', './src/pylibde265\\libde265\\encoder\\algo\\pb-mv.cc', './src/pylibde265\\libde265\\encoder\\algo\\pb-mv.h', './src/pylibde265\\libde265\\encoder\\algo\\tb-intrapredmode.cc', './src/pylibde265\\libde265\\encoder\\algo\\tb-intrapredmode.h', './src/pylibde265\\libde265\\encoder\\algo\\tb-rateestim.cc', './src/pylibde265\\libde265\\encoder\\algo\\tb-rateestim.h', './src/pylibde265\\libde265\\encoder\\algo\\tb-split.cc', './src/pylibde265\\libde265\\encoder\\algo\\tb-split.h', './src/pylibde265\\libde265\\encoder\\algo\\tb-transform.cc', './src/pylibde265\\libde265\\encoder\\algo\\tb-transform.h', './src/pylibde265\\libde265\\encoder\\CMakeLists.txt', './src/pylibde265\\libde265\\encoder\\encoder-context.cc', './src/pylibde265\\libde265\\encoder\\encoder-context.h', './src/pylibde265\\libde265\\encoder\\encoder-core.cc', './src/pylibde265\\libde265\\encoder\\encoder-core.h', 
'./src/pylibde265\\libde265\\encoder\\encoder-intrapred.cc', './src/pylibde265\\libde265\\encoder\\encoder-intrapred.h', './src/pylibde265\\libde265\\encoder\\encoder-motion.cc', './src/pylibde265\\libde265\\encoder\\encoder-motion.h', './src/pylibde265\\libde265\\encoder\\encoder-params.cc', './src/pylibde265\\libde265\\encoder\\encoder-params.h', './src/pylibde265\\libde265\\encoder\\encoder-syntax.cc', './src/pylibde265\\libde265\\encoder\\encoder-syntax.h', './src/pylibde265\\libde265\\encoder\\encoder-types.cc', './src/pylibde265\\libde265\\encoder\\encoder-types.h', './src/pylibde265\\libde265\\encoder\\encpicbuf.cc', './src/pylibde265\\libde265\\encoder\\encpicbuf.h', './src/pylibde265\\libde265\\encoder\\Makefile.am', './src/pylibde265\\libde265\\encoder\\Makefile.in', './src/pylibde265\\libde265\\encoder\\sop.cc', './src/pylibde265\\libde265\\encoder\\sop.h', './src/pylibde265\\libde265\\fallback-dct.cc', './src/pylibde265\\libde265\\fallback-dct.h', './src/pylibde265\\libde265\\fallback-motion.cc', './src/pylibde265\\libde265\\fallback-motion.h', './src/pylibde265\\libde265\\fallback.cc', './src/pylibde265\\libde265\\fallback.h', './src/pylibde265\\libde265\\image-io.cc', './src/pylibde265\\libde265\\image-io.h', './src/pylibde265\\libde265\\image.cc', './src/pylibde265\\libde265\\image.h', './src/pylibde265\\libde265\\intrapred.cc', './src/pylibde265\\libde265\\intrapred.h', './src/pylibde265\\libde265\\Makefile.am', './src/pylibde265\\libde265\\Makefile.in', './src/pylibde265\\libde265\\Makefile.vc7', './src/pylibde265\\libde265\\md5.cc', './src/pylibde265\\libde265\\md5.h', './src/pylibde265\\libde265\\motion.cc', './src/pylibde265\\libde265\\motion.h', './src/pylibde265\\libde265\\nal-parser.cc', './src/pylibde265\\libde265\\nal-parser.h', './src/pylibde265\\libde265\\nal.cc', './src/pylibde265\\libde265\\nal.h', './src/pylibde265\\libde265\\pps.cc', './src/pylibde265\\libde265\\pps.h', './src/pylibde265\\libde265\\quality.cc', 
'./src/pylibde265\\libde265\\quality.h', './src/pylibde265\\libde265\\refpic.cc', './src/pylibde265\\libde265\\refpic.h', './src/pylibde265\\libde265\\sao.cc', './src/pylibde265\\libde265\\sao.h', './src/pylibde265\\libde265\\scan.cc', './src/pylibde265\\libde265\\scan.h', './src/pylibde265\\libde265\\sei.cc', './src/pylibde265\\libde265\\sei.h', './src/pylibde265\\libde265\\slice.cc', './src/pylibde265\\libde265\\slice.h', './src/pylibde265\\libde265\\sps.cc', './src/pylibde265\\libde265\\sps.h', './src/pylibde265\\libde265\\threads.cc', './src/pylibde265\\libde265\\threads.h', './src/pylibde265\\libde265\\transform.cc', './src/pylibde265\\libde265\\transform.h', './src/pylibde265\\libde265\\util.cc', './src/pylibde265\\libde265\\util.h', './src/pylibde265\\libde265\\visualize.cc', './src/pylibde265\\libde265\\visualize.h', './src/pylibde265\\libde265\\vps.cc', './src/pylibde265\\libde265\\vps.h', './src/pylibde265\\libde265\\vui.cc', './src/pylibde265\\libde265\\vui.h', './src/pylibde265\\libde265\\x86', './src/pylibde265\\libde265\\x86\\CMakeLists.txt', './src/pylibde265\\libde265\\x86\\Makefile.am', './src/pylibde265\\libde265\\x86\\Makefile.in', './src/pylibde265\\libde265\\x86\\sse-dct.cc', './src/pylibde265\\libde265\\x86\\sse-dct.h', './src/pylibde265\\libde265\\x86\\sse-motion.cc', './src/pylibde265\\libde265\\x86\\sse-motion.h', './src/pylibde265\\libde265\\x86\\sse.cc', './src/pylibde265\\libde265\\x86\\sse.h']
Extensions complete
dist\pylibde265-0.0.1-cp312-cp312-win_amd64.whl

Simply changing the default language in Windows Settings does not change the language of MSVC's output. So at the very least, we can't rely on the system language alone to determine the encoding.
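Since the encoding can't be inferred reliably, one option is to make the capture step tolerant instead of trying to guess correctly. The sketch below is only an illustration: `run_captured` is a hypothetical name. It rests on an assumption I have not verified on this setup, namely that MSVC tools honor the `VSLANG` environment variable (1033 = English) when the English language pack is installed. Independently of that, it decodes with `errors="backslashreplace"`, which never raises.

```python
import os
import subprocess
import sys


def run_captured(cmd: list[str]) -> tuple[int, str]:
    """Run cmd and return (exit code, leniently decoded combined output).

    Hypothetical helper for illustration; not part of hatch-cython.
    """
    env = dict(os.environ)
    # Assumption: VSLANG=1033 asks MSVC tools for English messages.
    # Other programs ignore the variable; a user-set value is left alone.
    env.setdefault("VSLANG", "1033")
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stream
        env=env,
        check=False,
    )
    # backslashreplace never raises: undecodable bytes appear as \xNN escapes.
    return proc.returncode, proc.stdout.decode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    code, out = run_captured([sys.executable, "-c", "print('ok')"])
    print(code, out.strip())
```

The trade-off is that non-UTF-8 messages show up as escapes rather than readable text, but the build itself no longer fails on decoding.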