Open paciorek opened 8 months ago
Thanks for the report @paciorek!
It seems the code page in use on this computer is 936. For some reason, though, quarto check reports it as unknown, so I don't know why it isn't being read properly.
Can you update to the latest stable release of Quarto 1.4 and run quarto check again?
@dragonstyle do you still have a local Windows 11 machine with code page 936 available? Otherwise, I'll try to set one up.
Hi, I have a very similar issue. I updated Quarto and it still doesn't read my code page.
C:\Program Files\Quarto\bin>chcp
Active code page: 852
C:\Program Files\Quarto\bin>quarto check
Quarto 1.4.549
[>] Checking versions of quarto binary dependencies...
Pandoc version 3.1.11: OK
Dart Sass version 1.69.5: OK
Deno version 1.37.2: OK
[>] Checking versions of quarto dependencies......OK
[>] Checking Quarto installation......OK
Version: 1.4.549
Path: C:\Program Files\Quarto\bin
CodePage: unknown
[>] Checking tools....................OK
TinyTeX: (not installed)
Chromium: (not installed)
[>] Checking LaTeX....................OK
Tex: (not detected)
(|) Checking basic markdown render....Error running filter C:/Program Files/Quarto/share/filters/main.lua:
[string "..."]:267: cannot open file 'C:\Users\Rafa?\AppData\Local\Temp\quarto-sessiona3371d6a\60be6b22\128fbc86' (Invalid argument)
stack traceback:
[string "..."]:267: in function 'io.lines'
[string "..."]:1593: in field 'processDependencies'
C:/Program Files/Quarto/share/filters/main.lua:7347: in field 'Meta'
C:/Program Files/Quarto/share/filters/main.lua:240: in function 'run_emulated_filter'
C:/Program Files/Quarto/share/filters/main.lua:936: in local 'callback'
C:/Program Files/Quarto/share/filters/main.lua:954: in upvalue 'run_emulated_filter_chain'
C:/Program Files/Quarto/share/filters/main.lua:990: in function <C:/Program Files/Quarto/share/filters/main.lua:987>
[>] Checking basic markdown render....OK
@rpbartczuk what is your username here? The path shows C:\Users\Rafa? and I believe the ? stands in for another character that is not being read correctly.
I saw a similar error on a Japanese version of Windows (user name contains multibyte characters).
I think they (Pandoc?) are trying to interpret a non-UTF-8 string as UTF-8 and not interpreting the path correctly.
I am able to reproduce an error when I place quarto-cli in a directory whose path contains Unicode characters (using code page 936). I haven't yet pinned down the issue, though it appears that a file we are passing to pandoc is perhaps encoded incorrectly. A basic pandoc render in the same path works fine.
It's likely that the code page isn't being displayed because, if there is a render exception, we clear the code page from the cache (where it is read). I'm guessing this is causing the cached code page to disappear, and perhaps the check command is using that cached value rather than computing it.
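To make that guess concrete, here is a minimal Python sketch of the suspected interaction. It is not Quarto's actual code: the CodePageCache class, its method names, and the simulated render failure are all hypothetical, and the detected value "936" is hard-coded for illustration.

```python
from typing import Callable, Optional


class CodePageCache:
    """Hypothetical cache, only meant to illustrate the suspected behavior."""

    def __init__(self, detect: Callable[[], str]) -> None:
        self._detect = detect
        self._cached: Optional[str] = None

    def warm(self) -> None:
        # Compute the code page once and remember it.
        self._cached = self._detect()

    def clear(self) -> None:
        self._cached = None

    def cached_only(self) -> str:
        # Suspected behavior: report whatever is cached, never recompute.
        return self._cached if self._cached is not None else "unknown"

    def cached_or_detect(self) -> str:
        # Alternative: fall back to detection when the cache is empty.
        if self._cached is None:
            self._cached = self._detect()
        return self._cached


def render(cache: CodePageCache) -> None:
    """Simulated render that fails and clears the cache, as guessed above."""
    try:
        raise OSError("cannot open file (Invalid argument)")
    except OSError:
        cache.clear()


cache = CodePageCache(lambda: "936")  # stand-in for real detection
cache.warm()
render(cache)
report_cached = cache.cached_only()
report_fallback = cache.cached_or_detect()
print(report_cached, report_fallback)
```

If the check command behaves like cached_only, a failed render would be enough to turn a correctly detected code page into "unknown"; recomputing on a cache miss would avoid that.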
This will reproduce in pure pandoc when the pandoc executable is placed within a path containing Unicode characters on a system with a non-English code page:
C:\Users\ct\你好>chcp
Active code page: 936
C:\Users\ct\你好>dir
Volume in drive C has no label.
Volume Serial Number is D466-B618
Directory of C:\Users\ct\你好
02/20/2024 12:52 PM <DIR> .
02/20/2024 12:52 PM <DIR> ..
12/16/2023 03:20 AM 214,419,968 pandoc.exe
02/20/2024 12:47 PM 64 test.lua
02/20/2024 10:46 AM 0 test.md
test.lua
function Pandoc(doc)
package.path = ""
require("foo")
end
test.md (empty file)
Command
C:\Users\ct\你好>pandoc.exe test.md -L test.lua
(cc @tarleb)
The issue here is that Lua's built-in package.path includes non-UTF-8 characters. Those confuse the pandoc.path functions, which assume UTF-8 and ultimately corrupt path strings.
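The kind of corruption involved can be illustrated outside of pandoc. The following is a minimal Python sketch, not pandoc's or Lua's actual code: it takes the directory from the reproduction, encodes it the way a code page 936 process would hold it, and then decodes those bytes as if they were UTF-8. All variable names are made up for illustration.

```python
# Directory from the reproduction, as a proper Unicode string.
path = "C:\\Users\\ct\\你好"

# Bytes as a process running under code page 936 would hold them.
raw = path.encode("cp936")

# A strict UTF-8 decode of those bytes fails outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lossy UTF-8 decode "succeeds" but silently corrupts the path.
lossy = raw.decode("utf-8", errors="replace")

# Decoding with the code page that produced the bytes round-trips cleanly.
round_trip = raw.decode("cp936")

print(strict_ok)
print(lossy == path)
print(round_trip == path)
```

Code that assumes UTF-8 either rejects the string or turns the non-ASCII part into replacement characters, so the resulting path no longer names a directory that exists.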
@cderv Charles and I are thinking that we should run the Windows test suite on a non-standard code page, even if we only do it once a week or so. The root cause here is that we're not actually seeing the behavior regress on these code pages, and we'd like to prevent it in the future.
That makes sense. I can add a nightly run for that special usage. (We could also run it for each pre-release tag created.) Is this just a different code page, or also some path tweaking with special characters?
We can sync directly and I'll add it to the CI updates planned for 1.5.
Is this just a different code page, or also some path tweaking with special characters?
We currently don't think we will be able to fully support Quarto installed on a path with non-ASCII characters (it's a combination of Lua, Pandoc, and Windows bugs that we simply can't work around in general right now). But we believe that there might be more bugs lurking that would surface if we ran the test suite on non-standard code pages, and we would like to support that use case well.
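For anyone wondering whether their install location falls into the unsupported case, looking for non-ASCII path components is a quick check. A minimal Python sketch follows; the helper name non_ascii_parts is hypothetical and not part of Quarto.

```python
from pathlib import PureWindowsPath
from typing import List


def non_ascii_parts(path: str) -> List[str]:
    """Return the components of a Windows path that contain non-ASCII characters."""
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]


# Install location from the quarto check output earlier in the thread.
default_install = non_ascii_parts("C:\\Program Files\\Quarto\\bin")

# Directory from the pandoc reproduction earlier in the thread.
repro_dir = non_ascii_parts("C:\\Users\\ct\\你好")

print(default_install)
print(repro_dir)
```

An empty list means every component of the path is plain ASCII; any returned component is a candidate for the path-encoding problems discussed here.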
So we should start fixing the simpler cases first.
Work in progress here:
The work in progress addresses the most core issues with the following configuration:
Windows OS Code Page 936
User home directory includes Unicode characters
Place Quarto within the user home directory
Run tests
[ ] Known issue with Python path handling - we are failing to initialize the logger in log.py#19, likely due to incorrect path encoding
[ ] Known issue can occur when attempting to read temp file path (this has been transient so I haven't pinned down when it happens yet)
Bug description
I'm trying to help a student who seems to have the same problem on Windows reported in issue #4103. (Let me know if I should reopen that issue instead.)
We've run the commands suggested at the end of that thread by @cderv, and here are the results. Any suggestions that I can try with the student?
Steps to reproduce
Rendering any basic Quarto document causes the problem.
Expected behavior
One should see the rendered doc.
Actual behavior
Your environment
Quarto check output