quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Process ANSI escape code in non-HTML format for Jupyter errors #10347

Open lballabio opened 2 months ago

lballabio commented 2 months ago

Bug description

When using the PDF format for a book project, the generated LaTeX file contains ANSI color characters that cause the document compilation to fail.

A search brings up #7813, which looks like it should fix the problem, but that's not the case.

Steps to reproduce

Here is a minimal YML configuration:

project:
  type: book

execute:
  enabled: true

book:
  title: "A reproducible example"
  chapters:
    - index.qmd

format:
  pdf:
    documentclass: book

and here is the sample input file:

# An example

This outputs an ANSI-colorized traceback:

```{python}
# | error: true

1 + "2"
```

### Expected behavior

ANSI color characters should be stripped or translated so that the LaTeX file can compile and produce a PDF document.

### Actual behavior

The LaTeX compilation fails with the following error:

ERROR: compilation failed- error Text line contains an invalid character. l.220 ^^[ [0;31m-------------------------------------------------------------...
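
The `^^[` in the log is TeX's caret notation for the ESC character (byte 0x1B), which begins every ANSI escape sequence. As a rough sketch for locating such characters in generated LaTeX text, something like the following could help. The function name `find_escape_lines` and the sample string are hypothetical and only for illustration; they are not part of Quarto.

```python
ESC = "\x1b"  # the control character that TeX prints as ^^[


def find_escape_lines(text):
    """Return (line_number, line) pairs for lines containing an ESC character."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if ESC in line
    ]


# Hypothetical excerpt of a generated .tex file containing a colorized traceback line.
sample = "\\begin{verbatim}\n\x1b[0;31m-----\x1b[0m\n\\end{verbatim}\n"
hits = find_escape_lines(sample)
for number, line in hits:
    print(number, repr(line))
```

To check a real file, the text would be read from the `.tex` file kept after the failed compilation.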


### Your environment

Mac OS Sonoma 14.5, Python 3.12.4, a fresh virtual environment with only Jupyter and its dependencies installed.

### Quarto check output

Quarto 1.5.55
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.2.0: OK
      Dart Sass version 1.70.0: OK
      Deno version 1.41.0: OK
      Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.5.55
      Path: /Applications/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: v2024.07.03
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: TinyTex
      Path: /Users/lballabio/Library/TinyTeX/bin/universal-darwin
      Version: 2024

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.12.4
      Path: /Users/lballabio/Downloads/test/.venv/bin/python3
      Jupyter: 5.7.2
      Kernels: python3

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........(None)

      Unable to locate an installed version of R.
      Install R from https://cloud.r-project.org/
prs513rosewood commented 2 months ago

I just encountered the same error with the following install (on openSUSE Tumbleweed):

Quarto 1.5.55
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.2.0: OK
      Dart Sass version 1.70.0: OK
      Deno version 1.41.0: OK
      Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.5.55
      Path: /opt/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: Installation From Path
      Path: /opt/texlive/bin/x86_64-linux
      Version: 2024

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.11.9
      Path: /home/$USER/.local/share/venv/discret/bin/python3
      Jupyter: 5.7.1
      Kernels: python3, python3.11

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........(None)

      Unable to locate an installed version of R.
      Install R from https://cloud.r-project.org/
cderv commented 2 months ago

Indeed, it seems #7813 is not enough. It only handles ANSI escape codes in HTML output content: https://github.com/quarto-dev/quarto-cli/blob/0d87b08a5c1b25ec6c6573e0bed793cd67eb6687/src/core/jupyter/jupyter.ts#L1790-L1803

This means Quarto keeps the ANSI escape codes in PDF output. Here is the intermediate .md:

# An example

This outputs an ANSI-colorized traceback:

::: {.cell execution_count=1}
``` {.python .cell-code}
1 + "2"
```

::: {.cell-output .cell-output-error}

TypeError: unsupported operand type(s) for +: 'int' and 'str'
[1;31m---------------------------------------------------------------------------[0m
[1;31mTypeError[0m                                 Traceback (most recent call last)
Cell [1;32mIn[1], line 1[0m
[1;32m----> 1[0m [38;5;241;43m1[39;49m[43m [49m[38;5;241;43m+[39;49m[43m [49m[38;5;124;43m"[39;49m[38;5;124;43m2[39;49m[38;5;124;43m"[39;49m

[1;31mTypeError[0m: unsupported operand type(s) for +: 'int' and 'str'

:::
:::



So ANSI escape codes are neither stripped nor handled in Jupyter output for non-HTML formats. If we can, we should probably try to strip them out.
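
To illustrate what stripping could look like, here is a minimal Python sketch. It is not Quarto's implementation (Quarto itself is written in TypeScript), and the name `strip_ansi` is hypothetical. It assumes only CSI sequences such as the SGR color codes seen in the traceback above need to be removed; other escape types are not handled.

```python
import re

# CSI sequence: ESC [ followed by parameter bytes, intermediate bytes and a final byte.
# This covers SGR color codes such as ESC[1;31m and ESC[38;5;241;43m.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove ANSI CSI escape sequences from text (hypothetical helper)."""
    return ANSI_CSI.sub("", text)


colored = "\x1b[1;31mTypeError\x1b[0m: unsupported operand type(s) for +: 'int' and 'str'"
plain = strip_ansi(colored)
print(plain)
```

Applied to each error output before it is written to the intermediate markdown, a filter like this would leave plain text that LaTeX can typeset.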

Otherwise, we would need a way to configure Jupyter execution for Python cells so that output does not use ANSI codes when we know we'll be producing a non-HTML format.
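
On the user side, one possible way to do this is IPython's `%colors` magic, run in a cell before the failing code. The sketch below is an assumption rather than a confirmed fix for this issue: it has not been verified against the setup above, and the accepted scheme names (here `NoColor`) may differ between IPython versions.

```python
# Assumption: this runs inside an IPython/Jupyter kernel. Outside of one,
# get_ipython() returns None and nothing is changed.
try:
    from IPython import get_ipython
except ImportError:  # IPython not installed: fall back to a stub
    def get_ipython():
        return None

shell = get_ipython()
if shell is not None:
    # Equivalent to typing `%colors NoColor` in a cell; asks IPython to
    # render tracebacks without color codes.
    shell.run_line_magic("colors", "NoColor")
```

This would have to be added to each document by hand, so it is a workaround rather than something Quarto applies automatically based on the output format.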
cderv commented 2 months ago

Prior and related work / ideas:

So this is another case for handling ANSI escape codes in all our engines.