Support non-ASCII characters in crossreferenceable labels #9405

Open nighthink opened 3 months ago

nighthink commented 3 months ago

### Bug description

I used the Jupyter engine and Python code to draw two different bar charts with the Matplotlib plotting library. In Jupyter they are output correctly, but with quarto preview, the output contains two identical images. The reason is that #| label: fig-qxz contains Chinese characters.

### Steps to reproduce

title: aaa
author: aaa
    number-sections: true

  echo: false
  warning: false
fig-align: center
  render-on-save: true
engine: jupyter

import numpy
import matplotlib.pyplot as plt
from matplotlib import rcParams
import pandas as pd
%config InlineBackend.figure_format = 'svg'
#| label: fig-你好
#| fig-cap: aaa

df = pd.DataFrame({'Categories': ['A', 'B', 'C'], 'Values': [10, 20, 30]})

fig1, ax1 = plt.subplots(figsize=(8, 2.5))
ax1.bar(df["Categories"], df["Values"])
#| label: fig-世界
#| fig-cap: bbb

df2 = pd.DataFrame({'Categories': ['D', 'E', 'F'], 'Values': [30,20,10]})

fig2, ax2 = plt.subplots(figsize=(8, 2.5))
ax2.bar(df2["Categories"], df2["Values"])

### Expected behavior

![Screenshot 2024-04-18 204029](https://github.com/quarto-dev/quarto-cli/assets/38828499/303ab99c-d620-479f-925b-7cc3d0b32205)

### Actual behavior


### Your environment

- IDE: VS Code
- Windows 11

### Quarto check output

Quarto 1.5.30

mcanouil commented 3 months ago

@nighthink Could you properly format your post using code blocks for code and terminal outputs? Thanks. If your code contains code blocks, you need to enclose it using more backticks, i.e., usually four ````.

You can share a self-contained "working" (reproducible) Quarto document using the following syntax, i.e., using more backticks than you have in your document (usually four ````). See https://quarto.org/bug-reports.html#small-is-beautiful-aim-for-a-single-document-with-10-lines.

If you have multiple files (and if it is absolutely required to have multiple files), please share as a Git repository.

`````md
````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: knitr
---

This is a reproducible Quarto document.

```{r}
x <- c(1, 2, 3, 4, 5)
y <- c(1, 4, 9, 16, 25)

plot(x, y)
```

![An image](https://placehold.co/600x400.png)

The end.
````
`````

`````md
````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: jupyter
---

This is a reproducible Quarto document.

```{python}
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y)
plt.show()
```

![An image](https://placehold.co/600x400.png)

The end.
````
`````

Additionally, and if not already given, please share the output of `quarto check` within a code block (i.e., using three backticks ```txt), see https://quarto.org/bug-reports.html#check.

nighthink commented 3 months ago

cscheid commented 3 months ago

@nighthink Please keep to English in these forums.

mcanouil commented 3 months ago

@cscheid I believe it's an automatic translation of my quoted comment.

@nighthink when replying, there is no need to quote an entire message. A reply is assumed to address the whole message; if it does not, you can quote the relevant part to make clear what you are replying to. Also, you can edit your original post instead of posting a new message, because the example in the original post is still unusable and it's the first thing people coming here will see.

nighthink commented 3 months ago

@mcanouil Thanks for the reminder. I have modified my original post. @cscheid Sorry, it was a translation of mcanouil's reply.

mcanouil commented 2 months ago

The issue is that the label's characters are used to name the image files, and in this case the non-ASCII characters are "ignored", leading to:

::: {.cell execution_count=2}

::: {.cell-output .cell-output-display}

::: {.cell execution_count=3}

::: {.cell-output .cell-output-display}

As you can see, the path is the same. Note that this only happens when engine: jupyter. (Was unable to try with engine: julia).
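To illustrate the collision described above, here is a minimal sketch in Python. It is not Quarto's actual code: the `ascii_only_stem` helper and the `-output-1.svg` suffix are assumptions made for illustration only. It shows how dropping non-ASCII characters from two different labels can yield the same file name, so the second figure overwrites the first.

```python
import re


def ascii_only_stem(label: str) -> str:
    """Hypothetical sanitizer: drop every character outside [A-Za-z0-9_-].

    This only mimics the observed behavior (non-ASCII characters being
    "ignored"); it is not taken from the Quarto source.
    """
    return re.sub(r"[^A-Za-z0-9_-]", "", label)


# The two labels from the reproduction document.
labels = ["fig-你好", "fig-世界"]

# Assumed naming scheme: "<sanitized label>-output-1.svg".
paths = [f"{ascii_only_stem(label)}-output-1.svg" for label in labels]

for label, path in zip(labels, paths):
    print(f"{label} -> {path}")

# Both labels collapse to the same stem, so both cells write the same file.
print("collision:", paths[0] == paths[1])
```

Under these assumptions, a plain ASCII label such as `fig-qxz` passes through unchanged, which is consistent with ASCII-only labels not triggering the problem.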

And the issue is not at all specific to Typst as the intermediate markdown will be the same for the other formats.