quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Support non-ASCII characters in crossreferenceable labels #9405

Open nighthink opened 3 months ago

nighthink commented 3 months ago

Bug description

I used the jupyter engine and Python code to draw two different bar charts with matplotlib. In Jupyter both charts are output correctly, but with `quarto preview` the two figures come out as identical images. The reason is that the `#| label:` values (such as `fig-你好`) contain Chinese characters.

Steps to reproduce

````qmd
---
title: aaa
author: aaa
format:
  typst:
    number-sections: true

execute:
  echo: false
  warning: false
fig-align: center
editor:
  render-on-save: true
engine: jupyter
---

```{python}
import numpy
import matplotlib.pyplot as plt
from matplotlib import rcParams
import pandas as pd
%config InlineBackend.figure_format = 'svg'
```

```{python}
#| label: fig-你好
#| fig-cap: aaa

df = pd.DataFrame({'Categories': ['A', 'B', 'C'], 'Values': [10, 20, 30]})

fig1, ax1 = plt.subplots(figsize=(8, 2.5))
ax1.bar(df["Categories"], df["Values"])
plt.show()
```

```{python}
#| label: fig-世界
#| fig-cap: bbb

df2 = pd.DataFrame({'Categories': ['D', 'E', 'F'], 'Values': [30, 20, 10]})

fig2, ax2 = plt.subplots(figsize=(8, 2.5))
ax2.bar(df2["Categories"], df2["Values"])
plt.show()
```
````


### Expected behavior

![屏幕截图 2024-04-18 204029](https://github.com/quarto-dev/quarto-cli/assets/38828499/303ab99c-d620-479f-925b-7cc3d0b32205)

### Actual behavior

![error](https://github.com/quarto-dev/quarto-cli/assets/38828499/bfb92055-2ec2-4834-aa20-316be743b887)

### Your environment

- IDE: VS Code
- Windows 11

### Quarto check output

Quarto 1.5.30

mcanouil commented 3 months ago

@nighthink Could you properly format your post using code blocks for code and terminal outputs? Thanks. If your code contains code blocks, you need to enclose it using more backticks, i.e., usually four ````.


You can share a self-contained "working" (reproducible) Quarto document using the following syntax, i.e., using more backticks than you have in your document (usually four ````). See https://quarto.org/bug-reports.html#small-is-beautiful-aim-for-a-single-document-with-10-lines.

If you have multiple files (and if it is absolutely required to have multiple files), please share as a Git repository.

R

`````md
````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: knitr
---

This is a reproducible Quarto document.

```{r}
x <- c(1, 2, 3, 4, 5)
y <- c(1, 4, 9, 16, 25)

plot(x, y)
```

![An image](https://placehold.co/600x400.png)

The end.
````
`````

Python

`````md
````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: jupyter
---

This is a reproducible Quarto document.

```{python}
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y)
plt.show()
```

![An image](https://placehold.co/600x400.png)

The end.
````
`````

Additionally, if not already given, please share the output of `quarto check` within a code block (i.e., using three backticks ```txt), see https://quarto.org/bug-reports.html#check.

nighthink commented 3 months ago

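Could you properly format your post using code blocks for code and terminal outputs? Thanks. If your code contains code blocks, you need to enclose it using more backticks, i.e., usually four ````.

You can share a self-contained "working" (reproducible) Quarto document using the following syntax, i.e., using more backticks than you have in your document (usually four ````). See https://quarto.org/bug-reports.html#small-is-beautiful-aim-for-a-single-document-with-10-lines.

If you have multiple files (and if it is absolutely required to have multiple files), please share as a Git repository.

R

````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: knitr
---

This is a reproducible Quarto document.

```{r}
x <- c(1, 2, 3, 4, 5)
y <- c(1, 4, 9, 16, 25)

plot(x, y)
```

![An image](https://placehold.co/600x400.png)

The end.
````

Python

````qmd
---
title: "Reproducible Quarto Document"
format: html
engine: jupyter
---

This is a reproducible Quarto document.

```{python}
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y)
plt.show()
```

![An image](https://placehold.co/600x400.png)

The end.
````

Additionally, if not already given, please share the output of `quarto check` within a code block (i.e., using three backticks ```txt), see https://quarto.org/bug-reports.html#check.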

Yap!

````qmd
---
title: aaa
author: aaa
format:
  typst:
    number-sections: true

execute:
  echo: false
  warning: false
fig-align: center
editor:
  render-on-save: true
engine: jupyter
---

```{python}
import numpy
import matplotlib.pyplot as plt
from matplotlib import rcParams
import pandas as pd
%config InlineBackend.figure_format = 'svg'
```

```{python}
#| label: fig-你好
#| fig-cap: aaa

df = pd.DataFrame({'Categories': ['A', 'B', 'C'], 'Values': [10, 20, 30]})

fig1, ax1 = plt.subplots(figsize=(8, 2.5))
ax1.bar(df["Categories"], df["Values"])
plt.show()
```

```{python}
#| label: fig-世界
#| fig-cap: bbb

df2 = pd.DataFrame({'Categories': ['D', 'E', 'F'], 'Values': [30, 20, 10]})

fig2, ax2 = plt.subplots(figsize=(8, 2.5))
ax2.bar(df2["Categories"], df2["Values"])
plt.show()
```
````

cscheid commented 3 months ago

@nighthink Please keep to English in these forums.

mcanouil commented 3 months ago

@cscheid I believe it's an automatic translation of my comment quoted.

@nighthink when replying, there is no need to quote an entire message. A reply is assumed to address the whole message; if it does not, you can quote the relevant part to make clear what you are replying to. Also, you can edit your original post instead of posting a new message: the example in the original post is still unusable, and it is the first thing people coming here will see.

nighthink commented 3 months ago

@mcanouil Thanks for the reminder. I have modified my original post. @cscheid Sorry, it is a translation of mcanouil's reply.

mcanouil commented 2 months ago

The issue is that the label's characters are used to name the image files, and in this case the non-ASCII characters are "ignored", leading to:

::: {.cell execution_count=2}

::: {.cell-output .cell-output-display}
![aaa](index_files/figure-typst/fig--output-1.svg){#fig-你好}
:::
:::

::: {.cell execution_count=3}

::: {.cell-output .cell-output-display}
![bbb](index_files/figure-typst/fig--output-1.svg){#fig-世界}
:::
:::

As you can see, the path is the same for both figures. Note that this only happens with `engine: jupyter` (I was unable to try with `engine: julia`).

And the issue is not at all specific to Typst as the intermediate markdown will be the same for the other formats.
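
To make the collision concrete, here is a minimal Python sketch. It is not Quarto's or Jupyter's actual code: `ascii_only_stem` and `figure_path` are hypothetical helpers that assume the file name is derived from the label by dropping every non-ASCII character, which reproduces the paths shown in the intermediate markdown above. The percent-encoding at the end is only one possible way to keep such names distinct, not a proposed fix.

```python
import re
from urllib.parse import quote


def ascii_only_stem(label: str) -> str:
    """Hypothetical sanitizer: drop every character outside [A-Za-z0-9_-]."""
    return re.sub(r"[^A-Za-z0-9_-]", "", label)


def figure_path(label: str, output_index: int = 1) -> str:
    """Build a path shaped like the ones in the intermediate markdown."""
    stem = ascii_only_stem(label)
    return f"index_files/figure-typst/{stem}-output-{output_index}.svg"


labels = ["fig-你好", "fig-世界"]

# Both labels collapse to the stem "fig-", so the second figure
# overwrites the first one on disk.
paths = [figure_path(label) for label in labels]
print(paths)

# Assumption for illustration: percent-encoding the UTF-8 bytes keeps the
# stems ASCII-only while preserving the difference between the labels.
encoded = [quote(label, safe="-_") for label in labels]
print(encoded)
```

Until non-ASCII labels are supported, giving each figure a distinct ASCII-only label avoids the shared file name.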