quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

coexistence of R and Python gives me inconsistent table renderings #3457

Open aborruso opened 2 years ago

aborruso commented 2 years ago

Bug description

Hi, I have really strange table rendering problems.

My code is the one below. If I render it, the Python output table is printed as plain text.

If I change {r} to {{r}} and render the file again, the Python output table is rendered correctly.

Something does not seem to work properly, although as far as I can tell the code has no errors.

I'm using Quarto 1.3.21 on Debian 11 (inside WSL2).

Thank you

---
title: "tables test"
format: html
---

```{python}
import pandas as pd

df = pd.read_csv("input.csv")
df
```

```{r}
library(tidyverse)
data <- read_csv('input.csv')
knitr::kable(data)
```

input.csv:

year,i,v
2016,F,0.9599716561118586
2016,G,0.0382418519682473
2016,C,0.0012657864805693667
2016,W,1.2161279236855405e-05
2016,S,4.5402109150926846e-05

cscheid commented 2 years ago

This is probably an inconsistency in the table rendering choices of the jupyter and knitr engines, especially in multiple-language configurations. If you must have perfectly consistent table rendering output, make sure your table libraries all emit pure markdown. (You can check by setting keep-md: true and inspecting the intermediate .md output.)
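
As a rough illustration of what "emit pure markdown" could look like on the Python side, here is a minimal sketch that turns CSV text into a pipe table using only the standard library. The helper name `to_pipe_table` and the inline sample are hypothetical and only for illustration; pandas also offers `DataFrame.to_markdown` (which depends on the optional tabulate package). In a Quarto cell, printed markdown like this would typically be paired with the `output: asis` cell option so it is treated as markdown rather than as verbatim text.

```python
import csv
import io


def to_pipe_table(csv_text: str) -> str:
    """Convert CSV text into a markdown pipe table (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        # Separator row: one '---' per column.
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


# Two rows taken from the input.csv shown in the report.
sample = "year,i,v\n2016,F,0.9599716561118586\n2016,G,0.0382418519682473\n"
print(to_pipe_table(sample))
```

Because the result is plain markdown text, it should look the same in the intermediate .md regardless of which engine executed the cell.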

aborruso commented 2 years ago

> If you must have perfectly consistent table rendering output, make sure your table libraries all emit pure markdown.

No, they do not both emit pure markdown.

The first one, the Python output, is simply text. But if I remove the R chunk, without changing anything in the Python code, I get a markdown table as output.

In my opinion this inconsistency is a bug. Unfortunately I don't know how to help you with the coding, otherwise I would try to do it.

Thank you @cscheid

::: {.cell-output .cell-output-stdout}
```
   year    i         v
0  2016    F  0.959972
1  2016    G  0.038242
2  2016  NaN  0.001266
3  2016    W  0.000012
4  2016    S  0.000045
```
:::
:::

::: {.cell}

```{.r .cell-code}
library(tidyverse)
data <- read_csv('input_s.csv')
knitr::kable(data)
```

::: {.cell-output-display}
| year|i  |         v|
|----:|:--|---------:|
| 2016|F  | 0.9599717|
| 2016|G  | 0.0382419|
| 2016|NA | 0.0012658|
| 2016|W  | 0.0000122|
| 2016|S  | 0.0000454|
:::
:::

cscheid commented 1 year ago

The issue here is that Python, in a page with two languages (Python + R), gets executed through reticulate + knitr, while a page with a single language gets executed through jupyter.

For this to be fixed, reticulate needs to change the way it produces output from the default printing method of pandas. It appears that right now it emits plain text, but it should emit markdown when running inside knitr.

cscheid commented 1 year ago

There is currently an open issue for this in reticulate: https://github.com/rstudio/reticulate/issues/783. I'm currently in contact with the reticulate devs, so no need to follow up there.

aborruso commented 1 year ago

Thank you very much.

With this kind of issue I have probably created some noise here and in the discussions.

I'm glad that the report makes some sense.

Please forgive me for the confusion I caused.

cderv commented 1 year ago

Just an update on how reticulate behaves:

Regarding how pandas tables are handled: reticulate catches pandas tables and prints them as-is, without further processing. This happens instead of the usual behavior of calling the internal _repr_html_ method. https://github.com/rstudio/reticulate/blob/a1d7f7f573f652212bc2c72c39317340e6d8b511/R/knitr-engine.R#L576-L580

Changing that in reticulate would produce the same table in the intermediate .md, and therefore the same processing by Quarto.
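
To make the difference between the two paths concrete, here is a small sketch of the general "prefer the rich representation, fall back to plain text" pattern. This is not reticulate's or Jupyter's actual code; `RichTable` and `render` are hypothetical names, and the only real convention used is the `_repr_html_` method that pandas DataFrames implement.

```python
class RichTable:
    """Toy stand-in for a DataFrame: has both a plain and an HTML repr."""

    def __init__(self, header, rows):
        self.header = header
        self.rows = rows

    def __repr__(self):
        # Plain-text form, similar in spirit to the stdout output in the thread.
        lines = ["  ".join(self.header)]
        lines += ["  ".join(str(v) for v in row) for row in self.rows]
        return "\n".join(lines)

    def _repr_html_(self):
        head = "".join(f"<th>{h}</th>" for h in self.header)
        body = "".join(
            "<tr>" + "".join(f"<td>{v}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"


def render(obj):
    """Return (mime_type, payload), preferring _repr_html_ when available."""
    html_repr = getattr(obj, "_repr_html_", None)
    if callable(html_repr):
        html = html_repr()
        if html is not None:
            return ("text/html", html)
    # Fallback: the plain-text path.
    return ("text/plain", repr(obj))


table = RichTable(["year", "i"], [[2016, "F"], [2016, "G"]])
print(render(table)[0])
print(render(42)[0])
```

A front end that always takes the fallback branch for tables would produce text output like the stdout block quoted earlier in the thread, while one that uses the rich branch gets markup that can be styled as a table.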