Open debdagybra opened 8 months ago
Same with character and date vectors:
c("a", NA, "b") returns "a", ".na.character", "b"
lubridate::as_date(c("2024-03-26", NA)) returns "19808", ".na.real"
Indeed. Thanks for the report.
We use the yaml R package to write the R objects, and it uses those special values because the YAML spec has no equivalent for NA: https://github.com/vubiostat/r-yaml/blob/81f8903232bf125853901f62cdff3934b96eb1a5/inst/CHANGELOG#L112-L124
What would you expect an NA value in R to be in YAML?
I don't think R's NA can be represented in YAML without losing information.
We could add a handler that converts NA to NULL, but forcing that coercion could cause issues: NULL and NA are not the same in R.
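To make the difference between the two concrete, here is a small base R sketch (the vectors are only an illustration, not taken from the report). NA keeps its position in a vector and carries a type, while NULL disappears when combined, so coercing NA to NULL would silently change a vector's length.

```r
# NA is a placeholder of length 1; NULL is the absence of a value.
x_na   <- c(1, NA, 2.5)
x_null <- c(1, NULL, 2.5)

length(x_na)    # 3: the missing value keeps its position
length(x_null)  # 2: NULL is dropped when combined

# NA is typed, so a numeric vector stays numeric.
class(NA_real_)       # "numeric"
class(NA_character_)  # "character"
class(NULL)           # "NULL"

is.na(x_na)     # FALSE TRUE FALSE
is.null(NULL)   # TRUE
```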
To be conservative, we could throw an error when we detect any NA during the conversion, asking the user to check the execute_params object. With the --execute-params CLI flag of quarto render, you would not be able to pass NA anyway.
Curious to hear your thoughts on this.
I don't have enough knowledge about YAML to answer your question about the NAs.
I don't think you should return an error when any value in execute_params is NA. That's quite restrictive, and since base R allows vectors containing NA, it should be possible to use them with quarto.
But the original object class and values should be preserved when accessed with params$, shouldn't they?
It's odd to put a numeric vector in execute_params and get a character vector in the .qmd file.
With rmarkdown, the vector is preserved.
library(rmarkdown)
rmarkdown::render("test.qmd",
params = list(
test_vec = c(1, NA, 2.5, -6.33, NaN, Inf)
))
returns:
class: numeric values: 1, NA, 2.5, -6.33, NaN, is.na: FALSE, TRUE, FALSE, FALSE, TRUE, FALSE
class: numeric values: 1, NA, 2.5, -6.33, NaN, is.na: FALSE, TRUE, FALSE, FALSE, TRUE, FALSE
And in quarto itself, for logical vectors, the class is preserved correctly.
library(quarto)
quarto_render("test.qmd",
execute_params = list(
test_vec = c(TRUE, NA, FALSE)
))
returns :
class: logical values: TRUE, NA, FALSE is.na: FALSE, TRUE, FALSE
class: logical values: TRUE, NA, FALSE is.na: FALSE, TRUE, FALSE
Let me reintroduce the context here.
quarto_render() is a wrapper around quarto render, one of whose flags is --execute-params, which takes a YAML file that defines the parameters. The doc is here: https://quarto.org/docs/computations/parameters.html#rendering
quarto render document.qmd --execute-params params.yml
This means that for quarto, the only way to pass parameters is to use YAML syntax. And YAML syntax does not know about R objects.
Now, the quarto_render() R function is a wrapper, as I said: instead of just asking for a YAML file as an argument, it accepts an R list of objects and takes care of writing the YAML to pass to Quarto.
This is the big difference with rmarkdown::render(), where parameters are processed directly in R because rmarkdown runs R itself. Quarto does not.
So, for example, you can't pass a dataframe, or any other R-specific object, directly to quarto render, because you would not be able to provide it as a YAML value. And so you cannot in quarto_render() either, because the YAML spec has no conversion for such objects.
NA and its family are among those objects: there is no one-to-one representation in YAML. So when I asked "what value would you expect", I meant:
If you were to use the CLI with quarto render, rather than calling it from R, how would you set up your params? You would not be able to pass some values.
So that is why I am thinking of throwing an error if unsupported values are passed to execute_params: they are simply not supported in quarto. Unfortunately, this is a limitation, and you can't pass R-specific objects.
For more example, this has been discussed also at
I won't close this issue, though, because we do need to do something (prevent rendering, or force coercion to NULL?) to avoid this .na.real problem.
Maybe in the future we'll find a solution in Quarto: an API for YAML params that handles the specifics of each computation language.
Thanks for the explanation and the new epic :)
I still don't understand why it works with logical vectors.
According to the link you sent in your first message, if I understand correctly, the vector c(TRUE, NA, FALSE) should be converted to the character vector c("TRUE", ".na", "FALSE")? But it isn't; instead we get c(TRUE, NA, FALSE).
Can't we do the same with numeric and character vectors?
> So for example you can't pass a dataframe, or any other R specific object directly to quarto render because you would not be able to provide this as a YAML value. And so you cannot either in quarto_render() because there is no conversion in YAML spec for such object.
When I pass a dataframe or a tibble, I also get a df or tibble with params$, so I guess that quarto_render has done some magic to pass the class and/or attributes to the YAML?
Maybe I'm naive, but can we also pass the class of vectors in order to convert them back later?
> According to the link you sent in your first message, if I understand well, the vector c(TRUE, NA, FALSE), should be converted to character c("TRUE", ".na", "FALSE") ? But it doesn't, instead we get c(TRUE, NA, FALSE).
Oh that is interesting ! Thanks for pointing this out!
This is an issue from trying to solve #124 with https://github.com/quarto-dev/quarto-r/commit/5207b6c1bfe2e3dca12d45644227945c5312abf3 https://github.com/quarto-dev/quarto-r/blob/ba8485a53fac80256d3301519455d3411c2be7a2/R/utils.R#L6-L16
The handler for logical does not handle NA specifically, so if it encounters a logical NA, it emits NA verbatim instead of the .na that yaml::as.yaml() would have output.
It seems this does not cause issues for a .qmd file using engine: knitr, but it will for one using engine: jupyter.
> When I pass a dataframe or a tibble, I get also a df or tibble with params$, so I guess that quarto_render has done some magic to pass the class and/or attributes to the yaml ?
Can you share an example of this please ?
> Maybe I'm naive but can we also pass the class of vectors in order to convert them back later ?
This is not that simple right now. Quarto is a tool that works with any computation engine, so anything done as a built-in feature must work for R, Python, Julia, and maybe others in the future. Hence the EPIC, as the parameter feature is not yet at that level.
Here we are:
So c(TRUE, NA, FALSE) really became [ true, 'NA', false ] internally in quarto, which is wrong, but it seems to work (by chance) with the knitr engine because it is read with jsonlite::parse_json(..., simplifyVector = TRUE), which coerces the string "NA" to a logical NA:
str(jsonlite::parse_json('[true, "NA", false]', simplifyVector = TRUE))
#> logi [1:3] TRUE NA FALSE
This is indeed R specific here. Python does not have NA equivalent I think.
Take this .qmd file:
---
title: "test"
format: html
---
```{python}
#| tags: [parameters]
#| echo: true
test_vec = "test_vec"
test_vec
```
If you render it with your example,
````r
library(quarto)
quarto_render("index.qmd",
  execute_params = list(
    test_vec = c(TRUE, NA, FALSE)
  ))
````
the NA is a string.
I got into some details, but I hope this illustrates why this is not that simple.
In R Markdown, rmarkdown::render(params = ) runs in R and passes the params as is, without conversion, to the knitting process.
So this explains the current limitation and why it requires some more design (https://github.com/quarto-dev/quarto-cli/issues/9197) if we want to support a parameter API that allows engine-specific considerations.
> When I pass a dataframe or a tibble, I get also a df or tibble with params$, so I guess that quarto_render has done some magic to pass the class and/or attributes to the yaml ?
> Can you share an example of this please ?
I was wrong, the dataframes and tibbles are converted to lists.
I thought they were preserved because the functions from dplyr were still working.
By the way, the workaround you suggested here https://forum.posit.co/t/param-converted-from-data-frame-to-list/155556/8 with RDS files works very well ! Thanks!
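A minimal sketch of that RDS workaround, under stated assumptions: the parameter name data_path, the temporary file, and the example data frame are hypothetical. The object is saved with saveRDS(), only the file path (a plain string, which YAML can represent) is passed as a parameter, and the document reads the object back with readRDS(). The quarto_render() call is commented out so the snippet runs without a document at hand.

```r
# An object that YAML cannot represent faithfully (NA, Inf, data frame).
test_df <- data.frame(
  id    = c("a", NA, "b"),
  value = c(1, NA, Inf),
  stringsAsFactors = FALSE
)

# 1. Save it to an RDS file; `data_path` is a hypothetical parameter name.
data_path <- tempfile(fileext = ".rds")
saveRDS(test_df, data_path)

# 2. Pass only the path, a plain string, through execute_params.
# quarto::quarto_render("test.qmd", execute_params = list(data_path = data_path))

# 3. Read the object back from the path.
#    Inside the .qmd this would be: readRDS(params$data_path)
restored <- readRDS(data_path)
identical(restored, test_df)
```

Because the object never goes through YAML, its class, attributes, and missing values are not subject to the conversion discussed above.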
Until it's resolved, maybe you can add a warning in quarto_render() to notify the user that some data are modified (NA values, dataframes, ...) and guide them to the workaround with RDS files. A warning could save them a lot of time.
Thanks for the feedback. I'll make it more apparent in the doc, and I'll probably throw an error for those specific R values that can't be translated. IMO, they shouldn't be used in execute_params at all.
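As an illustration of that kind of check (a sketch only, not the implementation in the quarto package; the helper name check_no_na() is hypothetical and it assumes a named parameter list), a function could walk the list and stop with an informative message when it finds a missing value:

```r
# Hypothetical helper: stop if any atomic element of a (possibly nested)
# named parameter list contains NA or NaN.
check_no_na <- function(params) {
  has_na <- function(x) {
    if (is.list(x)) {
      any(vapply(x, has_na, logical(1)))
    } else {
      is.atomic(x) && anyNA(x)
    }
  }
  bad <- names(params)[vapply(params, has_na, logical(1))]
  if (length(bad) > 0) {
    stop(
      "NA values cannot be represented in YAML; check these parameters: ",
      paste(bad, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(params)
}

# Passes silently when nothing is missing.
check_no_na(list(title = "ok", n = 3))

# Captures the error message raised for a vector containing NA.
res <- tryCatch(
  check_no_na(list(test_vec = c(1, NA, 2.5))),
  error = function(e) conditionMessage(e)
)
res
```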
quarto_render() changes NA of numeric vectors to character ".na.real"
Similar to https://github.com/quarto-dev/quarto-r/issues/124
Content of "test.qmd":
Created within qmd file:
class: `r class(test_vec)`
values: `r test_vec`
is.na: `r is.na(test_vec)`