rstudio / blogdown

Create Blogs and Websites with R Markdown
https://pkgs.rstudio.com/blogdown/

Why are tabs replaced with spaces in code blocks #740

Closed amarakon closed 2 years ago

amarakon commented 2 years ago

If I add a code block in an R Markdown document, the resulting Markdown file will have tabs replaced with four spaces in that code block. I don't know why this is the default behaviour, and I wish there was a way to change it. Is there currently a way to make it so that tabs are preserved in code blocks? And why are they not by default?

As a reference, see this topic on community.rstudio.com: https://community.rstudio.com/t/how-can-i-prevent-r-markdown-from-turning-tabs-to-spaces-in-code-blocks/152691/8

Output of xfun::session_info("rmarkdown"):

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Gentoo Linux

Locale:
  LC_CTYPE=en_CA.utf8       LC_NUMERIC=C
  LC_TIME=en_CA.utf8        LC_COLLATE=en_CA.utf8
  LC_MONETARY=en_CA.utf8    LC_MESSAGES=C
  LC_PAPER=en_CA.utf8       LC_NAME=C
  LC_ADDRESS=C              LC_TELEPHONE=C
  LC_MEASUREMENT=en_CA.utf8 LC_IDENTIFICATION=C

Package version:
  bslib_0.4.1     evaluate_0.18   htmltools_0.5.3 jquerylib_0.1.4
  jsonlite_1.8.3  knitr_1.41      methods_4.2.1   rmarkdown_2.18
  stringr_1.4.1   tinytex_0.42    tools_4.2.1     utils_4.2.1
  xfun_0.35       yaml_2.3.6

Pandoc version: 2.19.2

cderv commented 2 years ago

If I add a code block in an R Markdown document, the resulting Markdown file will have tabs replaced with four spaces in that code block.

Are you sure that the tabs you typed are still tabs in the .Rmd document itself? I mean before any conversion by R Markdown.

I ask because IDEs can convert tabs to spaces in their code editors.

In an IDE you can usually show whitespace characters.

You will then see, for example, dots for spaces and dashes for tabs.

If this is happening, then your source file does not contain tabs but spaces, hence the result contains spaces.

Otherwise, could you please share a .Rmd document containing the tabs so we can check this out?

Thank you
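
One way to settle whether a source file really contains tabs, independent of any editor setting, is to inspect its text directly. The following is a minimal Python sketch, not part of blogdown or rmarkdown; the helper name `lines_with_tabs` and the sample content are hypothetical.

```python
def lines_with_tabs(text: str) -> list:
    """Return the 1-based numbers of the lines that contain a literal tab."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if "\t" in line]


# Hypothetical .Rmd content: line 2 is indented with a tab, line 3 with four spaces.
sample_rmd = "```\n\texample code chunk\n    example code chunk\n```\n"

print(lines_with_tabs(sample_rmd))  # [2]
```

Running the same check on both the .Rmd input and the generated .md output would show at which step the tabs disappear.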

amarakon commented 2 years ago

Are you sure that the tabs you typed are still tabs in the .Rmd document itself? I mean before any conversion by R Markdown.

Sorry, that is not what I meant. I do not use the RStudio IDE. I meant that when I insert a tab character in a code chunk, the .md file generated by rmarkdown::render() has four spaces instead of that tab character.

.Rmd:

    example code chunk
^
|----- Here is a tab character

.md:

    example code chunk
^
|----- Here are four spaces

amarakon commented 2 years ago

I forgot to mention that this seems to happen in blogdown when blogdown.method is set to "markdown".

cderv commented 2 years ago

Sorry, that is not what I meant. I do not use the RStudio IDE.

Ok good. I wanted to check. Also this could happen with any other IDE using the same feature.

I forgot to mention that this seems to happen in blogdown

So you confirm this does not happen in a regular R Markdown document, for example with html_document? If you can, please share a reproducible example so we can be sure we are testing the same thing.

I believe that when Pandoc does the conversion, tabs are converted to spaces by default unless the --preserve-tabs option is passed (see https://pandoc.org/MANUAL.html#option--preserve-tabs). I don't think we set this option by default in blogdown.
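
To make that default concrete: the Pandoc manual says tabs are converted to spaces unless --preserve-tabs is given, and that the tab stop defaults to 4 (the --tab-stop option). The Python sketch below only imitates that documented behaviour with `str.expandtabs`; it does not call Pandoc, and the helper name `mimic_pandoc_tabs` is hypothetical.

```python
def mimic_pandoc_tabs(text: str, preserve_tabs: bool = False, tab_stop: int = 4) -> str:
    """Imitate the tab handling described in the Pandoc manual (illustration only)."""
    if preserve_tabs:
        return text  # like --preserve-tabs: leave tab characters untouched
    return text.expandtabs(tab_stop)  # default: expand each tab to the next tab stop


chunk = "\texample code chunk"

print(repr(mimic_pandoc_tabs(chunk)))                      # '    example code chunk'
print(repr(mimic_pandoc_tabs(chunk, preserve_tabs=True)))  # '\texample code chunk'
```

The first call reproduces the four spaces reported in the generated .md file; the second keeps the tab, which is the outcome requested in this issue.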

Using pandoc_args in the html_page() format should let you specify that option when the output format is HTML.

You said you are using Markdown. In that case we run Pandoc only in specific cases, to convert the knitted .md to another .md, and that step currently cannot be customized.

If you confirm this only affects this type of output, I'll move the issue to blogdown and take a closer look.

amarakon commented 2 years ago

So you confirm this does not happen in a regular R Markdown document, for example with html_document? If you can, please share a reproducible example so we can be sure we are testing the same thing.

This does happen in a regular R Markdown document, but it can be easily changed by using the preserve-tabs Pandoc argument:

---
title: Indent Test
output:
  html_document:
    pandoc_args: --preserve-tabs
---
example code block with indent

However, in blogdown with blogdown.method = "markdown", it does not use Pandoc. Instead, it just converts the .Rmd file to a .md file then uses a Markdown interpreter like Goldmark to interpret it into HTML. Pandoc is not used in this case. So I think it is an issue with just blogdown.

cderv commented 2 years ago

This does happen in a regular R Markdown document, but it can be easily changed by using the preserve-tabs Pandoc argument

Thanks for confirming that --preserve-tabs is the source of the observed behavior.

However, in blogdown with blogdown.method = "markdown", it does not use Pandoc. Instead, it just converts the .Rmd file to a .md file then uses a Markdown interpreter like Goldmark to interpret it into HTML. Pandoc is not used in this case. So I think it is an issue with just blogdown.

blogdown uses knitr to convert .Rmd to .md, and may use Pandoc in certain cases to handle Pandoc-specific content that could have been inserted in the document. In those cases, Pandoc is used to convert from md to md. However, there is currently no way to pass options to that step.

@yihui I see several options here:

What do you think?

Maybe we could move this to the blogdown repo.

amarakon commented 2 years ago

I like the idea of a preserve_tabs option.

yihui commented 2 years ago

I don't want to make this too general and complicated, so I just added --preserve-tabs to the default value of pandoc_args. I don't think this is an option that is going to be widely used, so I tend to keep it simple for now. Thanks!

BTW, if anyone doesn't like this change (i.e., prefers tabs to be replaced by spaces), they have two choices. One is to use spaces instead of tabs in the input document, and the other is to provide custom pandoc_args values to blogdown::html_page. I guess the demand should be rare.