rstudio / bookdown

Authoring Books and Technical Documents with R Markdown
https://pkgs.rstudio.com/bookdown/

"Error in `path_to_connection()`" if one of the files to be compiled has the same name as the final file although in a subfolder #1464

Open piiskop opened 3 months ago

piiskop commented 3 months ago

The content of my _bookdown.yml:

book_filename: "mehaanika"
new_session: false
delete_merged_file: true
rmd_files: [
  "index.Rmd",
  # "UntitledRMD.Rmd",
  "rmd/hindamine.Rmd",
  "rmd/soovitusi-õppimiseks.Rmd",
  "rmd/baasreeglistik/baasreeglistik.Rmd",
  "rmd/mehaanika.Rmd",
  "rmd/mõõtmiseksperiment.Rmd",
  "references.Rmd"
]
language:
  label:
    exr: 'Ülesanne '
    fig: 'Joonis '

The error message:

Error in path_to_connection(): ! mehaanika.html does not exist in current working directory (/home/kalmer/rstudio-projects/mehaanika).

If I exclude the file mehaanika.Rmd from the list of files to be compiled, compiling finishes without an error. Even if I rename the file to something else and change the name in the list as well, the same error message reappears.

Only also renaming the first-level caption inside the file allows compiling without an error.

xfun::session_info('bookdown')

R version 4.3.3 (2024-02-29) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.4 LTS, RStudio 2023.3.3.539

Locale: LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

Package version: base64enc_0.1.3 bookdown_0.37 bslib_0.6.1 cachem_1.0.8 cli_3.6.2 digest_0.6.34 ellipsis_0.3.2 evaluate_0.23
fastmap_1.1.1 fontawesome_0.5.2 fs_1.6.3 glue_1.7.0 graphics_4.3.3 grDevices_4.3.3 highr_0.10 htmltools_0.5.7
jquerylib_0.1.4 jsonlite_1.8.8 knitr_1.45 lifecycle_1.0.4 magrittr_2.0.3 memoise_2.0.1 methods_4.3.3 mime_0.12
R6_2.5.1 rappdirs_0.3.3 rlang_1.1.3 rmarkdown_2.25 sass_0.4.8 stats_4.3.3 stringi_1.8.3 stringr_1.5.1
tinytex_0.49 tools_4.3.3 utils_4.3.3 vctrs_0.6.5 xfun_0.41 yaml_2.3.8

cderv commented 3 months ago

Thanks for the report. We are missing some details, so I will answer by guessing some of them. Please share the actual information if I am mistaken.

Only also renaming the first-level caption inside the file allows compiling without an error.

Do you mean the h1 title, like this:

# mehaanika

Which format are you using? I am guessing bs4_book(), as path_to_connection() is from xml2. If you have the chance, could you share the traceback of the error, from traceback() or rlang::trace_back()? Then we will know which part of bookdown is causing this. See more about tracebacks at https://adv-r.hadley.nz/debugging.html#traceback

bs4_book() splits the book by chapter, meaning it uses the h1 header name (or, more precisely, its ID) to name the HTML files. If your book filename is the same, there may be a conflict. The fact that the source file is in a subfolder is not involved here, because this concerns only the resulting HTML files.

We need to reproduce this to be able to fix it. For now, you need to rename the first header or set a custom ID:

# mehaanika {#mehaanika-1}

try this above ☝️

cderv commented 3 months ago

So I can reproduce this, and it is indeed related to book_filename, which is used for the intermediate merged Rmd file and then for the name of the HTML file to be split.

This is the code where we write over the existing file that is currently being split: https://github.com/rstudio/bookdown/blob/f614e894c04f39704ec231b3b767b23290635627/R/html.R#L416-L447

From our documentation: https://bookdown.org/yihui/bookdown/configuration.html

book_filename: the filename of the main Rmd file, i.e., the Rmd file that is merged from all chapters; by default, it is named _main.Rmd.

Is there any reason to configure this and change the default? Is this for the download feature? By the way, you could use book_filename: _mehaanika and it would solve the issue.
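As a sketch of that workaround, assuming the rest of the _bookdown.yml posted at the top of this issue stays unchanged, only the first line would need to differ. The comments are my reading of the behavior described in this thread, not verified output:

```yaml
# _bookdown.yml (excerpt; rmd_files and language keys as in the original report)
# The leading underscore should make the merged file _mehaanika.Rmd, so its
# intermediate HTML should no longer share a name with the chapter file
# mehaanika.html generated from the "# mehaanika" heading.
book_filename: "_mehaanika"
new_session: false
delete_merged_file: true
```

The alternative is to leave book_filename as it is and give the chapter heading a different ID.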

I'll look more into that, but I think we'll add an error / warning or just rename the intermediate file at least temporarily.

Some more details. The problem happens here:

https://github.com/rstudio/bookdown/blob/f614e894c04f39704ec231b3b767b23290635627/R/html.R#L416-L447

One of the nms will be mehaanika.html, from the name of the chapter inside rmd/mehaanika.Rmd.

The merged file will be mehaanika.Rmd because of book_filename, which leads to mehaanika.html as the main intermediate HTML file of the book, to be split by split_chapters().

And so there will be an overwrite. The initial output file is no longer there for bs4_chapters_tweak() to work on.

This happens because all the files are written into the project root before being moved to output_dir.

Definitely an issue to fix.

@yihui if you have a preference over the behavior that should happen, let me know.

Another really easy solution would be to always prepend _ to the book_filename provided. This could still create a conflict with a chapter of the same name, but it seems very unlikely to have a chapter name or ID starting with _ that is used to create HTML files.

yihui commented 3 months ago

I think we should throw an error in this case, asking users to change either book_filename or the section ID that happens to be identical to book_filename.

Another really easy solution would be to always prepend _ to the book_filename provided. This could still create a conflict with a chapter of the same name, but it seems very unlikely to have a chapter name or ID starting with _ that is used to create HTML files.

Yes, I think that's a good idea, too.