quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Quarto publish does not work with separate output-dir #5220

Closed AaronGullickson closed 8 months ago

AaronGullickson commented 1 year ago

Bug description

Vanilla projects (not book or website) cannot use the publish command correctly when a separate output directory is being used. This appears to be a consequence of not being able to correctly find the path to the output file. To reproduce, clone this template project and then run:

quarto publish quarto-pub analysis/analysis.qmd 

When I do this, I get the following output:

ERROR: NotFound: No such file or directory (os error 2), stat '/Users/aarong/projects/research-template/analysis/_products/analysis.html'

NotFound: No such file or directory (os error 2), stat '/Users/aarong/projects/research-template/analysis/_products/analysis.html'
    at Object.statSync (deno:runtime/js/30_fs.js:322:9)
    at file:///Applications/quarto/bin/quarto.js:121332:18
    at Array.reduce (<anonymous>)
    at normalizePublishFiles (file:///Applications/quarto/bin/quarto.js:121330:45)
    at renderForPublish (file:///Applications/quarto/bin/quarto.js:121282:24)
    at async renderForPublish (file:///Applications/quarto/bin/quarto.js:109061:24)
    at async handlePublish (file:///Applications/quarto/bin/quarto.js:108967:26)
    at async publishDocument (file:///Applications/quarto/bin/quarto.js:121321:38)
    at async publish5 (file:///Applications/quarto/bin/quarto.js:121417:132)
    at async doPublish (file:///Applications/quarto/bin/quarto.js:121373:13)

It appears that the problem has to do with how deeply nested the qmd files are in this case. Quarto seems to be expecting it to be only nested one level deep in the project directory structure and as a result tries to find the output directory in the analysis subdirectory rather than in the project root directory. Changing the project type to default and changing the execution directory to project had no effect on this error. Running the command from the analysis subdirectory also had no effect.
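The suspected mismatch can be sketched in a few lines of Python. This is only an illustration of the hypothesis above, not Quarto's actual code: `resolve_output` is a hypothetical helper, and the `_products` directory name is inferred from the path in the error message.

```python
from pathlib import PurePosixPath


def resolve_output(base_dir: PurePosixPath, output_dir: str, html_name: str) -> PurePosixPath:
    """Join an output-dir setting onto a chosen base directory (hypothetical helper)."""
    return base_dir / output_dir / html_name


# Paths taken from the error message in this report.
project_root = PurePosixPath("/Users/aarong/projects/research-template")
document = project_root / "analysis" / "analysis.qmd"
html_name = document.with_suffix(".html").name

# Resolved against the document's own directory: this reproduces the path
# that `quarto publish` fails to stat.
doc_relative = resolve_output(document.parent, "_products", html_name)

# Resolved against the project root: the location this report expects
# the lookup to use instead.
root_relative = resolve_output(project_root, "_products", html_name)

print(doc_relative)
print(root_relative)
```

The first printed path is the one from the stack trace; the second differs only in which directory the output directory is joined onto.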

This is not working as of the nightly release v1.3.326 on RStudio 2022.12.0 Build 353 on Mac OSX.

Output of quarto check:

[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.1: OK
      Dart Sass version 1.55.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.3.326
      Path: /Applications/quarto/bin

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.11.2
      Path: /Library/Frameworks/Python.framework/Versions/3.11/bin/python3
      Jupyter: 5.2.0
      Kernels: python3

(|) Checking Jupyter engine render....0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........OK
      Version: 4.2.2
      Path: /Library/Frameworks/R.framework/Resources
      LibPaths:
        - /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
      knitr: 1.41
      rmarkdown: 2.18

[✓] Checking Knitr engine render......OK


AaronGullickson commented 1 year ago

I wanted to add a note that I tried setting the execution directory to the project directory in the _quarto.yml file, and the bug persists.

mcanouil commented 1 year ago

The issue in your case is that you are inside a project but are trying to publish a document as if it were not part of that project; it is not currently possible to render or publish a file independently of its project, which is what you want to do. Also, the valid project types at the moment are book and website, not a collection of documents.

On a side note, you might want to use renv instead of your check_packages.R script as a pre-render script.

The solution for your use case would be to use a project profile.

I made a PR showing this (and adding renv): https://github.com/AaronGullickson/research-template/pull/1

mcanouil commented 1 year ago

@dragonstyle Do you think there is something to be done in Quarto CLI here?

AaronGullickson commented 1 year ago

Actually, rendering files independently of the project works fine. You can test this for yourself with:

quarto render analysis/analysis.qmd

This works fine and places the resulting analysis.html file into the _output directory as expected. Furthermore, the documentation on the quarto website is inconsistent with the claim that only books or websites are valid projects. Literally, everything else about my project works fine except for this important issue.

Your solution of using a separate _quarto-publish.yml file does allow the document to be published, but the rendered output files then no longer go to the _output directory. That defeats the purpose of having an _output directory, which is to keep the project tidy and to separate potential artifacts from scripts. Therefore, it will not work for my use case.

From my point of view, the fact that render works but publish does not suggests that this is not intended behavior and should be fixed. It also seems like it is a relatively straightforward problem of paths.

Let me explain my use case so you understand why this issue is important. It is not simply a "collection of files" but is designed for a very specific purpose. The purpose of this template is to provide a research template for academic researchers that focuses on transparency, openness, and replicability. Separating rendered output from scripts is critical as a logical step to identify rendered artifacts that may be out of date and should be removed prior to re-running the project. Everything works great at the moment, except for the publishing issue. Publishing is important because it allows collaborators (including ones who may not be directly involved in the coding) to see the most recent results from the project. Right now, I am literally having to email collaborators html files, which is extremely frustrating when I know there is a much easier way to do this.

mcanouil commented 1 year ago

OK, I see what's going on: the issue arises when publishing a document that lives in a sub-directory. The whole source tree is reproduced inside output-dir, but that is not the path the "publish" command uses.

In the end, this issue is related to #5765 in some respects: sub-directories combined with output-dir do not work well at the moment.
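To make the diagnosis above concrete, here is a minimal Python sketch of where a rendered file would land if the source tree is reproduced under output-dir. It rests on assumptions: `mirrored_output_path` is a hypothetical helper rather than anything in Quarto, and the `_products` directory name is taken from the error message in the original report.

```python
from pathlib import PurePosixPath


def mirrored_output_path(
    project_root: PurePosixPath, output_dir: str, source: PurePosixPath
) -> PurePosixPath:
    """Location of the rendered HTML when the source tree is mirrored under output_dir."""
    # Path of the source file relative to the project root, e.g. analysis/analysis.qmd
    relative = source.relative_to(project_root)
    # Keep the sub-directory structure and swap the extension for .html
    return project_root / output_dir / relative.with_suffix(".html")


project_root = PurePosixPath("/Users/aarong/projects/research-template")
source = project_root / "analysis" / "analysis.qmd"

print(mirrored_output_path(project_root, "_products", source))
```

Under these assumptions the sub-directory `analysis` appears after the output directory, whereas the failing lookup in the stack trace places it before.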