quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Detect QUARTO_VERSION on Rstudio Connect #2732

Closed gshotwell closed 1 year ago

gshotwell commented 1 year ago

Bug description

No response

Checklist

cscheid commented 1 year ago

Thanks for the report. Do you mean that RStudio Connect is not honoring QUARTO_VERSION? In that case, I think this should be reported on their repo, not here.

gshotwell commented 1 year ago

Oh, I'm so sorry for accidentally submitting this bug without any details about the actual bug.

RStudio Connect admins maintain several versions of Python, which are specified in the /etc/rstudio-connect/rstudio-connect.gcfg file. When a user deploys content to RStudio Connect, it detects their local version of Python and tries to match it with one of the versions installed on the server. This works for all of the types of content that RStudio Connect hosts, including R Markdown reports and Python APIs.

When deploying Quarto documents, however, the system renders them using the system Python version and doesn't go through the Python resolution process. To get the expected behaviour, the user needs to set a QUARTO_VERSION environment variable on the piece of content specifying the path. This is very awkward because the user might not know the path to the executable, and admins can't modify those paths without breaking pieces of hosted content. It would be better if Quarto documents followed the same pattern as other hosted content.

I'm not sure how Quarto and RStudio Connect divide responsibilities on this stuff, so feel free to close this issue and I can submit it to support.

cscheid commented 1 year ago

Thanks for the followup! I don't know much about how rsc works, so I'm not going to be much help here. But I suspect that if rsc is running quarto, then rsc should be configuring paths similarly to how virtual environments work. AIUI, Quarto doesn't do anything special here wrt Python; it tries to respect the ambient configuration (and I think that's the right approach).

cscheid commented 1 year ago

On further consideration, the env variables appear to be discussed here.

gshotwell commented 1 year ago

This might be a Posit Connect issue, but I would say from the user's side of things it's important that Posit-supported open source projects work on their professional products. From a user perspective, both Quarto and Connect are just Posit products, so it's natural to expect them to work well together.

While the environment file is helpful, it doesn't really address the problem because Connect publishers aren't always aware of the python binary paths on the server so setting environment variables is pretty awkward.

cscheid commented 1 year ago

(@aronatkins pinging you just to make sure you Connect folks are aware of this)

aronatkins commented 1 year ago

How are you deploying to Connect? The rsconnect CLI that is part of rsconnect-python can help capture the dependencies of Quarto content and communicate those requirements to Connect.

If you're deploying content that uses R, the rsconnect R package can be of some help.

The Connect User Guide tries to explain when each tool is able to assist with deployment. https://docs.posit.co/connect/user/quarto/

It's possible that I'm misunderstanding what you're attempting.

You're talking about a QUARTO_VERSION variable; is this because you're trying to run a Quarto command from some other thing (an R-based Shiny application, for example), and using QUARTO_VERSION to identify a path-to-Quarto to use in your application?

If I'm right, then you are correct - this is a situation that isn't fully handled by Connect today. The content you deployed (let's assume that it is a Shiny application) requires a specific version of R and some packages. The fact that your content needs a Quarto interpreter is not detected. If you are using QUARTO_VERSION to run Quarto directly, then it is also likely that the Python environment you expect is not being made available to Quarto.

Could you give us more information about the kinds of resources that you are trying to deploy? Some example content would be really helpful. If you're unable to share details here, feel free to open a support request.

gshotwell commented 1 year ago

Yes, I tried this using both the R and Python utilities, and both times it detected the system Python version on RStudio Connect and not any of the versions defined in rstudio-connect.gcfg. It was only after setting the path in an environment variable on the content that the proper path was detected. I was just trying to deploy a test Quarto document that used Python code.

aronatkins commented 1 year ago

I've just followed these steps to deploy a Quarto project using Python to Connect.

Create a new directory to contain your project.

Into that directory, add a _quarto.yml file:

```yaml
project:
  render:
    - index.qmd
```

Into that directory, add an index.qmd file:

````markdown
---
title: "python3 demo"
jupyter: python3
---

```{python}
2+3
```
````

Within that directory, create a `requirements.txt`:

```
jupyter
```

Within that directory, run the following commands to create a virtual environment, install the Python requirements, and install the `rsconnect` package (used to deploy):

```bash
python3 -m venv venv
source ./venv/bin/activate

python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
python3 -m pip install rsconnect
```

Confirm that you can render with Quarto (this render is local; in your development environment):

```bash
quarto render
```

Within that directory, deploy to Connect; you'll need the URL for your server and an API key. In this example, I've excluded files that were produced by the local render:

```bash
rsconnect deploy quarto \
    -s "${CONNECT_SERVER}" \
    -k "${CONNECT_API_KEY}" \
    --title "quarto python test" \
    -x "index.html" \
    -x "index_files" \
    -x "venv" \
    .
```

The deploy to Connect recognizes that the content uses Python and therefore includes information about the Python requirements in the deploy bundle.

@GShotwell - could you detail how your situation differs from this example? If you have details that you are not comfortable sharing here, would you please open a support case at https://support.posit.co/?

gshotwell commented 1 year ago

So RStudio Connect allows you to include multiple versions of Python, which are defined in the config file. When you deploy an R Markdown document with Python chunks, it will resolve your local Python version and select the closest Python executable in the config file.
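To make "select the closest Python executable" concrete, here is a minimal sketch of one plausible reading: prefer an exact match, otherwise take the highest patch release with the same major.minor. The function names and the matching rule itself are hypothetical; Connect's real matching logic is not described in this thread and may differ.

```python
from typing import List, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '3.8.1' into (3, 8, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def closest_python(local: str, installed: List[str]) -> Optional[str]:
    """Hypothetical matcher, NOT Connect's actual algorithm.

    1. An exact match wins.
    2. Otherwise pick the highest patch release sharing major.minor.
    3. Otherwise report no match.
    """
    if local in installed:
        return local
    wanted = parse(local)[:2]
    same_minor = [v for v in installed if parse(v)[:2] == wanted]
    if same_minor:
        return max(same_minor, key=parse)
    return None


# Illustrative version lists only; these are not taken from a real server config.
print(closest_python("3.8.1", ["3.8.12", "3.8.5", "3.9.7"]))
```

Under this reading, a local 3.8.1 would be served by a server-side 3.8.x rather than by whatever `python3` happens to be first on the system path.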

When you install Quarto on the server, however, it uses the general system Python version and doesn't resolve based on the user's Python installation. You won't notice the difference if the system version is the same as the version defined in the RStudio Connect config.

So to replicate I would do the following:

1) Install multiple versions of Python as per the recommendations of the docs
2) Install Quarto
3) Set the Python executable in the rstudio-connect.gcfg file
4) Deploy a Quarto document which prints the current Python binary path
5) Deploy an R Markdown document with a Python chunk which prints the Python binary path

On RStudio Connect 2022.06.0 the Quarto document uses the wrong Python version.
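For the two test documents in that replication, the Python chunk could be as small as the sketch below. The helper name `interpreter_report` is made up for illustration; `sys.executable` and `sys.version_info` are standard library.

```python
import sys


def interpreter_report() -> str:
    """Describe which Python interpreter is running this chunk."""
    version = ".".join(str(n) for n in sys.version_info[:3])
    return f"executable={sys.executable} version={version}"


# Put this in a Python chunk of both the Quarto and the R Markdown document,
# then compare the rendered output of the two deployments.
print(interpreter_report())
```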

aronatkins commented 1 year ago

@GShotwell - it sounds as if the deployed bundle is not identifying the Quarto content as requiring Python for some reason. That gets communicated to Connect through the manifest.json included with the uploaded source.
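One way to check this on a bundle is to look for a Python section in its manifest.json. The sketch below assumes the manifest is a JSON object with a top-level "python" key when Python requirements were captured. That key name, the helper names, and the sample values are assumptions for illustration, not something confirmed in this thread.

```python
import json
from pathlib import Path


def load_manifest(bundle_dir: str) -> dict:
    """Read manifest.json from a bundle directory."""
    text = Path(bundle_dir, "manifest.json").read_text(encoding="utf-8")
    return json.loads(text)


def declares_python(manifest: dict) -> bool:
    """True when the manifest has a non-empty "python" section (assumed key name)."""
    return bool(manifest.get("python"))


# Hypothetical manifests, for illustration only.
with_python = {"quarto": {"engines": ["jupyter"]}, "python": {"version": "3.8.1"}}
without_python = {"quarto": {"engines": ["jupyter"]}}

print(declares_python(with_python), declares_python(without_python))
```

If a bundle's manifest lacks that section, Connect would have no Python version to resolve against, which would be consistent with the behaviour described in this issue.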

The technique I've outlined above uses the rsconnect command-line tool to deploy, and it is able to correctly capture the Python requirements.

Could you share more details about how you are deploying? What commands are you using? Are you using the RStudio IDE? The IDE and the rsconnect R package have received improvements to their Quarto support; could you tell us the IDE version and packageVersion("rsconnect")?

gshotwell commented 1 year ago

Ah okay, I was deploying from the RStudio IDE publish button, I think. We can go ahead and close this and I'll raise a support ticket if it's a problem.

toph-allen commented 1 year ago

I attempted to replicate what you're seeing.

When printing the path to the Python executable in a Python chunk in a Quarto document on Connect (using sys.executable), the path appears to point to a Python executable inside Connect's content sandbox (e.g. /opt/rstudio-connect/mnt/app/python/env/bin/python). This is expected, even though in my case the actual executable path resolves to /opt/Python/3.8.1/bin/python.

The same code, when published as R Markdown, will show something like /opt/rstudio-connect/mnt/python-environments/pip/3.8.1/{hash}/bin/python.

As far as I'm aware, this is expected and no cause for concern.

If the Python path you're seeing in Quarto documents isn't inside Connect's sandbox directory (i.e. not inside /opt/rstudio-connect/mnt/), that is different from what I've seen here and warrants more investigation.
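A rough check along those lines, as a sketch: it assumes the sandbox root is /opt/rstudio-connect/mnt, as in the example paths above, and the helper name `in_connect_sandbox` is hypothetical. It compares the path textually and does not resolve symlinks, because the sandboxed path is reported above to resolve to a location outside the sandbox.

```python
import sys
from pathlib import PurePosixPath

# Sandbox root taken from the example paths in this thread (an assumption
# for other installations, which may mount content elsewhere).
SANDBOX_ROOT = PurePosixPath("/opt/rstudio-connect/mnt")


def in_connect_sandbox(executable: str) -> bool:
    """True if the path sits under the sandbox root, without following symlinks."""
    return SANDBOX_ROOT in PurePosixPath(executable).parents


# In a deployed document, report the interpreter and whether it looks sandboxed.
print(sys.executable, in_connect_sandbox(sys.executable))
```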