rstudio / vetiver-r

Version, share, deploy, and monitor models
https://rstudio.github.io/vetiver-r/

Dockerfile does not work out of the box (port not exposed, no access to the board) #101

Closed ggpinto closed 2 years ago

ggpinto commented 2 years ago

Here is a reprex, followed by a version of the Dockerfile that works:

library(vetiver)
library(pins)
#
# Board lives inside the project for this reprex only
#
b <- board_folder("board",versioned = TRUE)
cars_lm <- lm(mpg ~ ., data = mtcars)
v <- vetiver_model(cars_lm, "cars_linear")
vetiver_pin_write(b, v)
#> Creating new version '20220607T064438Z-b509c'
#> Writing to pin 'cars_linear'
#> 
#> Create a Model Card for your published model
#> * Model Cards provide a framework for transparent, responsible reporting
#> * Use the vetiver `.Rmd` template as a place to start
vetiver_write_plumber(b, "cars_linear")
vetiver_write_docker(v)
#> * Lockfile written to 'vetiver_renv.lock'.

This Dockerfile doesn't work out of the box because it doesn't expose port 8000 and because plumber.R assumes that the container has access to the board:

cat(readr::read_lines("Dockerfile"), sep = "\n")
#> # Generated by the vetiver package; edit with care
#> 
#> FROM rocker/r-ver:4.2.0
#> ENV RENV_CONFIG_REPOS_OVERRIDE https://packagemanager.rstudio.com/cran/latest
#> 
#> RUN apt-get update -qq && apt-get install -y --no-install-recommends \
#>   git \
#>   libcurl4-openssl-dev \
#>   libgit2-dev \
#>   libicu-dev \
#>   libsodium-dev \
#>   libssl-dev \
#>   make
#> 
#> COPY vetiver_renv.lock renv.lock
#> RUN Rscript -e "install.packages('renv')"
#> RUN Rscript -e "renv::restore()"
#> COPY plumber.R /opt/ml/plumber.R
#> 
#> ENTRYPOINT ["R", "-e", "pr <- plumber::plumb('/opt/ml/plumber.R'); pr$run(host = '0.0.0.0', port = 8000)"]
cat(readr::read_lines("plumber.R"), sep = "\n")
#> # Generated by the vetiver package; edit with care
#> 
#> library(pins)
#> library(plumber)
#> library(rapidoc)
#> library(vetiver)
#> b <- board_folder(path = "board")
#> v <- vetiver_pin_read(b, "cars_linear", version = "20220607T064438Z-b509c")
#> 
#> #* @plumber
#> function(pr) {
#>     pr %>% vetiver_api(v)
#> }

Created on 2022-06-07 by the reprex package (v2.0.1)

This is a Dockerfile that works for me (it exposes the port and copies the board into the container):

# Generated by the vetiver package; edit with care

FROM rocker/r-ver:4.2.0
ENV RENV_CONFIG_REPOS_OVERRIDE https://packagemanager.rstudio.com/cran/latest

RUN apt-get update -qq && apt-get install -y --no-install-recommends \
  git \
  libcurl4-openssl-dev \
  libgit2-dev \
  libicu-dev \
  libsodium-dev \
  libssl-dev \
  make

COPY vetiver_renv.lock renv.lock
RUN Rscript -e "install.packages('renv')"
RUN Rscript -e "renv::restore()"
COPY plumber.R /opt/ml/plumber.R

# Copy the board to the container (might be overkill if there are many versions
# of the model)
COPY board/ /opt/ml/board

# Expose the port 8000
EXPOSE 8000

ENTRYPOINT ["R", "-e", "pr <- plumber::plumb('/opt/ml/plumber.R'); pr$run(host = '0.0.0.0', port = 8000)"]

juliasilge commented 2 years ago

Thank you so much for checking out vetiver @ggpinto! There are two issues here: the pins board and EXPOSE.

The decision to keep the binary model object outside the Docker container is entirely purposeful. The model needs to be versioned and located outside the container, for a more robust and flexible workflow. You are totally free to edit the Dockerfile and do this if it fits your particular use case better, but we consider this an "off label" use. You can see an example here that authenticates a Docker container to a pins board.
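
For a folder board like the one in the reprex, one way to keep the model outside the image is to bind-mount the board when the container starts. This is only a minimal sketch, not something vetiver generates: the function name and image tag are hypothetical, and it assumes plumber.R resolves "board" relative to /opt/ml, the same location the COPY line in the edited Dockerfile above uses.

```shell
# Run the API with the local pins board mounted into the container.
# Assumption: plumber.R finds "board" under /opt/ml, next to plumber.R.
# $1 is the image tag (hypothetical, e.g. cars-linear).
run_with_board() {
  docker run --rm \
    -v "$(pwd)/board:/opt/ml/board" \
    -p 8000:8000 \
    "$1"
}
```

Calling `run_with_board cars-linear` from the project directory would start the container with the board available but not baked into the image, so new model versions do not require a rebuild.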

I am open to the idea of adding EXPOSE if it is useful to people, but just to clarify, it is not necessary to build or run the container. The EXPOSE command does not change anything about how you access the container; it is only useful as documentation and these Dockerfiles already include the port in the text. You can read more here:

I am pretty neutral about adding this if it is useful to folks as documentation (I am never against more clear documentation!) but it will be good to understand that it will not actually expose a port.
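
To see the difference between documenting a port and publishing one, the image metadata and the running container can be inspected separately. The helper names below are hypothetical; the docker subcommands are standard CLI calls.

```shell
# Show the ports recorded by EXPOSE in an image's metadata.
# $1 is an image tag.
exposed_ports() {
  docker image inspect --format '{{json .Config.ExposedPorts}}' "$1"
}

# Show the host ports actually published for a running container.
# $1 is a container name or ID.
published_ports() {
  docker port "$1"
}
```

A port appears in the second listing only if the container was started with `-p` (or with `-P`, which publishes every EXPOSEd port on a random host port); EXPOSE alone affects only the first.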

ggpinto commented 2 years ago

Hi @juliasilge, thank you for your incredible work!

I understood the first issue (pins board), thanks for the explanation! I am reading the book Practical MLOps by Noah Gift and Alfredo Deza and trying to implement its examples in R. The book uses Python, so I got a bit confused there.

About the second issue, I have read the documentation and understood that EXPOSE will not expose the port. It would be useful for people who run the container using Docker Desktop (at least on Windows). Without EXPOSE I can't assign a port for the local host (and therefore can't use the API):

image

With EXPOSE I can assign a port:

image

The problem doesn't occur when using docker run with -p 8000:8000 on the command line.
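
For reference, the command-line route can be sketched as follows. The image tag `cars-linear` and the function names are hypothetical; port 8000 is the one set in the generated ENTRYPOINT.

```shell
# Hypothetical image tag; any name works.
IMAGE="cars-linear"
# Port set in the ENTRYPOINT of the generated Dockerfile.
PORT=8000

# Build the image from the directory that holds Dockerfile, plumber.R,
# and vetiver_renv.lock.
build_image() {
  docker build -t "$IMAGE" .
}

# Publish container port 8000 on host port 8000.
# -p works whether or not the Dockerfile contains EXPOSE.
run_api() {
  docker run --rm -p "$PORT:$PORT" "$IMAGE"
}
```

Running `build_image && run_api` should make the API reachable at `http://localhost:8000` on the host, provided the container can also reach the board.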

juliasilge commented 2 years ago

Thanks again @ggpinto!