juliasilge / juliasilge.com

My blog, built with blogdown and Hugo
https://juliasilge.com/

Use Docker to deploy a model for #TidyTuesday LEGO sets | Julia Silge #76

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

Use Docker to deploy a model for #TidyTuesday LEGO sets | Julia Silge

A data science blog

https://juliasilge.com/blog/lego-sets/

uramith commented 1 year ago

This is awesome!! Thank you very much.

Could you please make a similar post for an analytics-related workflow? For example, an analysis where ML models are not used or needed, just a simple analysis whose output is saved as a CSV on a cloud platform such as AWS or Azure with the help of Docker.

Thanks, Amith

wdefreitas commented 1 year ago

Hi Julia,

Thank you for the screencast with Docker and vetiver. I was able to follow along with most of the code. I don't have access to rs_connect(), so I was trying to build the board with board_folder(). The pin is created fine, as are the Dockerfile, plumber file, and lock file, but I cannot for the life of me get

docker run --rm -p 8000:8000 lego-set-names

to run properly. I keep getting messages that Docker can't find the pin (I have an Intel-chip Mac and am running locally, so I removed the platform and .Renviron parts). I am trying to point the board at a Box location, but I also tried placing the pin on my desktop and ran into the same issue. Wondering if you might be able to spot what I am doing incorrectly. I'm new to Docker and vetiver, and most of the examples show deployment with rs_connect(), which I wish we could do, but for those trying to play with vetiver with a team on a local network there are fewer examples. (Just as an FYI, for this question I replaced my Mac username with ... in the path below for privacy.)


v <- final_fitted %>%
    extract_workflow() %>%
    vetiver_model(model_name = "lego-sets")
v

board <- board_folder(path = "/Users/.../Library/CloudStorage/Box-Box/pins")   
board %>% vetiver_pin_write(v)

vetiver_write_plumber(board, "lego-sets", rsconnect = FALSE)
vetiver_write_docker(v)

The Dockerfile shows

FROM rocker/r-ver:4.2.1
ENV RENV_CONFIG_REPOS_OVERRIDE https://packagemanager.rstudio.com/cran/latest

RUN apt-get update -qq && apt-get install -y --no-install-recommends \
  libcurl4-openssl-dev \
  libicu-dev \
  libsodium-dev \
  libssl-dev \
  make

COPY vetiver_renv.lock renv.lock
RUN Rscript -e "install.packages('renv')"
RUN Rscript -e "renv::restore()"
COPY plumber.R /opt/ml/plumber.R
EXPOSE 8000
ENTRYPOINT ["R", "-e", "pr <- plumber::plumb('/opt/ml/plumber.R'); pr$run(host = '0.0.0.0', port = 8000)"]

and the plumber file shows

library(pins)
library(plumber)
library(rapidoc)
library(vetiver)
b <- board_folder(path = "/Users/.../Library/CloudStorage/Box-Box/pins")
v <- vetiver_pin_read(b, "lego-sets")

#* @plumber
function(pr) {
    pr %>% vetiver_api(v)
}

Docker build works fine, but when I run

docker run --rm -p 8000:8000 lego-set-names

in the terminal, it returns this

Error in stopOnLine(lineNum, file[lineNum], e) : 
  Error on line #7: 'b <- board_folder(path = "/Users/.../Library/CloudStorage/Box-Box/pins")' - Error in `abort_pin_missing()`:
! Can't find pin called 'lego-sets'
ℹ Use `pin_list()` to see all available pins in this board
Calls: <Anonymous> ... tryCatchList -> tryCatchOne -> <Anonymous> -> stopOnLine
Execution halted

I'm probably doing something really silly, but I can't figure out what is wrong. Any guidance or advice you could provide would be most appreciated. Thank you for all you do for the R community.

juliasilge commented 1 year ago

Oh @wdefreitas I don't think that's silly at all. The thing to focus on is that your Docker container (which is like a separate little computer) needs to be able to get to where your pin is stored. If your pin is outside of the container locally on your computer, the container can't get to it, because those are like two separate computers.

When the pin is stored somewhere like Connect, you can give the Docker container access by giving it the right credentials. You could do something similar with a cloud platform, like using AWS S3. If you really want to do something totally local, just for learning, you could put the pin inside of the Docker container. This isn't a recommended pattern for deployment because having the model artifact separate from the API deployment has so many benefits. However, it does work; take a look at this gist that walks through how to do it.
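
As a rough illustration of the "pin inside the container" option for local learning, here is a minimal sketch of a hand-edited plumber.R. It is not taken from the gist, and it rests on assumptions: a copy of the board folder named `pins` sits next to the Dockerfile, and one extra `COPY` line is added by hand to the generated Dockerfile. The in-container path `/opt/ml/pins` is a hypothetical choice, not something vetiver generates.

```r
# plumber.R, edited by hand so the board path points inside the container.
#
# Assumption: the generated Dockerfile has one extra line added before
# ENTRYPOINT:
#   COPY pins /opt/ml/pins
# where `pins` is a copy of the board folder placed next to the Dockerfile
# (Docker can only COPY files that live inside the build context).
library(pins)
library(plumber)
library(rapidoc)
library(vetiver)

# Hypothetical in-container path; it must match the COPY destination above
b <- board_folder(path = "/opt/ml/pins")
v <- vetiver_pin_read(b, "lego-sets")

#* @plumber
function(pr) {
    pr %>% vetiver_api(v)
}
```

With this layout the image has to be rebuilt whenever the model changes, which is one reason keeping the model artifact separate from the API is usually the better pattern.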

blechturm commented 1 year ago

Hi Julia,

Does the dockerized model also do all the preprocessing steps? Say I have to do PCA at some point: will the API internally reduce the dimensionality of the data I pass to it and then predict based on that?

Thanks for all the educational stuff! I will follow your MLOps career and hope you still find time to do these videos, which are a great help!

Best, Max

juliasilge commented 1 year ago

@blechturm If you use tidymodels or scikit-learn to bundle your preprocessing and model estimation together in a workflow or a pipeline, then YES. 👍 We think for typical use cases, the best choice is to deploy these pieces together.
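
A minimal sketch of what bundling looks like with tidymodels, assuming a PCA step as in the question. The `mtcars` data, the model name `"mtcars-pca"`, and `num_comp = 3` are illustrative choices only and are not from the LEGO post.

```r
library(tidymodels)
library(vetiver)

# Preprocessing: normalize the predictors, then reduce them with PCA
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = 3)

# Bundle the recipe and the model into one workflow, then fit it
wf_fit <- workflow(rec, linear_reg()) %>%
  fit(data = mtcars)

# The vetiver model wraps the whole fitted workflow, PCA included
v <- vetiver_model(wf_fit, "mtcars-pca")

# New data keeps its original columns; the fitted workflow applies
# normalization and PCA internally before predicting
new_cars <- mtcars[1:3, names(mtcars) != "mpg"]
preds <- predict(wf_fit, new_data = new_cars)
```

Because the recipe travels with the model, a deployed API built from `v` would expect the raw predictor columns, not the principal components.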

mohamed123hany commented 4 months ago

Could you make a video explaining how to deploy a model with Shiny?

juliasilge commented 4 months ago

@mohamed123hany You may find this demo from Posit Solution Engineering helpful.