unable to retrieve IAM credentials in sagemaker serverless inference #737

ncullen93 commented 6 months ago

I am unfortunately having an issue on SageMaker inference, but only with serverless inference. I am deploying a model using the standard vetiver functions for doing so, along with some slight changes to the config to make it serverless. The vetiver deployment works perfectly with real-time inference, but when I change to serverless it fails because paws can't find any credentials.

I wonder if there is anything special that should be done when building a Docker image for serverless inference, or if this is just a paws issue. Thanks! Apologies for double posting.

DyfanJones commented 6 months ago

Hi @ncullen93 sorry to hear that. Is it possible for you to include an example in what you did? I would like to reproduce it :)

ncullen93 commented 6 months ago

Of course, thanks! Note that to create the serverless endpoint, I had to alter the endpoint config vetiver creates. So you can either install my fork at ncullen93/vetiver-r or it's probably easier to just skip the vetiver_sm_endpoint function at the end and create the endpoint manually on sagemaker with a serverless config. Both ways fail.
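For anyone taking the manual route, here is a minimal sketch of what a serverless endpoint config could look like through paws. The helper names (`serverless_variant`, `create_serverless_endpoint`), the variant name, and the default memory and concurrency values are illustrative assumptions and are not taken from the fork mentioned above; only `create_endpoint_config()`, `create_endpoint()`, and the `ServerlessConfig` fields come from the SageMaker API.

```r
# Build the ProductionVariants entry for a serverless endpoint config.
# Helper name and default values are illustrative assumptions.
serverless_variant <- function(model_name,
                               memory_size_mb = 2048L,
                               max_concurrency = 5L,
                               variant_name = "AllTraffic") {
  stopifnot(
    is.character(model_name), length(model_name) == 1L,
    memory_size_mb >= 1024, # 1024 MB is the smallest serverless memory size
    max_concurrency >= 1
  )
  list(
    VariantName = variant_name,
    ModelName = model_name,
    # ServerlessConfig takes the place of InstanceType / InitialInstanceCount
    ServerlessConfig = list(
      MemorySizeInMB = as.integer(memory_size_mb),
      MaxConcurrency = as.integer(max_concurrency)
    )
  )
}

# Not called here: needs AWS credentials and the paws.machine.learning package.
create_serverless_endpoint <- function(model_name, endpoint_name = model_name) {
  sm <- paws.machine.learning::sagemaker()
  sm$create_endpoint_config(
    EndpointConfigName = endpoint_name,
    ProductionVariants = list(serverless_variant(model_name))
  )
  sm$create_endpoint(
    EndpointName = endpoint_name,
    EndpointConfigName = endpoint_name
  )
}

# Inspect the variant definition without touching AWS
variant <- serverless_variant("ames-pricing")
str(variant)
```

Because the variant carries no instance type, any instance argument passed elsewhere has nothing to apply to, which matches the note in the code below that 'ml.t2.medium' is ignored.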


## Fit a basic model
library(tidymodels)
library(vetiver)
data(ames, package = "modeldata")

ames_split <-
  ames %>%
  mutate(Sale_Price = log10(Sale_Price)) %>%
  mutate_if(is.integer, as.numeric) %>%
  initial_split(prop = 0.80, strata = Sale_Price)

ames_train <- training(ames_split)
ames_test  <- testing(ames_split)

rf_spec <-
  rand_forest(trees = 1000) %>%
  set_engine("ranger") %>%
  set_mode("regression")

rf_wflow <-
  workflow(
    Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type +
      Latitude + Longitude,
    rf_spec
  )

rf_fit <- rf_wflow %>% fit(data = ames_train)

# turn it into a vetiver model
v <- vetiver_model(rf_fit, "ames-pricing")

# write model to s3 board -> need a bucket and credentials here
board <- pins::board_s3(bucket = 'sagemaker-vetiver',
                        access_key = Sys.getenv("AWS_ACCESS_KEY_ID"),
                        secret_access_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"),
                        region = 'us-east-2')
vetiver_pin_write(board, v)


# build docker and upload to ECR -> works fine
new_image_uri <- vetiver_sm_build(board, "ames-pricing")

# create sagemaker model -> works fine
model_name <- vetiver_sm_model(new_image_uri)

# create endpoint -> my fork alters only the config in this function to be serverless
# But it fails when plumber is run due to paws not finding credentials
# install my fork at devtools::install_github('ncullen93/vetiver-r') to see
# You can also skip this + create the serverless endpoint on aws. That fails too
# the 'ml.t2.medium' instance is ignored
new_endpoint <- vetiver_sm_endpoint(model_name, 'ml.t2.medium')

DyfanJones commented 6 months ago

Thanks for the example code :) I will have a little look at it to see why it is failing :)

DyfanJones commented 6 months ago

@ncullen93 nearly forgot do you have any logs or errors?

ncullen93 commented 6 months ago

Yes, nothing too informative I'm afraid.

2024-01-05T02:13:23.894+01:00 | ARGUMENT 'serve' __ignored__
2024-01-05T02:13:24.128+01:00 | R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
2024-01-05T02:13:24.128+01:00 | Copyright (C) 2023 The R Foundation for Statistical Computing
2024-01-05T02:13:24.128+01:00 | Platform: x86_64-pc-linux-gnu (64-bit)
2024-01-05T02:13:24.128+01:00 | R is free software and comes with ABSOLUTELY NO WARRANTY.
2024-01-05T02:13:24.128+01:00 | You are welcome to redistribute it under certain conditions.
2024-01-05T02:13:24.128+01:00 | Type 'license()' or 'licence()' for distribution details.
2024-01-05T02:13:24.132+01:00 | Natural language support but running in an English locale
2024-01-05T02:13:24.132+01:00 | R is a collaborative project with many contributors.
2024-01-05T02:13:24.132+01:00 | Type 'contributors()' for more information and
2024-01-05T02:13:24.132+01:00 | 'citation()' on how to cite R or R packages in publications.
2024-01-05T02:13:24.132+01:00 | Type 'demo()' for some demos, 'help()' for on-line help, or
2024-01-05T02:13:24.132+01:00 | 'help.start()' for an HTML browser interface to help.
2024-01-05T02:13:24.132+01:00 | Type 'q()' to quit R.
2024-01-05T02:13:24.396+01:00 | > options('paws.log_level' = 3L); pr <- plumber::plumb('/opt/ml/plumber.R'); pr$run(host = '', port = 8080)
2024-01-05T02:13:26.376+01:00 | INFO [2024-01-05 01:13:26.376]: Unable to locate credentials file
2024-01-05T02:13:26.376+01:00 | INFO [2024-01-05 01:13:26.376]: Unable to locate config file
2024-01-05T02:13:26.376+01:00 | INFO [2024-01-05 01:13:26.376]: Unable to obtain access_key_id, secret_access_key or session_token
2024-01-05T02:13:28.399+01:00 | INFO [2024-01-05 01:13:28.399]: Unable to obtain iam role
2024-01-05T02:13:28.400+01:00 | Error in stopOnLine(lineNum, file[lineNum], e) :
2024-01-05T02:13:28.400+01:00 | Error on line #6: 'library(vetiver)' - Error: No compatible credentials provided.
2024-01-05T02:13:28.400+01:00 | Calls: <Anonymous> ... tryCatchList -> tryCatchOne -> <Anonymous> -> stopOnLine
2024-01-05T02:13:28.400+01:00 | Execution halted

DyfanJones commented 6 months ago

@ncullen93 no worries I will have a look now :)

DyfanJones commented 6 months ago

@ncullen93 do you get a successful endpoint build? and this error only happens when attempting predict?

DyfanJones commented 6 months ago

Interesting I am getting the following error:


I am going to try with the latest dev version of paws.common

ncullen93 commented 6 months ago

Interesting.. seems related to the credentials. I wonder if there is something that must be changed in the docker file. Hard to find any documentation on the difference between real-time and serverless inference from a model perspective.
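One way to narrow this down from inside the container is to report which credential sources the R process can actually see. The sketch below is a hypothetical diagnostic helper, not part of paws or vetiver. It checks the sources named in the log above (environment keys, credentials file, config file) plus the two container-credential environment variables used by AWS SDKs, and it only returns TRUE/FALSE flags, never secret values.

```r
# Report which credential-related sources are visible to the R process.
# credential_sources() is a hypothetical helper for debugging only.
credential_sources <- function(env = Sys.getenv(), home = path.expand("~")) {
  # TRUE only when the variable exists and is non-empty
  has_env <- function(name) name %in% names(env) && nzchar(env[[name]])
  c(
    env_access_keys = has_env("AWS_ACCESS_KEY_ID") &&
      has_env("AWS_SECRET_ACCESS_KEY"),
    credentials_file = file.exists(file.path(home, ".aws", "credentials")),
    config_file = file.exists(file.path(home, ".aws", "config")),
    container_relative_uri = has_env("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"),
    container_full_uri = has_env("AWS_CONTAINER_CREDENTIALS_FULL_URI")
  )
}

# Prints a named logical vector; safe to leave in logs
print(credential_sources())
```

Calling this near the top of the plumber entry point, before any package that touches AWS is loaded, would show in the endpoint logs which of these sources serverless inference provides.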

ncullen93 commented 6 months ago

> @ncullen93 do you get a successful endpoint build? and this error only happens when attempting predict?

No, the endpoint build fails. At the very end just like yours.

DyfanJones commented 6 months ago

Hmm I wonder if it is down to paws only looking at the ipv4 for iam credentials and we need to include the support for ipv6 🤔

DyfanJones commented 6 months ago

Found it, we need to support this environmental variable:


This will then get the credentials :D I will have a look in implementing this shortly :D
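The variable name itself is missing from the thread. The sketch below assumes it is the standard `AWS_CONTAINER_CREDENTIALS_FULL_URI`, which the branch name `env_container_cred_full_uri` used later in this thread suggests. It shows roughly how a container credential provider turns that variable into an HTTP request; it is not the paws.common implementation, and `container_credentials_request` is a hypothetical name.

```r
# Build the request a container credential provider would send.
# Assumptions: the variable is AWS_CONTAINER_CREDENTIALS_FULL_URI, and an
# optional token may be supplied in AWS_CONTAINER_AUTHORIZATION_TOKEN.
container_credentials_request <- function(env = Sys.getenv()) {
  get_env <- function(name) {
    if (name %in% names(env)) env[[name]] else ""
  }
  full_uri <- get_env("AWS_CONTAINER_CREDENTIALS_FULL_URI")
  if (!nzchar(full_uri)) {
    # Provider does not apply; the caller should try the next source
    return(NULL)
  }
  headers <- c(Accept = "application/json")
  token <- get_env("AWS_CONTAINER_AUTHORIZATION_TOKEN")
  if (nzchar(token)) {
    headers <- c(headers, Authorization = token)
  }
  list(method = "GET", url = full_uri, headers = headers)
}

# Example with a made-up environment (no network access involved)
example_env <- c(
  AWS_CONTAINER_CREDENTIALS_FULL_URI = "http://localhost:1234/creds",
  AWS_CONTAINER_AUTHORIZATION_TOKEN = "example-token"
)
str(container_credentials_request(example_env))
```

In AWS SDKs the response to such a request is a JSON document with `AccessKeyId`, `SecretAccessKey`, `Token`, and `Expiration` fields, which the client then uses as temporary credentials.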

ncullen93 commented 6 months ago

That sounds promising! Happy to test it whenever.. appreciate the help immensely.

DyfanJones commented 6 months ago

@ncullen93 I believe I have a solution. Please try out:

remotes::install_github("dyfanjones/paws/paws.common", ref = "env_container_cred_full_uri")

And let me know how you get on :)

ncullen93 commented 6 months ago

It works! You are an absolute mad lad.

A little trouble making predictions from the endpoint using paws.machine.learning::sagemakerruntime however. It's giving a 424 model error and the log on aws says the data is empty. So the input data is not getting picked up somehow. This is the code used to invoke an endpoint:

predict.vetiver_endpoint_sagemaker <- function(object, new_data, ...) {
    check_installed(c("jsonlite", "smdocker", "paws.machine.learning"))
    # Serialize the new data to JSON; NA values are written as strings
    data_json <- jsonlite::toJSON(new_data, na = "string")
    config <- smdocker::smdocker_config()
    sm_runtime <- paws.machine.learning::sagemakerruntime(config)
    resp <- tryCatch(
        {
            resp <- sm_runtime$invoke_endpoint(object$model_endpoint, data_json, ...)
            resp$Body
        },
        error = function(error) {
            error_code <- error$error_response$ErrorCode
            if (!is.null(error_code) && error_code == "NO_SUCH_ENDPOINT") {
                cli::cli_abort("Model endpoint {.val {object$model_endpoint}} not found.")
            }
            # Re-raise any other error unchanged
            stop(error)
        }
    )
    # Parse the raw response body as JSON
    con <- rawConnection(resp)
    on.exit(close(con), add = TRUE)
    resp <- jsonlite::fromJSON(con)
    resp
}

Still, I know the endpoint works because invoking it from Python returns predictions. The recommended way to invoke a serverless SageMaker endpoint differs slightly from real-time (ContentType is added), so perhaps that's the issue?

Python example from aws:

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,  # required by invoke_endpoint
    ContentType=content_type,  # this is added for serverless: e.g., "application/json"
    Body=payload,  # e.g., bytes('__some json__', 'utf-8')
)

In any case, I will invoke the endpoints from python anyways so not a big deal I think. Really appreciate the help.

DyfanJones commented 6 months ago

That is great news :) I can get this into the latest paws.common 0.7.0 release (#720)

Does adding ContentType work from paws? When invoking the endpoint as well?

Possibly worth raising a pr on vetiver to get the serverless method enabled. @juliasilge would vetiver be interested in extending its sagemaker support with serverless stuff? I am more than happy to contribute again :)

DyfanJones commented 6 months ago

@ncullen93 you could also try sending the data across as a raw vector, so:

data_json <- charToRaw(jsonlite::toJSON(new_data, na = "string"))

Let me know how you get on :)

ncullen93 commented 6 months ago

I will try it. I think that should work.

DyfanJones commented 6 months ago

@ncullen93 Just had a little play and the following worked for me:

predict(new_endpoint, ames_test, ContentType = "application/json")

This is using the standard vetiver:::predict.vetiver_endpoint_sagemaker method.

DyfanJones commented 6 months ago

Note: paws.common 0.7.0 has been released to CRAN.

juliasilge commented 6 months ago

Thank you so much for your continued support on this @DyfanJones!

Do you think I should set ContentType = "application/json" in the predict method for a SageMaker endpoint? I am thinking yes, since we are definitely passing JSON?

DyfanJones commented 6 months ago

Yeah I agree. If anything we could have it in the parameters for the predict method:

predict.vetiver_endpoint_sagemaker <- function(object, new_data, content_type = "application/json", ...) { }
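As a rough sketch of how the two suggestions in this thread (a raw-vector body and an explicit content type) could fit together, the helper below assembles the arguments for `invoke_endpoint()`. `build_invoke_args` and `invoke_json` are hypothetical names and are not part of vetiver or paws; only the `EndpointName`, `Body`, and `ContentType` parameter names come from the SageMaker runtime API.

```r
# Assemble the arguments for sagemakerruntime's invoke_endpoint().
# build_invoke_args() is a hypothetical helper, not part of vetiver or paws.
build_invoke_args <- function(endpoint_name, body_json,
                              content_type = "application/json", ...) {
  stopifnot(is.character(body_json), length(body_json) == 1L)
  list(
    EndpointName = endpoint_name,
    # Send the JSON as a raw vector rather than a character string
    Body = charToRaw(body_json),
    ContentType = content_type,
    ...
  )
}

# Not called here: needs a deployed endpoint, AWS credentials, and paws.
invoke_json <- function(endpoint_name, body_json, ...) {
  sm_runtime <- paws.machine.learning::sagemakerruntime()
  invoke_args <- build_invoke_args(endpoint_name, body_json, ...)
  resp <- do.call(sm_runtime$invoke_endpoint, invoke_args)
  rawToChar(resp$Body)
}

# Inspect the arguments without calling AWS
invoke_args <- build_invoke_args("ames-pricing", '[{"Gr_Liv_Area": 1500}]')
str(invoke_args)
```

In practice `body_json` would come from `jsonlite::toJSON(new_data, na = "string")`, as in the existing predict method. Defaulting `content_type` in the signature keeps the serverless case working while still letting callers override it.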