Closed: kevin-yauris closed this issue 4 years ago.
@laurenyu @ajaykarpur Hi, could you kindly respond to this issue, please? My friend and I have spent several days trying to figure this out, but we can't find any reference or answer yet. Thanks in advance.
looking at this part of the documentation that you linked - https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html#creating-predictor-instances-for-different-models -
have you tried something like:
predictor = TensorFlowPredictor("test-tf-mme", model_name="model1")
predictor.predict(classification_input)
(replace variables/strings as appropriate - I tried to guess based on what you pasted above)
Hi @laurenyu, thank you for answering. I've tried what you suggested and still got an error:
633 error_code = parsed_response.get("Error", {}).get("Code")
634 error_class = self.exceptions.from_code(error_code)
--> 635 raise error_class(parsed_response, operation_name)
636 else:
637 return parsed_response
ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Request 186633a1-9a9f-4557-8e2b-038d9aa4bba4 is missing a target model header, which is required to invoke multi-model endpoint test-tf-mme.
By the way, do you know how to use the multi-model interfaces explained in this README section? After creating a multi-model endpoint, how do we use these interfaces?
looked a little deeper - @ajaykarpur please correct me if I'm wrong, but it looks like the TensorFlowModel.predict() method is missing some of the args supported by the generic Predictor.predict() method.
Based on this line of code, here's my guess at a workaround:
predictor.predict(classification_input, initial_args={"target_model": "model1"})
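If that keyword doesn't go through, a fallback would be to call the runtime API directly. This is a minimal, untested sketch, assuming the endpoint name from above and that classification_input is your serialized payload:

import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="test-tf-mme",
    ContentType="application/json",
    Body=classification_input,  # assumed to already be a JSON string
    TargetModel="model1.tar.gz",  # the archive name under your model data prefix
)
print(response["Body"].read())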
By the way, do you know how to use the multi-model interfaces explained in this README section? After creating a multi-model endpoint, how do we use these interfaces?
I seem to recall there being a way to get a direct URL to the endpoint, but I'm not finding the documentation at the moment. (Sorry, it's been a while since I've worked on this stuff...)
Hi @laurenyu, thank you for your suggestion. I have tried it, but it seems we need to use TargetModel instead of target_model as the dict key, so I tried predictor.predict(classification_input, initial_args={"TargetModel": "model1.tar.gz"}), but I still encounter an error similar to what I got before:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "<html>
<head>
<title>Internal Server Error</title>
</head>
<body>
<h1><p>Internal Server Error</p></h1>
</body>
</html>
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/test-tf-mme in account 682361690817 for more information.
CloudWatch log
2020-09-15 03:50:15.924080: W tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:267] No versions of servable ed8030f1549d2b8c0d9fc09a3d3cd31c found under base path /opt/ml/models/ed8030f1549d2b8c0d9fc09a3d3cd31c/model
I seem to recall there being a way to get a direct URL to the endpoint, but I'm not finding the documentation at the moment. (Sorry, it's been a while since I've worked on this stuff...)
We can get the URL to the endpoint from the Amazon SageMaker dashboard; it is something like this: https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/test-tf-mme/invocations. If I can get the URL, how do we use the interfaces with this URL?
There is a link in the dashboard below the URL, "Learn more about the API", but it only explains how to invoke the endpoint and do prediction; there is no documentation on how to load and unload models.
No versions of servable ed8030f1549d2b8c0d9fc09a3d3cd31c found under base path /opt/ml/models/ed8030f1549d2b8c0d9fc09a3d3cd31c/model
what does your model.tar.gz look like?
We can get the URL to the endpoint from the Amazon SageMaker dashboard; it is something like this: https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/test-tf-mme/invocations. If I can get the URL, how do we use the interfaces with this URL?
There is a link in the dashboard below the URL, "Learn more about the API", but it only explains how to invoke the endpoint and do prediction; there is no documentation on how to load and unload models.
/invocations is what's added from InvokeEndpoint, so my guess would be that https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/test-tf-mme would be your base URL. (I've never tried it, though.)
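For what it's worth, here is a minimal, untested sketch of signing a request to that URL directly with SigV4; the base URL and the X-Amzn-SageMaker-Target-Model header name are assumptions based on the guesses above:

import json

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

# Assumed base URL for the endpoint plus the /invocations path.
url = "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/test-tf-mme/invocations"
payload = json.dumps({"instances": [1.0, 2.0, 5.0]})

# Build and sign the request with SigV4 for the SageMaker runtime service.
request = AWSRequest(
    method="POST",
    url=url,
    data=payload,
    headers={
        "Content-Type": "application/json",
        "X-Amzn-SageMaker-Target-Model": "model1.tar.gz",  # assumed header name
    },
)
credentials = boto3.Session().get_credentials()
SigV4Auth(credentials, "sagemaker", "us-east-1").add_auth(request)

response = requests.post(url, data=payload, headers=dict(request.headers))
print(response.status_code, response.text)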
hi @laurenyu, thank you for your help!
We are now successfully using TFS with a multi-model endpoint. The problem was an invalid model structure.
Previously my model1.tar.gz looked like this:
model1.tar.gz
└── model1
    └── <version number>
        ├── saved_model.pb
        └── variables
            └── ...
after changing it to:
model1.tar.gz
└── <version number>
    ├── saved_model.pb
    └── variables
        └── ...
it is working now. I'll leave a snippet of our code here, in case someone else encounters this trouble too.
import json

import boto3

# image, model_data_location, role, model_name, endpoint_configuration_name,
# and endpoint_name are defined elsewhere in our notebook.
container = {
    'Image': image,
    'ModelDataUrl': model_data_location,
    'Mode': 'MultiModel'
}

sagemaker_client = boto3.client('sagemaker')

# Create Model
response = sagemaker_client.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role,
    Containers=[container])

# Create Endpoint Configuration
response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_configuration_name,
    ProductionVariants=[{
        'InstanceType': 'ml.t2.medium',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])

# Create Endpoint
response = sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_configuration_name)

# Invoke Endpoint
sagemaker_runtime_client = boto3.client('sagemaker-runtime')

content_type = "application/json"  # The MIME type of the input data in the request body.
accept = "application/json"        # The desired MIME type of the inference in the response.
payload = json.dumps({"instances": [1.0, 2.0, 5.0]})  # Payload for inference.
target_model = 'model1.tar.gz'

response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Accept=accept,
    Body=payload,
    TargetModel=target_model,
)
response
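Not part of our original snippet, but as a sketch of the load/unload question above: with a multi-model endpoint, a new model can be served without redeploying by uploading another archive under the same S3 prefix and invoking it by name. The bucket and key below are assumptions; model_data_location would be the prefix used in ModelDataUrl.

# Upload a second model archive (same internal layout as model1.tar.gz)
# under the prefix given by model_data_location, e.g. s3://my-bucket/models/.
s3_client = boto3.client('s3')
s3_client.upload_file('model2.tar.gz', 'my-bucket', 'models/model2.tar.gz')

# Invoking with the new name loads the model on first use; no new endpoint needed.
response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Accept=accept,
    Body=payload,
    TargetModel='model2.tar.gz',
)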
@kevin-yauris I am trying to build a multi-model endpoint using locally trained model artefacts (.pb and variable files). Could you tell me how to figure out the {version number} when creating the .tar file? Thanks in advance.
For the version number, I just use 1 or 2. I think it doesn't matter as long as the newest version has the highest version number.
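In case it helps, here is a minimal sketch (the local paths are assumptions) of packaging the archive so that the version directory sits at the top level, matching the working layout above:

import tarfile

# Assumes the SavedModel lives locally under model1/1/ (saved_model.pb + variables/).
with tarfile.open('model1.tar.gz', 'w:gz') as tar:
    # Use the version number as the archive root, not the model folder name.
    tar.add('model1/1', arcname='1')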
Hi, first I want to thank you for developing this. I would like to use a multi-model endpoint with the SageMaker TensorFlow Serving Container, but there are some things that confuse me. Any feedback or answer will be really appreciated.
What did you find confusing? Please describe.
It seems that there are two different definitions of a multi-model endpoint. One uses the Multi-Model Server library (let's call this the general multi-model endpoint), and the other is the one described in SageMaker TensorFlow: Deploying TensorFlow Serving - Deploying more than one model to your endpoint (let's call this one the TFS multi-model endpoint). I am confused because both of them use the same term, multi-model endpoint, but they seem to be different features, since they are used differently. Can the TFS multi-model endpoint use the methods or interfaces enabled in the general multi-model endpoint? For the general multi-model endpoint, there is documentation on how to add and remove models from a multi-model endpoint. Can a model built using TensorFlowModel be deployed into a general multi-model endpoint and use this? What I want to know is whether SageMaker TensorFlow Serving supports loading, unloading, and updating models without creating a new endpoint. In this README section there are some interfaces to do this, but I can't find a way to access them through the SDK or by making a request to the endpoint URL.
Describe how documentation can be improved
I think it would be great if the documentation explained the difference or connection between the TFS multi-model endpoint and the general multi-model endpoint. The other thing I would like to request is a tutorial (a Jupyter notebook would be great) on how to unload, load, and update a model in a TFS multi-model endpoint; the current README section explains that there are some interfaces but doesn't explain how to use them when the container is deployed on an endpoint.
Additional context
I have tried to deploy a TensorFlowModel as a MultiDataModel by following the Multi-Model Endpoint XGBoost Sample Notebook example, but there seems to be some error when using it to do prediction. The error is:
If the target model is not given, i.e. using predictor.predict(classification_input), this error shows up:
ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Request acd28d7b-c1b4-4ce1-9f06-c1fdefb58cee is missing a target model header, which is required to invoke multi-model endpoint test-tf-mme.
I also tried to invoke the endpoint, and this error is shown: