Open bevhanno opened 2 years ago
I solved the problem by adding the following code to the xgboost inference script "sagemaker_xgb_training.py":
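# Assumes xgb_encoders is already imported in the script, e.g.
# import sagemaker_xgboost_container.encoder as xgb_encoders

def rchop(s, suffix):
    # Remove one trailing occurrence of suffix, if present.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

def input_fn(request_body, request_content_type):
    if request_content_type == "text/csv; charset=utf-8":
        # The body may arrive as bytes; decode it before string handling.
        if isinstance(request_body, bytes):
            request_body = request_body.decode('utf-8')
        request_body = rchop(request_body, '\n')
        return xgb_encoders.csv_to_dmatrix(request_body)
    else:
        raise ValueError("Content type {} is not supported.".format(request_content_type))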
This adds the missing "text/csv; charset=utf-8" content type, decodes the request body, and removes a trailing "\n" before passing it to xgb_encoders.csv_to_dmatrix.
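In case it helps others: matching the exact string "text/csv; charset=utf-8" will break again if the header arrives as plain "text/csv" or with a different charset. Below is a minimal sketch of a more tolerant check, using only the standard library. The helper names split_content_type and prepare_csv_body are my own (hypothetical), and defaulting to utf-8 when no charset is given is an assumption.

```python
from email.message import Message


def split_content_type(header):
    """Return (media_type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header
    # get_content_type() lowercases the media type and drops parameters
    # such as charset; fall back to utf-8 when no charset is given (assumption).
    return msg.get_content_type(), msg.get_content_charset() or "utf-8"


def prepare_csv_body(request_body, request_content_type):
    """Hypothetical helper: validate the content type and clean the CSV body."""
    media_type, charset = split_content_type(request_content_type)
    if media_type != "text/csv":
        raise ValueError(
            "Content type {} is not supported.".format(request_content_type)
        )
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode(charset)
    # Strips every trailing "\n", not just one.
    return request_body.rstrip("\n")
```

Inside input_fn, the cleaned string could then be handed to the encoder as before, e.g. xgb_encoders.csv_to_dmatrix(prepare_csv_body(request_body, request_content_type)).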
I am experiencing the same issue, but with an sklearn preprocessor -> LGBM pipeline.
@Kyparos Did you resolve your issue? I am facing the same problem with an sklearn preprocessor -> SageMaker LGBM pipeline.
I believe the problem is that Pipeline Models honor the content types set for the pipeline's overall input and output, but not for the hand-off between containers.
That is, when you start a Batch Transform job, you can set the input content type to text/csv and the output content type to text/csv. However, this does not set the output content type of the first container to text/csv (you can verify this in the logs), so it reverts to application/json, which makes the 2nd container fail.
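One way to pin the first container's default output type is the SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT environment variable mentioned in this thread, set on the first container definition only. Here is a rough sketch of the container list in the shape that boto3's create_model accepts. The function name build_pipeline_containers and all image URIs and S3 paths are placeholders of mine, and script-mode containers usually need further environment variables that are omitted here.

```python
def build_pipeline_containers(sklearn_image, sklearn_model_data,
                              xgb_image, xgb_model_data, accept="text/csv"):
    """Hypothetical helper: container definitions for a two-step pipeline model."""
    first = {
        "Image": sklearn_image,
        "ModelDataUrl": sklearn_model_data,
        # Default accept type used by the first container when the
        # pipeline does not pass one along between containers.
        "Environment": {"SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": accept},
    }
    second = {"Image": xgb_image, "ModelDataUrl": xgb_model_data}
    return [first, second]


# Placeholder values; substitute your own image URIs and model artifacts.
containers = build_pipeline_containers(
    "<sklearn-image-uri>", "s3://<bucket>/sklearn/model.tar.gz",
    "<xgboost-image-uri>", "s3://<bucket>/xgboost/model.tar.gz",
)

# The list could then be passed to SageMaker, for example:
# import boto3
# boto3.client("sagemaker").create_model(
#     ModelName="<model-name>",
#     ExecutionRoleArn="<role-arn>",
#     Containers=containers,
# )
```

Note that, as reported in this thread, the second container may still receive "text/csv; charset=utf-8" rather than plain "text/csv", so its input_fn has to accept that form as well.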
Hi, I have created a SageMaker Pipeline Model using an sklearn model followed by an xgboost model. I followed the instructions here to set the 'SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT' environment variable, but I'm getting the error
ValueError: Content type text/csv; charset=utf-8 is not supported.
when running the batch transform job on the 2nd (xgboost) container, which follows the sklearn container. My pipeline code looks as follows:
As inference output code of container 1 (sklearn) I am using:
As inference input code of container 2 (xgb) I am using:
It seems that even though I am forcing the output content type of container 1 to "text/csv", what arrives in container 2 is an unknown "text/csv; charset=utf-8" format. Any ideas what I am doing wrong?
Thank you for your help!