Describe the bug
When running with instance_type="local", the DataProcessing configuration is not used, and all the input data is sent to the prediction container unfiltered.
In the _perform_batch_inference function in the entities.py file, the DataProcessing key from kwargs is never read, so each input_data item is sent as-is, without any filtering.
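For illustration, here is a minimal sketch of the filtering step I would expect to run on each record before it is sent to the container (apply_input_filter is a hypothetical name, not the SDK's actual code, and it only handles simple index filters like "$[4]"):

def apply_input_filter(record: str, input_filter: str) -> str:
    # Hypothetical helper: for a record produced by split_type="Line", a
    # filter like "$[4]" should select the element at index 4 of the CSV row.
    if input_filter.startswith("$[") and input_filter.endswith("]"):
        index = int(input_filter[2:-1])
        return record.split(",")[index]
    return record  # "$" (the default) passes the record through unchanged

# "a,b,c,d,e" filtered with "$[4]" should become "e"
assert apply_input_filter("a,b,c,d,e", "$[4]") == "e"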
To reproduce
from sagemaker.model import Model
from sagemaker.local import LocalSession
import boto3

model = Model(
    model_data='file://to/my/model_data',
    role='MY_ROLE',
    image_uri='IMAGE_URI',
    sagemaker_session=LocalSession(boto3.Session(region_name='my-region')),
)
transformer = model.transformer(
    instance_count=1,
    instance_type="local",
    strategy="MultiRecord",
    assemble_with="Line",
    output_path="file://my/output/path",
    accept="text/csv",
    max_concurrent_transforms=1,
)
transformer.transform(
    data="file://path/to/my/data/file",
    content_type="text/csv",
    split_type="Line",
    input_filter="$[4]",  # this currently seems not to work in local mode
    join_source="Input",
    output_filter="$[0]",
)
transformer.wait()
Expected behavior
The input CSV should be filtered using the input_filter value. Likewise, the output should be joined with the input record (join_source="Input") and filtered using the output_filter value.
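For a concrete example, here is my understanding of the documented per-record semantics that local mode should reproduce (the row and prediction values are made up for illustration):

row = "a,b,c,d,e"                 # one line of the input CSV
model_input = row.split(",")[4]   # input_filter="$[4]"  -> "e" is sent to the model
prediction = "0.7"                # whatever the container returns for "e"
joined = row + "," + prediction   # join_source="Input" appends the prediction
                                  # to the original record: "a,b,c,d,e,0.7"
output = joined.split(",")[0]     # output_filter="$[0]" -> "a" is written to the output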
System information
A description of your system. Please provide:
SageMaker Python SDK version: sagemaker==2.219.0
Framework name (eg. PyTorch) or algorithm (eg. KMeans):
Python version: 3.10.14
Custom Docker image (Y/N):