aunz opened 4 years ago
How did you resolve this, please? I am getting the same issue.
Same issue here =/
Same issue here. If model_fn provides the functionality of loading the model, do we need to load it for every batch?
Same issue here! Has anyone found a solution to this?
How was this issue solved? Same issue here too.
Has anyone found a solution? I'm facing the same issue: the function runs 4 times, seemingly once per available GPU.
Can you show your code? I would like to reproduce it.
I am using the prebuilt SageMaker SKLearn container (https://github.com/aws/sagemaker-scikit-learn-container), version 0.20.0. In the entry_point, I include a script which carries out the batch transform job.
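The script follows the standard SKLearn serving interface. A simplified version looks roughly like this (the model file name, content type, and data handling are placeholders rather than my exact code):

```python
import json
import os

import numpy as np
from sklearn.externals import joblib  # joblib bundled with sklearn 0.20.x


def model_fn(model_dir):
    # Load the model artifact; I expected this to run only once per worker,
    # but its log line appears repeatedly in CloudWatch.
    return joblib.load(os.path.join(model_dir, "model.joblib"))


def input_fn(request_body, request_content_type):
    # With SplitType=None and BatchStrategy=SingleRecord, the whole payload
    # arrives as a single record.
    if request_content_type == "application/json":
        return np.array(json.loads(request_body))
    raise ValueError("Unsupported content type: {}".format(request_content_type))


def predict_fn(input_data, model):
    return model.predict(input_data)


def output_fn(prediction, accept):
    return json.dumps(prediction.tolist())
```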
I noticed in the CloudWatch logs that model_fn() was called multiple times.
input_fn() was also called multiple times.
More precisely, it was called every 10 minutes.
I used an ml.m4.xlarge instance, BatchStrategy = SingleRecord, and SplitType = None. I also set the environment variable SAGEMAKER_MODEL_SERVER_TIMEOUT = '9999' to overcome the 60 s timeout. I expected model_fn and input_fn to be called only once, but in this case they were called multiple times. In the end, the container crashed with "Internal Server Error".
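The transform job itself is set up roughly like this (a simplified sketch: the role, model artifact, and S3 paths are placeholders, and exact parameter names depend on the SageMaker Python SDK version):

```python
from sagemaker.sklearn.model import SKLearnModel

# Placeholder role and paths; replace with real values.
model = SKLearnModel(
    model_data="s3://my-bucket/model/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    entry_point="batch_script.py",
    framework_version="0.20.0",
)

transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m4.xlarge",
    strategy="SingleRecord",
    env={"SAGEMAKER_MODEL_SERVER_TIMEOUT": "9999"},
)

transformer.transform(
    data="s3://my-bucket/input/",
    content_type="application/json",
    split_type=None,
)
transformer.wait()
```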
I saw a similar issue before (https://github.com/awslabs/amazon-sagemaker-examples/issues/341), where model_fn was called on each invocation. But in this case there are no /invocations requests; model_fn, input_fn, predict_fn, and output_fn were all called multiple times before the container eventually crashed with the Internal Server Error.