huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Amazon Sagemaker deployment issue for FLAN-T5 model family #20038

Closed BalazsFeherUK closed 2 years ago

BalazsFeherUK commented 2 years ago

System Info

transformers_version='4.17.0', pytorch_version='1.10.2', py_version='py38',

Who can help?

@ArthurZucker

Information

Tasks

Reproduction

Using the deployment script for Amazon SageMaker as described on the FLAN-T5 model cards (e.g. google/flan-t5-small):

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub model configuration
hub = {
    'HF_MODEL_ID': 'google/flan-t5-small',
    'HF_TASK': 'text2text-generation'
}

huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type='ml.m5.xlarge'   # ec2 instance type
)

predictor.predict({
    'inputs': "The answer to the universe is"
})

I receive the following error:

ModelError                                Traceback (most recent call last)
<ipython-input-10-eb84f66e23d1> in <module>
     25 
     26 predictor.predict({
---> 27         'inputs': "The answer to the universe is"
     28 })

/opt/conda/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
    159             data, initial_args, target_model, target_variant, inference_id
    160         )
--> 161         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    162         return self._handle_response(response)
    163 

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    510                 )
    511             # The "self" in this scope is referring to the BaseClient.
--> 512             return self._make_api_call(operation_name, kwargs)
    513 
    514         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    917             error_code = parsed_response.get("Error", {}).get("Code")
    918             error_class = self.exceptions.from_code(error_code)
--> 919             raise error_class(parsed_response, operation_name)
    920         else:
    921             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027T5LayerFF\u0027 object has no attribute \u0027config\u0027"
}

Expected behavior

The model should work when deployed from SageMaker Studio.

sgugger commented 2 years ago

cc @philschmid

philschmid commented 2 years ago

Hello @BalazsFeherUK,

It seems that FLAN-T5 (specifically the T5LayerFF module) is not supported in transformers==4.17.0, which is the version running inside the pre-built SageMaker container you are using. You would need to update the transformers version inside the container to use this model. This forum thread walks through how to do that: https://discuss.huggingface.co/t/deploying-open-ais-whisper-on-sagemaker/24761/9
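
For reference, one way to get a newer transformers version is simply to request a more recent Hugging Face Deep Learning Container via the `HuggingFaceModel` version arguments. The sketch below is illustrative and untested against this exact model; the version strings are an assumption, so check the list of available DLC images for a combination that actually exists in your region.

```python
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'google/flan-t5-small',
    'HF_TASK': 'text2text-generation'
}

# Assumed newer container versions that include FLAN-T5 support;
# verify against the published DLC image list before deploying.
huggingface_model = HuggingFaceModel(
    transformers_version='4.26.0',
    pytorch_version='1.13.1',
    py_version='py39',
    env=hub,
    role=role,
)
```

Alternatively, the forum thread's approach of shipping a `requirements.txt` that pins a newer `transformers` inside the `code/` directory of your `model.tar.gz` achieves the same effect without waiting for a new container release.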