Closed. wildgeece96 closed this issue 1 year ago.
Hi @wildgeece96.
The `np.array` is supposed to be the raw audio waveform at the correct sampling rate, right?
If so, then it seems the bug comes from somewhere around SageMaker, where the numpy array gets converted to a list.
I am tentatively against adding support for lists instead of numpy arrays:
We already have issues when dealing with lists, lists of lists, or lists of lists of lists (I am not kidding), because a list can mean you are sending several items to be inferred upon, OR the item can itself consist of a list of things (like numbers here), or even a list of lists of things (like multi-channel audio). `np.array` makes the distinction clearer, and avoids a big pitfall when the said lists are misaligned. A `np.array` is a regular tensor, so it comes with more guarantees.
In your particular example, someone is casting a `np.array` to a regular list, and that is costly and will unnecessarily add overhead to the inference.
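To illustrate the ambiguity described above, here is a minimal sketch (not from the thread): the same nested list could mean two mono clips or one two-channel clip, while a `np.array` is rectangular with an explicit shape and dtype, and misaligned lists fail loudly on conversion.

```python
import numpy as np

# A plain list of lists is ambiguous: two separate mono clips to infer
# on, or one two-channel clip? The structure alone cannot tell you.
payload = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# A np.array makes shape and dtype explicit and guarantees alignment.
audio = np.asarray(payload, dtype=np.float32)
print(audio.shape)  # (2, 3)

# Misaligned (ragged) lists are silently valid as lists, but are
# rejected when converted to a regular tensor with a fixed dtype.
ragged = [[0.1, 0.2], [0.3]]
try:
    np.asarray(ragged, dtype=np.float32)
except ValueError:
    print("ragged input rejected")
```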
That being said, there are probably workarounds:
Would using a `wav` file work for you?
I couldn't find better code quickly with my google-fu, but it should be doable to create a WAV-like buffer with minimal reallocations. Does SageMaker allow sending raw bytes? Would that approach work?
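A WAV-like buffer can be built in memory with just the standard library; the sketch below is one hedged possibility (the 16 kHz sampling rate, mono channel, and int16 conversion are assumptions, not anything prescribed in this thread):

```python
import io
import wave

import numpy as np


def float_waveform_to_wav_bytes(waveform: np.ndarray, sampling_rate: int = 16000) -> bytes:
    """Pack a mono float waveform in [-1, 1] into an in-memory WAV file."""
    # Convert float samples to 16-bit PCM, clipping to the valid range.
    pcm16 = (np.clip(waveform, -1.0, 1.0) * 32767).astype(np.int16)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit samples
        wav.setframerate(sampling_rate)
        wav.writeframes(pcm16.tobytes())
    return buf.getvalue()


# The resulting bytes start with the RIFF header and could be sent as a
# raw binary payload instead of a JSON-encoded list.
payload = float_waveform_to_wav_bytes(np.zeros(16000, dtype=np.float32))
print(payload[:4])  # b'RIFF'
```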
I confirmed that inference code like the following works:

```python
import numpy as np

from transformers import pipeline
from transformers.pipelines import AutomaticSpeechRecognitionPipeline


def model_fn(model_dir) -> AutomaticSpeechRecognitionPipeline:
    return pipeline(model="facebook/wav2vec2-base-960h")


def predict_fn(data, pipeline):
    inputs = data.pop("inputs", data)
    parameters = data.pop("parameters", None)
    if isinstance(inputs, list):
        # np.float has been removed from numpy; use an explicit dtype.
        inputs = np.array(inputs, dtype=np.float32)
    print("inputs are: ", inputs)
    # pass inputs with all kwargs in data
    if parameters is not None:
        prediction = pipeline(inputs, **parameters)
    else:
        prediction = pipeline(inputs)
    return prediction
```
Thanks @Narsil.
Actually, in my use case I deployed a wav2vec model on SageMaker, and when I send a request via the SageMaker SDK, a SageMaker serializer (like JSONSerializer or NumPySerializer) serializes the input before the request is sent to the endpoint. I have to use JSONSerializer with the SageMaker HuggingFace Inference Toolkit, and JSONSerializer cannot pass an ndarray as-is; it converts it to a list.
After reading your comment, I agree the conversion logic should be implemented in the SageMaker HuggingFace Inference Toolkit, because it is specific to the SageMaker use case.
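For reference, the root cause on the serializer side can be shown with a minimal sketch (the sample values are placeholders): `json` cannot encode an ndarray, so any JSON-based serializer must flatten it to a plain list, which is exactly what arrives at the endpoint and must be converted back there.

```python
import json

import numpy as np

waveform = np.array([0.0, 0.1, -0.1], dtype=np.float32)

# json cannot encode ndarrays directly...
try:
    json.dumps({"inputs": waveform})
except TypeError:
    print("ndarray is not JSON serializable")

# ...so a JSON serializer has to convert it to a plain list first, and
# the server side has to rebuild the array with an explicit dtype.
body = json.dumps({"inputs": waveform.tolist()})
restored = np.array(json.loads(body)["inputs"], dtype=np.float32)
assert np.allclose(restored, waveform)
```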
Hello @wildgeece96, the `automatic-speech-recognition` pipeline is supported. Instead of sending numpy data, you need to send the audio itself. Check out this example: https://github.com/huggingface/notebooks/blob/main/sagemaker/20_automatic_speech_recognition_inference/sagemaker-notebook.ipynb
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
- transformers
- torch
- sagemaker
Who can help?
@Narsil @patrickvonplaten
@anton-l
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Run the code below on SageMaker.
It returned an InternalServerError,
Expected behavior
When I use Transformers on SageMaker, I noticed that the Automatic Speech Recognition pipeline doesn't account for the requests it receives when deployed on SageMaker.
When we use the SageMaker HuggingFace Inference Toolkit, pipelines are used for inference.
AutomaticSpeechRecognitionPipeline doesn't accept a list as the `inputs` parameter of its `__call__` method, and via the API the request body is supposed to be like, but I cannot pass an ndarray via JSONSerializer; I can only pass a list.
To solve that problem, the pipeline should accept a list as `inputs` and
return like