nvidia-riva / tutorials

NVIDIA Riva runnable tutorials

Running asr-python-basics.ipynb code crashed Nvidia Riva container #85

Closed: grafiszti closed this issue 1 year ago

grafiszti commented 1 year ago

I deployed NVIDIA Riva on a remote machine following this quick start guide, using quickstart version nvidia/riva/riva_quickstart:2.6.0.

I was trying to run the asr-python-basics.ipynb notebook. The prediction worked, but the container crashed. Code to reproduce:

import io
import IPython.display as ipd
import grpc
import riva.client

auth = riva.client.Auth(uri='localhost:50051')
riva_asr = riva.client.ASRService(auth)

path = "./audio_samples/en-US_sample.wav"
with io.open(path, 'rb') as fh:
    content = fh.read()
ipd.Audio(path)

config = riva.client.RecognitionConfig()
config.language_code = "en-US"                    # language of the audio clip
config.max_alternatives = 1                       # number of top transcript alternatives to return
config.enable_automatic_punctuation = True        # add punctuation when end of VAD detected
config.audio_channel_count = 1                    # mono audio

response = riva_asr.offline_recognize(content, config)
asr_best_transcript = response.results[0].alternatives[0].transcript
print("ASR Transcript:", asr_best_transcript)
print("\n\nFull Response Message:")
print(response)

Error from the Riva container logs:

I1024 14:07:25.004743 91 grpc_server.cc:4544] Started GRPCInferenceService at 0.0.0.0:8001
I1024 14:07:25.004979 91 http_server.cc:3242] Started HTTPService at 0.0.0.0:8000
I1024 14:07:25.045824 91 http_server.cc:180] Started Metrics Service at 0.0.0.0:8002
  > Triton server is ready...
I1024 14:07:25.194975   403 riva_server.cc:120] Using Insecure Server Credentials
I1024 14:07:25.198563   403 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I1024 14:07:25.202152   403 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I1024 14:07:25.205432   403 model_registry.cc:110] Successfully registered: conformer-en-US-asr-offline for ASR
I1024 14:07:25.208616   403 model_registry.cc:110] Successfully registered: conformer-en-US-asr-streaming for ASR
I1024 14:07:25.272111   403 model_registry.cc:110] Successfully registered: riva-punctuation-en-US for NLP
I1024 14:07:25.277859   403 model_registry.cc:110] Successfully registered: riva_intent_weather for NLP
I1024 14:07:25.278462   403 model_registry.cc:110] Successfully registered: riva_ner for NLP
I1024 14:07:25.279049   403 model_registry.cc:110] Successfully registered: riva_qa for NLP
I1024 14:07:25.279526   403 model_registry.cc:110] Successfully registered: riva_text_classification_domain for NLP
I1024 14:07:25.603746   403 model_registry.cc:110] Successfully registered: riva-punctuation-en-US for NLP
I1024 14:07:25.609462   403 model_registry.cc:110] Successfully registered: riva_intent_weather for NLP
I1024 14:07:25.610060   403 model_registry.cc:110] Successfully registered: riva_ner for NLP
I1024 14:07:25.610651   403 model_registry.cc:110] Successfully registered: riva_qa for NLP
I1024 14:07:25.611116   403 model_registry.cc:110] Successfully registered: riva_text_classification_domain for NLP
I1024 14:07:25.628804   403 model_registry.cc:110] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS
I1024 14:07:25.644143   403 riva_server.cc:160] Riva Conversational AI Server listening on 0.0.0.0:50051
W1024 14:07:25.644161   403 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W1024 14:07:26.005081 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:26.005117 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:26.005121 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W1024 14:07:27.005261 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:27.005292 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:27.005296 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W1024 14:07:28.006394 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:28.006411 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:28.006415 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
I1024 14:19:04.273377   410 grpc_riva_asr.cc:484] ASRService.Recognize called.
I1024 14:19:04.273463   410 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 16000 bitspersample = 16
I1024 14:19:04.273468   410 grpc_riva_asr.cc:550] ASRService.Recognize performing streaming recognition with sequence id: 1093626779
I1024 14:19:04.273550   410 grpc_riva_asr.cc:580] Using model citrinet-1024-en-US-asr-offline for inference
I1024 14:19:04.273597   410 grpc_riva_asr.cc:595] Model sample rate= 16000 for inference
terminate called after throwing an instance of 'std::runtime_error'
  what():  punct_logits: failed to perform CUDA copy: invalid argument
Signal (6) received.
 0# 0x000056392435A7E9 in tritonserver
 1# 0x00007F011980A0C0 in /usr/lib/x86_64-linux-gnu/libc.so.6
 2# gsignal in /usr/lib/x86_64-linux-gnu/libc.so.6
 3# abort in /usr/lib/x86_64-linux-gnu/libc.so.6
 4# 0x00007F0119BC3911 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 5# 0x00007F0119BCF38C in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 6# 0x00007F0119BCF3F7 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 7# 0x00007F0119BCF37F in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 8# 0x00007F0085B76B1E in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
 9# 0x00007F0085B88A1C in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
10# 0x00007F0085B38CC2 in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
11# 0x00007F0085B38B34 in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
12# 0x00007F0085C34B1F in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so

E1024 14:19:10.933988  1386 client_object.cc:116] error: failed to do inference: Socket closed
I1024 14:19:10.934067  1386 grpc_riva_asr.cc:243] Could not get punctuated transcript from punctuator model for transcript "what is natural language processing", adding basic punctuation
I1024 14:19:10.935387   410 grpc_riva_asr.cc:664] ASRService.Recognize returning OK
/opt/riva/bin/start-riva: line 55:    91 Aborted                 (core dumped) ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
W1024 14:19:16.092974   403 riva_server.cc:184] Signal: 15
grafiszti commented 1 year ago

The problem seems to be with the punctuation model. When I turned it off in the config:

config.enable_automatic_punctuation = False       # disable automatic punctuation

The results are returned and the container doesn't crash.
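For reference, here is a minimal self-contained sketch of the workaround, using the same server address and sample audio as in the snippet above, with automatic punctuation disabled:

import io
import riva.client

# Connect to the Riva server (same address as in the repro above).
auth = riva.client.Auth(uri='localhost:50051')
riva_asr = riva.client.ASRService(auth)

# Read the raw WAV bytes of the tutorial sample.
with io.open("./audio_samples/en-US_sample.wav", 'rb') as fh:
    content = fh.read()

# Offline recognition request with the punctuation step skipped.
config = riva.client.RecognitionConfig()
config.language_code = "en-US"
config.max_alternatives = 1
config.enable_automatic_punctuation = False   # workaround: avoid the punctuation model
config.audio_channel_count = 1

response = riva_asr.offline_recognize(content, config)
print("ASR Transcript:", response.results[0].alternatives[0].transcript)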

virajkarandikar commented 1 year ago

@grafiszti Can you try deploying only the Citrinet and punctuation models?
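As a smaller check before redeploying, the request can also pin the offline Citrinet model explicitly. This is only a sketch and assumes the model field is exposed on RecognitionConfig in this client version; the model name is taken from the server log above:

import io
import riva.client

auth = riva.client.Auth(uri='localhost:50051')
riva_asr = riva.client.ASRService(auth)

with io.open("./audio_samples/en-US_sample.wav", 'rb') as fh:
    content = fh.read()

config = riva.client.RecognitionConfig()
config.language_code = "en-US"
config.max_alternatives = 1
config.enable_automatic_punctuation = True    # keep punctuation on to reproduce the crash path
config.audio_channel_count = 1
# Assumption: RecognitionConfig exposes a model field; name copied from the server log.
config.model = "citrinet-1024-en-US-asr-offline"

response = riva_asr.offline_recognize(content, config)
print(response.results[0].alternatives[0].transcript)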

virajkarandikar commented 1 year ago

Closing this issue due to lack of activity. Please re-open it if you would like to follow up.