Open Sshubam opened 2 months ago
I believe I encountered such an issue before, which I think I fixed with https://github.com/nvidia-riva/riva-asrlib-decoder/commit/a3b5bbdf4a5246962fe463c6453b948e57cb2470. If you're not on the latest version on PyPI, please consider updating. Generally speaking, though, the code has been pretty stable for the industry customers using it, and I haven't gotten any bug reports in a long time.
It looks like you are using NeMo here. Do you feel comfortable sharing your dataset, what model you are using, and any code needed to reproduce this? Since these workloads are data-dependent, it can be tricky to diagnose the exact problem without the data, model, and config to reproduce the issue.
Hey @galv, thanks for the reply. I am not on the latest version, so I think updating might be the solution. However, I am not familiar with CUDA, so can you help me figure out which function from the latest library should replace this v0.2.0 code:
```python
BatchedMappedDecoderCuda(config, TLG_file, words_file, 129).decode(
    logits.to(torch.float32).to("cuda"),
    logits_len.to(torch.int64).to("cpu"),
)
```
I see 3 functions in version 0.4.4: `decode_mbr`, `decode_nbest`, and `decode_write_lattice`.
Is it `decode_nbest`? That one gives me the same error:
```
ERROR ([5.5]:CopyLaneCountersToHostSync():cudadecoder/cuda-decoder.cc:602) cudaError_t 700 : "an illegal memory access was encountered" returned from 'cudaStreamSynchronize(compute_st_)'
```
```python
BatchedMappedDecoderCuda(config, TLG_file, words_file, 129).decode_nbest(
    logits.to(torch.float32).to("cuda"),
    logits_len.to(torch.int64).to("cpu"),
)
```
Please let me know if you can point me to documentation for this.
Also, are you aware of any other cases where this error could occur?
@galv Hey, I found the cause of this issue: it was the number-of-blank-tokens parameter passed to the `BatchedMappedDecoderCuda()` constructor. Can you please let me know which function from the 0.4.4 library I should use, since I don't understand CUDA? Thanks a lot.
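For reference, a sketch of how that token-count argument could be derived from the model instead of hard-coded. This assumes, as NeMo CTC models conventionally do, that the blank token is appended after the vocabulary; `vocabulary` below is a stand-in for `asr_model.decoder.vocabulary`, so verify the convention for your own model:

```python
# Stand-in vocabulary; with a real NeMo CTC model this would be
# asr_model.decoder.vocabulary (128 non-blank tokens in this example).
vocabulary = [f"tok{i}" for i in range(128)]

# NeMo CTC models conventionally put the blank at the last index, so the
# total token count is vocab size + 1 (an assumption to verify per model).
num_tokens_including_blank = len(vocabulary) + 1
print(num_tokens_including_blank)  # 129
```

Passing a value that disagrees with the actual logits vocabulary dimension is exactly the kind of mismatch that can surface as an illegal memory access inside the decoder.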
If you were using `decode()` before, you want `decode_mbr()` in 0.4.4 instead of "decode": https://github.com/nvidia-riva/riva-asrlib-decoder/blob/39b6a2bd6c8f19c1f390e1e12da1ffeeb2585ba5/src/riva/asrlib/decoder/python_decoder.cc#L281
They should have the same interface.
@galv I'm using the new `decode_mbr` function, but it is still stuck in this never-ending loop:
Determinization is guaranteed to terminate on acyclic graphs, which our output graphs should always be. If you output a Kaldi archive of lattices instead, you could verify that your lattice is acyclic: https://github.com/nvidia-riva/riva-asrlib-decoder/blob/c94c84efb3efb526ce87fa9728a3dd3e621bb484/src/riva/asrlib/decoder/python_decoder.cc#L289
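If you do dump lattices to an archive, acyclicity can be checked with a standard depth-first search over the arcs. A minimal sketch on a toy arc list (parsing the Kaldi text archive into `(src, dst)` state pairs is left out):

```python
def is_acyclic(arcs, num_states):
    """Return True if the directed graph given by (src, dst) arcs has no cycle."""
    graph = [[] for _ in range(num_states)]
    for src, dst in arcs:
        graph[src].append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current DFS path / finished
    color = [WHITE] * num_states

    def dfs(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == GRAY:  # back edge to the current path -> cycle
                return False
            if color[v] == WHITE and not dfs(v):
                return False
        color[u] = BLACK
        return True

    return all(dfs(u) for u in range(num_states) if color[u] == WHITE)

print(is_acyclic([(0, 1), (1, 2)], 3))          # linear lattice -> True
print(is_acyclic([(0, 1), (1, 2), (2, 0)], 3))  # back arc -> False
```

A lattice that fails this check would explain determinization never terminating.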
You can try setting `determinize_lattice=False` to work around it for now.
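For completeness, a config fragment showing where such a flag might be set. The exact attribute path is an assumption on my part; inspect the fields of the config object you pass to `BatchedMappedDecoderCuda` for the real location of `determinize_lattice`:

```python
# Hypothetical attribute path -- verify against your actual config object.
config.determinize_lattice = False  # skip lattice determinization as a workaround
```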
However, upon further reflection, I think I am familiar with this issue. I believe that CTC models are not suited to phone based determinization because phone determinization depends upon phones having "word boundary information", but we don't have that for CTC models, since they don't use triphones. Are you sure you have updated to a recent version of the library? I had a commit related to that here: https://github.com/nvidia-riva/riva-asrlib-decoder/commit/cdf9cdc4552e65d6d4ed72ba777ef7231e9512bc
```
[NeMo I 2024-07-17 11:37:08 features:289] PADDING: 0
[NeMo I 2024-07-17 11:37:09 save_restore_connector:249] Model EncDecCTCModelBPE was successfully restored from conformer-or-ctc.nemo.
[NeMo I 2024-07-17 11:37:10 collections:196] Dataset loaded with 309 files totalling 8583.33 hours
[NeMo I 2024-07-17 11:37:10 collections:197] 0 files were filtered totalling 0.00 hours
ERROR ([5.5]:CopyLaneCountersToHostSync():cudadecoder/cuda-decoder.cc:596) cudaError_t 700 : "an illegal memory access was encountered" returned from 'cudaStreamSynchronize(compute_st_)'
Aborted (core dumped)
```
While decoding, I am getting this error.
CUDA info:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
```
Has anyone faced this issue before and can help?
@galv