Closed kvishnivetsky closed 3 years ago
I want to use the GPU to decode my chain model, but without success. Are there any example scripts that could help me, or any suggestions? Thank you very much!
I think you are not being specific enough. You don't say what "not success" means specifically.
I think the default nnet3 decode.sh can do the forward pass/inference on the GPU, but the speedup is not great. For a larger speedup you would have to use the NVIDIA people's contribution, but I cannot recall whether there are example scripts.
Yes, you are right. I used the NVIDIA people's contribution, but it has no example scripts, so I tried to write my own scripts using the tools from that contribution. That did not succeed, so I need some example scripts for it. Thank you for your reply.
Again, you are not saying what the error or behavior behind "not succeeding" was. y.
When I include "cudadecoder/xxx.h" I get errors of the form "some variable not declared". For example, when I do:
#include "cudadecoder/cuda-decoder.h"
error: ‘CuDevice’ has not been declared
note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
If you want to compile against this stuff, it's better if you add a program inside the Kaldi source tree and compile as if it were a Kaldi program. That way you get all the compilation options and flags and #defines.
Also, you are probably describing two issues: compiling one thing and running another. Compilation should not end in a "cudaMalloc" failure. y.
Hi @jtrmal, cudaMalloc was MY issue. I do not know why @dgxlsir is writing here; he really has a different issue.
Ah, two different people. Sorry my bad, I didn't check. Y.
Hi, guys.
Is there any progress on my issue with batched-wav-nnet3-cuda2 sometimes being unable to allocate CUDA memory? (the initial issue topic)
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
ping
I merged your PR that you said fixed a segfault. It's normal for these kinds of programs to require some configuration adjustments to run on different hardware and with (e.g.) different graphs. At this point there's not enough detail to say that it's a bug or something that requires fixing.
Thanks for merging, I remember that. But the PR only fixes the "uncontrolled" behaviour in memory operations and does NOT fix the "real cause" of this behaviour. What kind of information do you need to find the "real cause" of this "cudaMalloc() error message: an illegal memory access was encountered" issue?
It says "an illegal memory access was encountered"? That is not normal. I would probably try running it under cuda-gdb or cuda-memcheck to see whether it finds where there is (e.g.) an out-of-bounds access.
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
I am closing this issue because of inactivity. @kvishnivetsky, if you can repro and provide cuda-memcheck data, please ping me, and I'll reopen it. @-mention me for a faster response!
Kaldi version: 5-5.636
CUDA support: yes
Driver Version: 440.95.01
CUDA Version: 10.2
OS: CentOS 7 x64
Virtualization: openVZ
NVIDIA Hardware: 2 x NVIDIA Tesla T4
It all started with unpredictable segmentation faults. After applying the patch https://github.com/kaldi-asr/kaldi/pull/4305 we found an error in the HostDeviceVector::Reallocate method at batched-threaded-nnet3-cuda-pipeline2.h:159.
cudaMalloc() error message: an illegal memory access was encountered