Closed: yutkin closed this issue 3 years ago
You should probably just use multiple processes if you want parallel executions. I don't think it's intended for this to be thread safe.
If needed, thread safety can be achieved by protecting critical sections. Actually, my example handles only one request, so there are no concurrent requests. It fails only when I run "forward" in a separate thread.
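For illustration, protecting the forward pass as a critical section could look roughly like the sketch below. `FakeNetwork`, `GuardedNetwork` and `runConcurrentRequests` are hypothetical names invented for this example and are not part of wav2letter or flashlight. Note that a lock only serializes callers; it would not by itself explain or fix a freeze that happens with a single request running on a non-main thread.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real network; NOT the wav2letter API.
struct FakeNetwork {
  int calls = 0;  // unsynchronized state that would race without a lock

  std::vector<float> forward(const std::vector<float>& input) {
    ++calls;
    std::vector<float> out(input);
    for (auto& v : out) v *= 2.0f;  // placeholder computation
    return out;
  }
};

// Wraps the network so that only one thread runs forward() at a time.
class GuardedNetwork {
 public:
  std::vector<float> forward(const std::vector<float>& input) {
    std::lock_guard<std::mutex> lock(mutex_);  // critical section starts here
    return net_.forward(input);
  }

  int calls() {
    std::lock_guard<std::mutex> lock(mutex_);
    return net_.calls;
  }

 private:
  std::mutex mutex_;
  FakeNetwork net_;
};

// Simulates several request-handler threads sharing one guarded network
// and returns the total number of forward passes that were executed.
int runConcurrentRequests(GuardedNetwork& net, int numThreads,
                          int callsPerThread) {
  std::vector<std::thread> threads;
  for (int t = 0; t < numThreads; ++t) {
    threads.emplace_back([&net, callsPerThread] {
      for (int i = 0; i < callsPerThread; ++i) {
        net.forward({1.0f, 2.0f});
      }
    });
  }
  for (auto& th : threads) th.join();
  return net.calls();
}
```

In a gRPC handler, each request thread would call `GuardedNetwork::forward` instead of touching the network directly, so requests queue up on the mutex rather than running the forward pass in parallel.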
FWIW I run this stack against MKL instead of CUDA just fine with resources created and consumed in different threads (but only one thread using resources at a time)
You should attach a debugger and print the thread stacks to see where it's frozen.
Also, try this for your forward pass instead of your W2lDataset code: https://github.com/talonvoice/wav2letter/blob/decoder/w2l_forward.cpp#L64
The W2lDataset code does a lot of gnarly stuff internally, for example here's a threadpool: https://github.com/facebookresearch/wav2letter/blob/master/src/data/W2lDataset.cpp#L30
It is very difficult to embed process-based parallelism in our current app, therefore I need to find a solution for using the network from different threads.
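One thread-based option that avoids multiple processes is to keep every forward pass on a single dedicated thread and have request threads hand work to it through a queue. The sketch below is only an illustration of that pattern using the C++ standard library; `InferenceWorker`, `Forward` and `submit` are hypothetical names, and whether this avoids the freeze depends on what actually causes it.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Runs every forward pass on one dedicated thread. Request threads only
// enqueue work and wait on a future. All names here are hypothetical.
class InferenceWorker {
 public:
  using Forward =
      std::function<std::vector<float>(const std::vector<float>&)>;

  explicit InferenceWorker(Forward forward)
      : forward_(std::move(forward)), thread_([this] { run(); }) {}

  ~InferenceWorker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_one();
    thread_.join();  // pending tasks are drained before the thread exits
  }

  // Called from any request thread; the result arrives via the future.
  std::future<std::vector<float>> submit(std::vector<float> input) {
    auto task = std::make_shared<std::packaged_task<std::vector<float>()>>(
        [this, in = std::move(input)] { return forward_(in); });
    std::future<std::vector<float>> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result;
  }

 private:
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stop requested and queue is empty
        job = std::move(tasks_.front());
        tasks_.pop();
      }
      job();  // executed outside the lock, always on the worker thread
    }
  }

  Forward forward_;
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool stop_ = false;
  std::thread thread_;  // declared last so it starts after the other members
};
```

A request handler would call `worker.submit(features).get()` and block until the worker thread has produced the output. If the library keeps thread-local state, the network may also need to be constructed inside `run()` so that creation and use happen on the same thread; that is an assumption, not something confirmed in this thread.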
Closing due to inactivity and huge improvements in our codebase since that time.
I'm trying to embed the decoder in a gRPC service and use it from concurrent requests. Everything works fine in the sequential version (i.e. only one request at a time). But when I try to run each decoding request in a separate thread, the network freezes on the forward pass.
gRPC server
server.h
server.cc
Decoder
model.h
model.cc
Example output:
At this stage the process freezes. If I run the single-threaded version, the array prints normally and the forward and decoding passes succeed.
Why does execution freeze in the multi-threaded version?