Closed YimengZhu closed 4 years ago
import torch
from warprnnt_pytorch import RNNTLoss

rnnt_loss = RNNTLoss()
acts = torch.FloatTensor([[[[0.1, 0.6, 0.1, 0.1, 0.1],
                            [0.1, 0.1, 0.6, 0.1, 0.1],
                            [0.1, 0.1, 0.2, 0.8, 0.1]],
                           [[0.1, 0.6, 0.1, 0.1, 0.1],
                            [0.1, 0.1, 0.2, 0.1, 0.1],
                            [0.7, 0.1, 0.2, 0.1, 0.1]]]])
labels = torch.IntTensor([[1, 2]])
act_length = torch.IntTensor([2])
label_length = torch.IntTensor([2])
loss = rnnt_loss(acts, labels, act_length, label_length)
print(loss)  # tensor([4.4957])
It works well in my environment: g++ (GCC) 5.4.0, Python 3.7.4, torch 1.3.0 (from conda), CUDA 10.0.130.
Please try installing PyTorch from Anaconda and compiling this library with gcc >= 4.9.
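Since the crash happens inside native code, it can help to rule out malformed inputs before calling the loss. The sketch below is a minimal, hypothetical helper (check_rnnt_shapes is not part of warprnnt_pytorch). It assumes the layout that the example above appears to follow: acts shaped (B, T, U+1, V), labels shaped (B, U), and one length per batch item. It works on plain shape tuples and lists, so it runs without torch.

```python
def check_rnnt_shapes(acts_shape, labels_shape, act_lengths, label_lengths):
    """Return a list of problems; an empty list means the shapes agree.

    Assumed layout (read off the example, not confirmed by the library):
    acts is (B, T, U+1, V) and labels is (B, U).
    """
    if len(acts_shape) != 4:
        return ["acts should have 4 dimensions (B, T, U+1, V)"]
    if len(labels_shape) != 2:
        return ["labels should have 2 dimensions (B, U)"]

    problems = []
    batch, max_t, max_u_plus_1, _vocab = acts_shape
    label_batch, max_u = labels_shape

    if label_batch != batch:
        problems.append("labels batch size differs from acts batch size")
    if max_u + 1 != max_u_plus_1:
        problems.append("acts dimension 2 should equal labels dimension 1 plus one")
    if len(act_lengths) != batch or len(label_lengths) != batch:
        problems.append("need exactly one length per batch item")
    if any(t < 1 or t > max_t for t in act_lengths):
        problems.append("an activation length falls outside 1..T")
    if any(u < 0 or u > max_u for u in label_lengths):
        problems.append("a label length falls outside 0..U")
    return problems


# Shapes taken from the example above: acts (1, 2, 3, 5), labels (1, 2).
print(check_rnnt_shapes((1, 2, 3, 5), (1, 2), [2], [2]))
```

With real tensors, the arguments would be tuple(acts.shape), tuple(labels.shape), act_length.tolist() and label_length.tolist().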
Thanks very much for the quick reply.
I rebuilt it with gcc 5.2 in conda. Unfortunately, this time I can't even run the binary tests.
(56) yzhu@i13hpc56:~/warp-transducer/build$ ./test_cpu
./test_cpu: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./test_cpu)
./test_cpu: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by ./test_cpu)
./test_cpu: /usr/lib/x86_64-linux-gnu/libgomp.so.1: version `GOMP_4.0' not found (required by ./libwarprnnt.so)
Is this possibly related to the gcc 5.2 version? Could you please share which conda channel you installed gcc 5.4 from?
Thanks again!
It seems that your system library path /usr/lib/x86_64-linux-gnu is related to gcc 4.8.
Please use the lib path in conda: export LD_LIBRARY_PATH=/path_to_your_conda_env/lib.
Sorry, after trying for 3 days without any progress I have to bother you again. I don't mean to trouble you this much, and this might even be a very silly question.
I tried export LD_LIBRARY_PATH=/path_to_your_conda_env/lib; however, it still gives me the following error:
./test_cpu: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./test_cpu)
./test_cpu: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by ./test_cpu)
This time libgomp.so.1 seems to be linked to the correct library in the conda lib directory, but libstdc++.so.6 is not.
I checked the libstdc++.so.6 in the conda lib directory with strings libstdc++.so.6 | grep -E 'GLIBCXX_|CXXABI_' and found that the required versions are there.
The following is my LD_LIBRARY_PATH:
(56) yzhu@i13hpc56:~/warp-transducer/build$ echo $LD_LIBRARY_PATH
.:/usr/local/cuda-9.2/lib64:/usr/local/cuda-9.2/extras/CUPTI/lib64:/home/yzhu/anaconda3/envs/56/lib/
Any suggestions?
Thanks very much.
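The manual check described above (which copy of libstdc++.so.6 on LD_LIBRARY_PATH carries the required version tags) can be sketched in Python. This is an illustrative helper, not something from the thread: scan_library_path is a hypothetical name, and searching the raw file bytes for a null-terminated tag only approximates what strings piped into grep reports. It lists the candidates in search-path order but does not reproduce the dynamic loader's full resolution rules.

```python
import os


def scan_library_path(search_path, lib_name, required_tags):
    """For each directory on search_path that holds lib_name, list the
    required version tags that are missing from that file.

    Returns [(candidate_path, [missing_tag, ...]), ...] in search order.
    """
    report = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        candidate = os.path.join(directory, lib_name)
        if not os.path.isfile(candidate):  # follows symlinks
            continue
        with open(candidate, "rb") as handle:
            data = handle.read()
        # Symbol version names sit in the ELF string table, null-terminated.
        missing = [tag for tag in required_tags
                   if tag.encode() + b"\x00" not in data]
        report.append((candidate, missing))
    return report


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    tags = ["GLIBCXX_3.4.21", "CXXABI_1.3.8"]
    for candidate, missing in scan_library_path(path, "libstdc++.so.6", tags):
        print(candidate, "missing:", ", ".join(missing) if missing else "none")
```

To see which file the loader actually picks for the binary, ldd ./test_cpu remains the more direct check.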
There is a similar problem here.
With the conda lib path in LD_LIBRARY_PATH, I can always run the program successfully.
Thanks for the reply.
Update: I successfully compiled it with gcc 5.2 and passed the tests. However, I still face a segmentation fault...
Try using a smaller batch_size when training the model; I ran into the same problem using TensorFlow.
Thanks very much for the hint.
I've since switched to another library, but I'm still interested in how you found that a smaller batch size solves it. Could you please share how you debugged this, especially how you debug a C++ extension library through its Python binding? Did you use tools such as gdb?
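On the debugging question, one standard-library aid is Python's faulthandler module, which prints the Python traceback when the process receives SIGSEGV. Nothing in this thread confirms that anyone used it here, so treat the snippet as a minimal sketch. The helper name enable_crash_traceback is made up for illustration.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=sys.stderr):
    """Dump the Python traceback of all threads to `stream` on a fatal
    signal such as SIGSEGV. Returns True once the handler is active."""
    if not faulthandler.is_enabled():
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


enable_crash_traceback()
# Code that calls into the native extension goes after this point, e.g.
# loss = rnnt_loss(acts, labels, act_length, label_length)
```

The same handler can be switched on without editing the script by running python -X faulthandler followed by the script name. It only shows the Python line that was executing, so for the native frames the script would still need to be run under gdb (gdb --args python followed by the script name, then run, and bt after the crash).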
The recent pull request #64 may fix this issue.
Hi,
I'm facing a segmentation fault as in this issue, but in the PyTorch binding. In my case, all the built binary tests pass; however, using the library in PyTorch gives me a segfault.
Is there any way I can fix it?
Thanks!