RuntimeError: cuda memcpy or memset failed

jinserk commented 6 years ago

Hi, thanks for sharing this project. I installed it successfully with Python 3.6.5 and PyTorch 0.5.0+7ca8e2f under CUDA 9.2, but I now hit the following runtime error:

RuntimeError: cuda memcpy or memset failed (ctc at src/_warpctc.cpp:106)
frame #0: <unknown function> + 0x13a8a (0x7fee43526a8a in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x11a61 (0x7fee43524a61 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallDict + 0x22d (0x7fef04d008ad in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #3: <unknown function> + 0x164afa (0x7fef04d99afa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #4: _PyEval_EvalFrameDefault + 0x4186 (0x7fef04d9e4a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #5: <unknown function> + 0x1646fe (0x7fef04d996fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #6: PyEval_EvalCodeEx + 0x6d (0x7fef04d99d2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #7: <unknown function> + 0xa4c96 (0x7fef04cd9c96 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #8: PyObject_Call + 0x6a (0x7fef04ca78aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #9: THPFunction_apply(_object*, _object*) + 0x3e8 (0x7fee9167ea88 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: _PyCFunction_FastCallDict + 0x18e (0x7fef04d0080e in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #11: <unknown function> + 0x164afa (0x7fef04d99afa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #12: _PyEval_EvalFrameDefault + 0x4186 (0x7fef04d9e4a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #13: <unknown function> + 0x163d90 (0x7fef04d98d90 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #14: _PyFunction_FastCallDict + 0x2c6 (0x7fef04da24a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #15: _PyObject_FastCallDict + 0x17e (0x7fef04ca7afe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #16: _PyObject_Call_Prepend + 0xce (0x7fef04ca7bee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #17: PyObject_Call + 0x6a (0x7fef04ca78aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #18: _PyEval_EvalFrameDefault + 0x3adf (0x7fef04d9ddff in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #19: <unknown function> + 0x1646fe (0x7fef04d996fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #20: _PyFunction_FastCallDict + 0x165 (0x7fef04da2345 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #21: _PyObject_FastCallDict + 0x17e (0x7fef04ca7afe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #22: _PyObject_Call_Prepend + 0xce (0x7fef04ca7bee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #23: PyObject_Call + 0x6a (0x7fef04ca78aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #24: <unknown function> + 0xe8701 (0x7fef04d1d701 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #25: _PyObject_FastCallDict + 0x8b (0x7fef04ca7a0b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #26: <unknown function> + 0x164868 (0x7fef04d99868 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #27: _PyEval_EvalFrameDefault + 0x4186 (0x7fef04d9e4a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #28: <unknown function> + 0x163d90 (0x7fef04d98d90 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #29: <unknown function> + 0x164cb4 (0x7fef04d99cb4 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #30: _PyEval_EvalFrameDefault + 0x4186 (0x7fef04d9e4a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #31: <unknown function> + 0x1646fe (0x7fef04d996fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #32: <unknown function> + 0x164a12 (0x7fef04d99a12 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #33: _PyEval_EvalFrameDefault + 0x4186 (0x7fef04d9e4a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #34: <unknown function> + 0x1646fe (0x7fef04d996fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #35: PyEval_EvalCodeEx + 0x6d (0x7fef04d99d2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #36: PyEval_EvalCode + 0x3b (0x7fef04d99d7b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #37: PyRun_FileExFlags + 0xb2 (0x7fef04dd5782 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #38: PyRun_SimpleFileExFlags + 0xe7 (0x7fef04dd58e7 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #39: Py_Main + 0xe9d (0x7fef04df1f1d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #40: main + 0x16c (0x400b3c in python)
frame #41: __libc_start_main + 0xf5 (0x7fef03f65445 in /lib64/libc.so.6)
frame #42: python() [0x400bfa]

Could I get some help with this error?

t-vi commented 6 years ago

If we are to look into it, we would need a minimal reproducing example. Can you give the exact tensors you passed to the loss function?

Best regards

Thomas

jinserk commented 6 years ago

Thanks for the reply, Thomas! I don't think it originates from the input tensors, since it occurs randomly in an asynchronous CPU-to-GPU copy setting. I found that another exception happened before the error above:

THCudaCheck FAIL file=/home/jbaik/setup/pytorch/pytorch/aten/src/THC/generic/THCTensorCopy.cpp line=70 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/model.py", line 78, in train_epoch
    loss = self.loss(ys_hat, frame_lens, ys, label_lens)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/nn/modules/module.py", line 468, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 102, in forward
    torch.is_grad_enabled())
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 22, in forward
    want_gradient)
RuntimeError: cuda memcpy or memset failed (ctc at src/_warpctc.cpp:106)
frame #0: <unknown function> + 0x13a8a (0x7f39a7b8ba8a in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x11a61 (0x7f39a7b89a61 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallDict + 0x22d (0x7f3a5c6b38ad in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #3: <unknown function> + 0x164afa (0x7f3a5c74cafa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #4: _PyEval_EvalFrameDefault + 0x4186 (0x7f3a5c7514a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #5: <unknown function> + 0x1646fe (0x7f3a5c74c6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #6: PyEval_EvalCodeEx + 0x6d (0x7f3a5c74cd2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #7: <unknown function> + 0xa4c96 (0x7f3a5c68cc96 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #8: PyObject_Call + 0x6a (0x7f3a5c65a8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #9: THPFunction_apply(_object*, _object*) + 0x3e8 (0x7f39e903fcf8 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: _PyCFunction_FastCallDict + 0x18e (0x7f3a5c6b380e in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #11: <unknown function> + 0x164afa (0x7f3a5c74cafa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #12: _PyEval_EvalFrameDefault + 0x4186 (0x7f3a5c7514a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #13: <unknown function> + 0x163d90 (0x7f3a5c74bd90 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #14: _PyFunction_FastCallDict + 0x2c6 (0x7f3a5c7554a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #15: _PyObject_FastCallDict + 0x17e (0x7f3a5c65aafe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #16: _PyObject_Call_Prepend + 0xce (0x7f3a5c65abee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #17: PyObject_Call + 0x6a (0x7f3a5c65a8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #18: _PyEval_EvalFrameDefault + 0x3adf (0x7f3a5c750dff in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #19: <unknown function> + 0x1646fe (0x7f3a5c74c6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #20: _PyFunction_FastCallDict + 0x165 (0x7f3a5c755345 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #21: _PyObject_FastCallDict + 0x17e (0x7f3a5c65aafe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #22: _PyObject_Call_Prepend + 0xce (0x7f3a5c65abee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #23: PyObject_Call + 0x6a (0x7f3a5c65a8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #24: <unknown function> + 0xe8701 (0x7f3a5c6d0701 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #25: _PyObject_FastCallDict + 0x8b (0x7f3a5c65aa0b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #26: <unknown function> + 0x164868 (0x7f3a5c74c868 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #27: _PyEval_EvalFrameDefault + 0x4186 (0x7f3a5c7514a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #28: <unknown function> + 0x163d90 (0x7f3a5c74bd90 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #29: <unknown function> + 0x164cb4 (0x7f3a5c74ccb4 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #30: _PyEval_EvalFrameDefault + 0x4186 (0x7f3a5c7514a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #31: <unknown function> + 0x1646fe (0x7f3a5c74c6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #32: <unknown function> + 0x164a12 (0x7f3a5c74ca12 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #33: _PyEval_EvalFrameDefault + 0x4186 (0x7f3a5c7514a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #34: <unknown function> + 0x1646fe (0x7f3a5c74c6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #35: PyEval_EvalCodeEx + 0x6d (0x7f3a5c74cd2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #36: PyEval_EvalCode + 0x3b (0x7f3a5c74cd7b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #37: PyRun_FileExFlags + 0xb2 (0x7f3a5c788782 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #38: PyRun_SimpleFileExFlags + 0xe7 (0x7f3a5c7888e7 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #39: Py_Main + 0xe9d (0x7f3a5c7a4f1d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #40: main + 0x16c (0x400b3c in python)
frame #41: __libc_start_main + 0xf5 (0x7f3a5b918445 in /lib64/libc.so.6)
frame #42: python() [0x400bfa]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 24, in <module>
    densenet_ctc.train(argv)
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/train.py", line 84, in train
    model.train_epoch(data_loaders["train"])
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/model.py", line 83, in train_epoch
    print(ys_hat, frame_lens, ys, label_lens)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/tensor.py", line 57, in __repr__
    return torch._tensor_str._str(self)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_tensor_str.py", line 253, in _str
    fmt, scale, sz = _number_format(get_summarized_data(self) if summarize else self)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_tensor_str.py", line 83, in _number_format
    tensor = torch.DoubleTensor(tensor.size()).copy_(tensor).abs_().view(tensor.nelement())
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /home/jbaik/setup/pytorch/pytorch/aten/src/THC/generic/THCTensorCopy.cpp:70

I printed the input tensors to catch the moment the error occurs, but it looks like printing the tensors gives the runtime enough of a time margin that the error does not happen. Of course, I'm not sure.
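Because CUDA kernels and copies are queued asynchronously, an illegal memory access is often reported by a later, unrelated call, which may be why the print in the exception handler fails as well. A minimal sketch for narrowing this down, assuming a hypothetical helper `run_synchronized` (not part of PyTorch or warpctc) that receives the synchronization function as an argument; with PyTorch one would pass `torch.cuda.synchronize`:

```python
def run_synchronized(step, synchronize):
    """Run `step` between two device synchronizations.

    `step` and `synchronize` are zero-argument callables. Both names are
    hypothetical; with PyTorch, `synchronize` would be torch.cuda.synchronize.
    """
    # Flush work queued earlier, so an error it caused is raised here
    # and is not mistaken for a failure of `step`.
    synchronize()
    result = step()
    # Wait for the work launched by `step`, so its errors are raised
    # before any later call runs.
    synchronize()
    return result
```

For the loss call this could look like `run_synchronized(lambda: ctc_loss(activations, input_lengths, labels, label_lengths), torch.cuda.synchronize)`. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting Python is a coarser alternative that makes kernel launches synchronous.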

jinserk commented 6 years ago

Hi Thomas, please check the following test code:

import tqdm
import random
import torch
import warpctc

ctc_loss = warpctc.CTCLoss()

if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

def small_test(cuda=False):
    # alphabet_size = 5
    batch = 2
    num_labels = 200
    input_lengths = torch.randint(1000, 1500, (batch,), dtype=torch.int)  # frames per sample
    max_length = max(input_lengths)
    activations = torch.zeros((max_length, batch, num_labels))  # T x N x H, zero-padded
    for i in range(batch):
        length = input_lengths[i]
        activations.narrow(1, i, 1).narrow(0, 0, length).copy_(-torch.rand((length, 1, num_labels)))
    if cuda:
        device = torch.device("cuda")
        activations = activations.to(device, non_blocking=True)  # only the activations go to the GPU
    label_lengths = torch.randint(500, 1000, (batch,), dtype=torch.int)  # labels per sample
    length = sum(label_lengths)
    labels = torch.randint(0, num_labels, (length,), dtype=torch.int)  # flat concatenation for the whole batch

    print(activations, input_lengths, labels, label_lengths)
    loss = ctc_loss(activations, input_lengths, labels, label_lengths)

if __name__ == '__main__':
    for i in tqdm.tqdm(range(10000)):
        small_test(True)

This code produces a similar error. I made the dimensions of each input tensor as close as I could to the ones I deal with. An error occurs, but it is "unknown error" instead of "cuda memcpy or memset failed":

Traceback (most recent call last):
  File "test_ctc.py", line 35, in <module>
    small_test(True)
  File "test_ctc.py", line 30, in small_test
    loss = ctc_loss(activations, input_lengths, labels, label_lengths)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/nn/modules/module.py", line 468, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 102, in forward
    torch.is_grad_enabled())
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 22, in forward
    want_gradient)
RuntimeError: unknown error (ctc at src/_warpctc.cpp:106)
frame #0: <unknown function> + 0x13a8a (0x7fb5f550fa8a in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x11a61 (0x7fb5f550da61 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallDict + 0x22d (0x7fb6a4e868ad in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #3: <unknown function> + 0x164afa (0x7fb6a4f1fafa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #4: _PyEval_EvalFrameDefault + 0x4186 (0x7fb6a4f244a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #5: <unknown function> + 0x1646fe (0x7fb6a4f1f6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #6: PyEval_EvalCodeEx + 0x6d (0x7fb6a4f1fd2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #7: <unknown function> + 0xa4c96 (0x7fb6a4e5fc96 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #8: PyObject_Call + 0x6a (0x7fb6a4e2d8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #9: THPFunction_apply(_object*, _object*) + 0x3e8 (0x7fb62fb90cf8 in /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: _PyCFunction_FastCallDict + 0x18e (0x7fb6a4e8680e in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #11: <unknown function> + 0x164afa (0x7fb6a4f1fafa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #12: _PyEval_EvalFrameDefault + 0x4186 (0x7fb6a4f244a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #13: <unknown function> + 0x163d90 (0x7fb6a4f1ed90 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #14: _PyFunction_FastCallDict + 0x2c6 (0x7fb6a4f284a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #15: _PyObject_FastCallDict + 0x17e (0x7fb6a4e2dafe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #16: _PyObject_Call_Prepend + 0xce (0x7fb6a4e2dbee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #17: PyObject_Call + 0x6a (0x7fb6a4e2d8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #18: _PyEval_EvalFrameDefault + 0x3adf (0x7fb6a4f23dff in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #19: <unknown function> + 0x1646fe (0x7fb6a4f1f6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #20: _PyFunction_FastCallDict + 0x165 (0x7fb6a4f28345 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #21: _PyObject_FastCallDict + 0x17e (0x7fb6a4e2dafe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #22: _PyObject_Call_Prepend + 0xce (0x7fb6a4e2dbee in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #23: PyObject_Call + 0x6a (0x7fb6a4e2d8aa in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #24: <unknown function> + 0xe8701 (0x7fb6a4ea3701 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #25: _PyObject_FastCallDict + 0x8b (0x7fb6a4e2da0b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #26: <unknown function> + 0x164868 (0x7fb6a4f1f868 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #27: _PyEval_EvalFrameDefault + 0x4186 (0x7fb6a4f244a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #28: <unknown function> + 0x1646fe (0x7fb6a4f1f6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #29: <unknown function> + 0x164a12 (0x7fb6a4f1fa12 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #30: _PyEval_EvalFrameDefault + 0x4186 (0x7fb6a4f244a6 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #31: <unknown function> + 0x1646fe (0x7fb6a4f1f6fe in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #32: PyEval_EvalCodeEx + 0x6d (0x7fb6a4f1fd2d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #33: PyEval_EvalCode + 0x3b (0x7fb6a4f1fd7b in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #34: PyRun_FileExFlags + 0xb2 (0x7fb6a4f5b782 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #35: PyRun_SimpleFileExFlags + 0xe7 (0x7fb6a4f5b8e7 in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #36: Py_Main + 0xe9d (0x7fb6a4f77f1d in /home/jbaik/.pyenv/versions/3.6.5/lib/libpython3.6m.so.1.0)
frame #37: main + 0x16c (0x400b3c in python)
frame #38: __libc_start_main + 0xf5 (0x7fb6a40eb445 in /lib64/libc.so.6)
frame #39: python() [0x400bfa]

t-vi commented 6 years ago

Thanks for the script!

The "unknown error" is the ctc return code, so the wrapper handling is OK. The unfortunate part is that a C++ stack trace is shown, which confuses users. The C++ extension misses PyTorch's magical handling of that, so we need to do this in the wrapper. I pushed a fix for that.
The unknown error appears to show up when the input sizes get larger than ~600. I'm not sure whether that is a known limitation of warp_ctc (it could be, if only for numerical precision).
When I shorten the inputs to 500-600, I don't get the memcpy error either.

So I'm afraid the repro script does produce an error due to input size, but I'm not sure we know much beyond that... :(
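To pin down where the failure starts, the input length could be bisected. The sketch below assumes a hypothetical callable `fails(length)` that runs the repro with the given input length and returns True on error. It also assumes failures are monotonic in the length (every length above the threshold fails), which has not been established in this thread.

```python
def first_failing_length(fails, lo, hi):
    """Return the smallest length in (lo, hi] for which `fails` is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failures are
    monotonic in the length. `fails` is a hypothetical callable.
    """
    if fails(lo) or not fails(hi):
        raise ValueError("need fails(lo) == False and fails(hi) == True")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # the threshold is at or below mid
        else:
            lo = mid  # the threshold is above mid
    return hi
```

Since the error appears intermittently, `fails` would probably need to repeat the call several times per length before reporting success.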

jinserk commented 6 years ago

Hmm. I'm trying to use your warp-ctc binding in my speech recognition project. The acoustic model I'm using produces CTC inputs of varying frame length (T in the TxNxH activations, where N is the batch size and H is the number of labels). T is typically 100~2000 and H is ~200, so T is often larger than 600. As far as I know, Baidu's original warp-ctc was developed as part of an ASR system, so I think they also knew the typical dimensions. It seems odd to me that the limit would be only 500-600.

t-vi commented 6 years ago

Yeah, it's broken and needs fixing (CPU works, too).

jinserk commented 6 years ago

I found that the origin of the error is not the activation tensor but the label tensor. If the label tensor is too long, the error seems to happen. It's really odd.
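Since the labels are passed as one flat tensor for the whole batch, a mismatch between it and the length tensors is worth ruling out. The following is a sketch of such a sanity check on plain Python lists; `check_ctc_inputs` is a hypothetical helper, not part of warpctc. It relies on the general CTC requirement that each sample has at least as many input frames as labels, plus one extra frame per adjacent repeated label.

```python
def check_ctc_inputs(max_frames, input_lengths, labels, label_lengths, num_labels):
    """Return a list of problems found in a batch of CTC inputs (empty if none).

    `labels` is the flat concatenation of all samples' label sequences.
    """
    problems = []
    if len(input_lengths) != len(label_lengths):
        problems.append("input_lengths and label_lengths differ in batch size")
    if sum(label_lengths) != len(labels):
        problems.append("sum(label_lengths) != len(labels)")
    if any(not 0 <= label < num_labels for label in labels):
        problems.append("label outside [0, num_labels)")
    offset = 0
    for i, (frames, n) in enumerate(zip(input_lengths, label_lengths)):
        if frames > max_frames:
            problems.append(f"sample {i}: input length exceeds T")
        seq = labels[offset:offset + n]
        offset += n
        # Adjacent equal labels need a blank frame between them.
        repeats = sum(a == b for a, b in zip(seq, seq[1:]))
        if frames < len(seq) + repeats:
            problems.append(f"sample {i}: too few frames for its labels")
    return problems
```

With tensors, the arguments could be obtained as `activations.size(0)`, `input_lengths.tolist()`, `labels.tolist()`, `label_lengths.tolist()` and `activations.size(2)`. Which index is reserved for the blank symbol depends on the binding and is not checked here.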

jinserk commented 6 years ago

I'm printing out the input tensors to ctc_loss, but it looks like the error doesn't originate from the size:

torch.Size([1106, 2, 187]) tensor([ 279, 1106], dtype=torch.int32) torch.Size([176]) tensor([ 30, 146], dtype=torch.int32)
torch.Size([867, 2, 187]) tensor([867, 267], dtype=torch.int32) torch.Size([127]) tensor([112,  15], dtype=torch.int32)
torch.Size([613, 2, 187]) tensor([300, 613], dtype=torch.int32) torch.Size([129]) tensor([32, 97], dtype=torch.int32)
torch.Size([445, 2, 187]) tensor([445, 194], dtype=torch.int32) torch.Size([86]) tensor([81,  5], dtype=torch.int32)
torch.Size([348, 2, 187]) tensor([167, 348], dtype=torch.int32) torch.Size([32]) tensor([ 5, 27], dtype=torch.int32)
torch.Size([795, 2, 187]) tensor([511, 795], dtype=torch.int32) torch.Size([137]) tensor([53, 84], dtype=torch.int32)
torch.Size([968, 2, 187]) tensor([968, 390], dtype=torch.int32) torch.Size([197]) tensor([153,  44], dtype=torch.int32)
torch.Size([1065, 2, 187]) tensor([ 229, 1065], dtype=torch.int32) torch.Size([160]) tensor([  5, 155], dtype=torch.int32)
torch.Size([432, 2, 187]) tensor([432, 183], dtype=torch.int32) torch.Size([63]) tensor([44, 19], dtype=torch.int32)
torch.Size([560, 2, 187]) tensor([357, 560], dtype=torch.int32) torch.Size([103]) tensor([47, 56], dtype=torch.int32)
torch.Size([969, 2, 187]) tensor([722, 969], dtype=torch.int32) torch.Size([203]) tensor([ 74, 129], dtype=torch.int32)
torch.Size([159, 2, 187]) tensor([159, 147], dtype=torch.int32) torch.Size([11]) tensor([3, 8], dtype=torch.int32)
torch.Size([682, 2, 187]) tensor([174, 682], dtype=torch.int32) torch.Size([80]) tensor([ 5, 75], dtype=torch.int32)
torch.Size([799, 2, 187]) tensor([799, 224], dtype=torch.int32) torch.Size([96]) tensor([82, 14], dtype=torch.int32)
torch.Size([916, 2, 187]) tensor([197, 916], dtype=torch.int32) torch.Size([99]) tensor([31, 68], dtype=torch.int32)
torch.Size([1028, 2, 187]) tensor([1028,  164], dtype=torch.int32) torch.Size([110]) tensor([108,   2], dtype=torch.int32)
torch.Size([1074, 2, 187]) tensor([ 361, 1074], dtype=torch.int32) torch.Size([225]) tensor([ 55, 170], dtype=torch.int32)
torch.Size([1447, 2, 187]) tensor([1447,  882], dtype=torch.int32) torch.Size([339]) tensor([212, 127], dtype=torch.int32)
torch.Size([806, 2, 187]) tensor([667, 806], dtype=torch.int32) torch.Size([169]) tensor([81, 88], dtype=torch.int32)
torch.Size([280, 2, 187]) tensor([280, 171], dtype=torch.int32) torch.Size([19]) tensor([17,  2], dtype=torch.int32)
torch.Size([1324, 2, 187]) tensor([1324,  683], dtype=torch.int32) torch.Size([282]) tensor([182, 100], dtype=torch.int32)
torch.Size([375, 2, 187]) tensor([375, 123], dtype=torch.int32) torch.Size([56]) tensor([48,  8], dtype=torch.int32)
torch.Size([487, 2, 187]) tensor([349, 487], dtype=torch.int32) torch.Size([91]) tensor([39, 52], dtype=torch.int32)
torch.Size([214, 2, 187]) tensor([124, 214], dtype=torch.int32) torch.Size([31]) tensor([ 2, 29], dtype=torch.int32)
torch.Size([1512, 2, 187]) tensor([1512,  246], dtype=torch.int32) torch.Size([259]) tensor([227,  32], dtype=torch.int32)
torch.Size([362, 2, 187]) tensor([362, 121], dtype=torch.int32) torch.Size([52]) tensor([49,  3], dtype=torch.int32)
torch.Size([181, 2, 187]) tensor([181, 156], dtype=torch.int32) torch.Size([23]) tensor([17,  6], dtype=torch.int32)
torch.Size([264, 2, 187]) tensor([264, 151], dtype=torch.int32) torch.Size([31]) tensor([26,  5], dtype=torch.int32)
torch.Size([1184, 2, 187]) tensor([ 189, 1184], dtype=torch.int32) torch.Size([173]) tensor([ 16, 157], dtype=torch.int32)
torch.Size([154, 2, 187]) tensor([143, 154], dtype=torch.int32) torch.Size([17]) tensor([ 2, 15], dtype=torch.int32)
torch.Size([334, 2, 187]) tensor([285, 334], dtype=torch.int32) torch.Size([75]) tensor([19, 56], dtype=torch.int32)
torch.Size([498, 2, 187]) tensor([132, 498], dtype=torch.int32) torch.Size([67]) tensor([ 8, 59], dtype=torch.int32)
torch.Size([203, 2, 187]) tensor([125, 203], dtype=torch.int32) torch.Size([10]) tensor([5, 5], dtype=torch.int32)
torch.Size([531, 2, 187]) tensor([531, 123], dtype=torch.int32) torch.Size([63]) tensor([58,  5], dtype=torch.int32)
torch.Size([685, 2, 187]) tensor([201, 685], dtype=torch.int32) torch.Size([75]) tensor([11, 64], dtype=torch.int32)
torch.Size([364, 2, 187]) tensor([225, 364], dtype=torch.int32) torch.Size([69]) tensor([21, 48], dtype=torch.int32)
torch.Size([376, 2, 187]) tensor([376, 253], dtype=torch.int32) torch.Size([62]) tensor([36, 26], dtype=torch.int32)
torch.Size([946, 2, 187]) tensor([339, 946], dtype=torch.int32) torch.Size([161]) tensor([ 29, 132], dtype=torch.int32)
torch.Size([381, 2, 187]) tensor([211, 381], dtype=torch.int32) torch.Size([89]) tensor([21, 68], dtype=torch.int32)
torch.Size([221, 2, 187]) tensor([221, 158], dtype=torch.int32) torch.Size([26]) tensor([21,  5], dtype=torch.int32)
torch.Size([554, 2, 187]) tensor([299, 554], dtype=torch.int32) torch.Size([136]) tensor([ 31, 105], dtype=torch.int32)
torch.Size([998, 2, 187]) tensor([447, 998], dtype=torch.int32) torch.Size([187]) tensor([ 48, 139], dtype=torch.int32)
torch.Size([466, 2, 187]) tensor([466, 376], dtype=torch.int32) torch.Size([84]) tensor([36, 48], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▌                                                                                                                                     | 1394/5000 [03:03<07:55,  7.59it/s]torch.Size([436, 2, 187]) tensor([436, 170], dtype=torch.int32) torch.Size([66]) tensor([64,  2], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▌                                                                                                                                     | 1395/5000 [03:03<07:54,  7.59it/s]torch.Size([752, 2, 187]) tensor([174, 752], dtype=torch.int32) torch.Size([121]) tensor([ 17, 104], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▋                                                                                                                                     | 1396/5000 [03:03<07:54,  7.59it/s]torch.Size([191, 2, 187]) tensor([191, 178], dtype=torch.int32) torch.Size([19]) tensor([ 9, 10], dtype=torch.int32)
torch.Size([825, 2, 187]) tensor([137, 825], dtype=torch.int32) torch.Size([122]) tensor([  8, 114], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▋                                                                                                                                     | 1398/5000 [03:04<07:54,  7.59it/s]torch.Size([185, 2, 187]) tensor([185, 142], dtype=torch.int32) torch.Size([22]) tensor([13,  9], dtype=torch.int32)
torch.Size([499, 2, 187]) tensor([387, 499], dtype=torch.int32) torch.Size([115]) tensor([63, 52], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▊                                                                                                                                     | 1400/5000 [03:04<07:54,  7.59it/s]torch.Size([794, 2, 187]) tensor([794, 482], dtype=torch.int32) torch.Size([151]) tensor([104,  47], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▊                                                                                                                                     | 1401/5000 [03:04<07:53,  7.59it/s]torch.Size([648, 2, 187]) tensor([648, 407], dtype=torch.int32) torch.Size([129]) tensor([86, 43], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▊                                                                                                                                     | 1402/5000 [03:04<07:53,  7.59it/s]torch.Size([1267, 2, 187]) tensor([1267,  173], dtype=torch.int32) torch.Size([250]) tensor([239,  11], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▉                                                                                                                                     | 1403/5000 [03:04<07:54,  7.59it/s]torch.Size([297, 2, 187]) tensor([153, 297], dtype=torch.int32) torch.Size([37]) tensor([ 5, 32], dtype=torch.int32)
torch.Size([562, 2, 187]) tensor([562, 353], dtype=torch.int32) torch.Size([99]) tensor([64, 35], dtype=torch.int32)
training  :  28%|███████████████████████████████████████████████████▉                                                                                                                                     | 1405/5000 [03:05<07:53,  7.59it/s]torch.Size([737, 2, 187]) tensor([358, 737], dtype=torch.int32) torch.Size([94]) tensor([28, 66], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████                                                                                                                                     | 1406/5000 [03:05<07:53,  7.59it/s]torch.Size([600, 2, 187]) tensor([482, 600], dtype=torch.int32) torch.Size([159]) tensor([63, 96], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████                                                                                                                                     | 1407/5000 [03:05<07:53,  7.59it/s]torch.Size([696, 2, 187]) tensor([696, 513], dtype=torch.int32) torch.Size([169]) tensor([115,  54], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████                                                                                                                                     | 1408/5000 [03:05<07:53,  7.59it/s]torch.Size([448, 2, 187]) tensor([166, 448], dtype=torch.int32) torch.Size([75]) tensor([16, 59], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▏                                                                                                                                    | 1409/5000 [03:05<07:52,  7.59it/s]torch.Size([199, 2, 187]) tensor([158, 199], dtype=torch.int32) torch.Size([14]) tensor([9, 5], dtype=torch.int32)
torch.Size([737, 2, 187]) tensor([737, 441], dtype=torch.int32) torch.Size([108]) tensor([62, 46], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▏                                                                                                                                    | 1411/5000 [03:05<07:52,  7.59it/s]torch.Size([444, 2, 187]) tensor([289, 444], dtype=torch.int32) torch.Size([102]) tensor([19, 83], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▏                                                                                                                                    | 1412/5000 [03:05<07:52,  7.60it/s]torch.Size([412, 2, 187]) tensor([171, 412], dtype=torch.int32) torch.Size([52]) tensor([ 5, 47], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▎                                                                                                                                    | 1413/5000 [03:06<07:52,  7.60it/s]torch.Size([760, 2, 187]) tensor([760, 563], dtype=torch.int32) torch.Size([140]) tensor([76, 64], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▎                                                                                                                                    | 1414/5000 [03:06<07:52,  7.60it/s]torch.Size([499, 2, 187]) tensor([499, 211], dtype=torch.int32) torch.Size([91]) tensor([68, 23], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▎                                                                                                                                    | 1415/5000 [03:06<07:51,  7.60it/s]torch.Size([216, 2, 187]) tensor([174, 216], dtype=torch.int32) torch.Size([51]) tensor([28, 23], dtype=torch.int32)
torch.Size([1083, 2, 187]) tensor([1083,  210], dtype=torch.int32) torch.Size([190]) tensor([174,  16], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▍                                                                                                                                    | 1417/5000 [03:06<07:51,  7.60it/s]torch.Size([760, 2, 187]) tensor([760, 433], dtype=torch.int32) torch.Size([166]) tensor([88, 78], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▍                                                                                                                                    | 1418/5000 [03:06<07:51,  7.59it/s]torch.Size([316, 2, 187]) tensor([209, 316], dtype=torch.int32) torch.Size([43]) tensor([28, 15], dtype=torch.int32)
torch.Size([845, 2, 187]) tensor([350, 845], dtype=torch.int32) torch.Size([150]) tensor([ 45, 105], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▌                                                                                                                                    | 1420/5000 [03:06<07:51,  7.60it/s]torch.Size([309, 2, 187]) tensor([161, 309], dtype=torch.int32) torch.Size([39]) tensor([ 9, 30], dtype=torch.int32)
torch.Size([859, 2, 187]) tensor([859, 185], dtype=torch.int32) torch.Size([119]) tensor([113,   6], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▌                                                                                                                                    | 1422/5000 [03:07<07:51,  7.60it/s]torch.Size([298, 2, 187]) tensor([281, 298], dtype=torch.int32) torch.Size([82]) tensor([41, 41], dtype=torch.int32)
torch.Size([589, 2, 187]) tensor([325, 589], dtype=torch.int32) torch.Size([120]) tensor([45, 75], dtype=torch.int32)
training  :  28%|████████████████████████████████████████████████████▋                                                                                                                                    | 1424/5000 [03:07<07:50,  7.60it/s]torch.Size([194, 2, 187]) tensor([138, 194], dtype=torch.int32) torch.Size([28]) tensor([ 9, 19], dtype=torch.int32)
torch.Size([560, 2, 187]) tensor([177, 560], dtype=torch.int32) torch.Size([62]) tensor([14, 48], dtype=torch.int32)
training  :  29%|████████████████████████████████████████████████████▊                                                                                                                                    | 1426/5000 [03:07<07:50,  7.60it/s]torch.Size([722, 2, 187]) tensor([685, 722], dtype=torch.int32) torch.Size([212]) tensor([ 98, 114], dtype=torch.int32)
training  :  29%|████████████████████████████████████████████████████▊                                                                                                                                    | 1427/5000 [03:07<07:50,  7.60it/s]torch.Size([927, 2, 187]) tensor([244, 927], dtype=torch.int32) torch.Size([131]) tensor([ 28, 103], dtype=torch.int32)
training  :  29%|████████████████████████████████████████████████████▊                                                                                                                                    | 1428/5000 [03:07<07:50,  7.60it/s]torch.Size([195, 2, 187]) tensor([195, 147], dtype=torch.int32) torch.Size([13]) tensor([4, 9], dtype=torch.int32)
torch.Size([1043, 2, 187]) tensor([1043,  296], dtype=torch.int32) torch.Size([179]) tensor([163,  16], dtype=torch.int32)
training  :  29%|████████████████████████████████████████████████████▉                                                                                                                                    | 1430/5000 [03:08<07:49,  7.60it/s]torch.Size([369, 2, 187]) tensor([172, 369], dtype=torch.int32) torch.Size([41]) tensor([10, 31], dtype=torch.int32)
torch.Size([927, 2, 187]) tensor([130, 927], dtype=torch.int32) torch.Size([111]) tensor([  3, 108], dtype=torch.int32)
training  :  29%|████████████████████████████████████████████████████▉                                                                                                                                    | 1432/5000 [03:08<07:49,  7.60it/s]torch.Size([638, 2, 187]) tensor([638, 608], dtype=torch.int32) torch.Size([174]) tensor([93, 81], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████                                                                                                                                    | 1433/5000 [03:08<07:49,  7.60it/s]torch.Size([224, 2, 187]) tensor([163, 224], dtype=torch.int32) torch.Size([43]) tensor([21, 22], dtype=torch.int32)
torch.Size([1228, 2, 187]) tensor([1228,  338], dtype=torch.int32) torch.Size([178]) tensor([152,  26], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████                                                                                                                                    | 1435/5000 [03:08<07:49,  7.60it/s]torch.Size([1317, 2, 187]) tensor([1317,  404], dtype=torch.int32) torch.Size([176]) tensor([167,   9], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▏                                                                                                                                   | 1436/5000 [03:09<07:49,  7.59it/s]torch.Size([735, 2, 187]) tensor([735, 136], dtype=torch.int32) torch.Size([84]) tensor([80,  4], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▏                                                                                                                                   | 1437/5000 [03:09<07:49,  7.59it/s]torch.Size([640, 2, 187]) tensor([143, 640], dtype=torch.int32) torch.Size([61]) tensor([ 4, 57], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▏                                                                                                                                   | 1438/5000 [03:09<07:49,  7.59it/s]torch.Size([254, 2, 187]) tensor([254, 147], dtype=torch.int32) torch.Size([26]) tensor([21,  5], dtype=torch.int32)
torch.Size([226, 2, 187]) tensor([224, 226], dtype=torch.int32) torch.Size([34]) tensor([ 6, 28], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▎                                                                                                                                   | 1440/5000 [03:09<07:48,  7.60it/s]torch.Size([943, 2, 187]) tensor([683, 943], dtype=torch.int32) torch.Size([193]) tensor([99, 94], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▎                                                                                                                                   | 1441/5000 [03:09<07:48,  7.59it/s]torch.Size([201, 2, 187]) tensor([153, 201], dtype=torch.int32) torch.Size([29]) tensor([ 3, 26], dtype=torch.int32)
torch.Size([682, 2, 187]) tensor([446, 682], dtype=torch.int32) torch.Size([160]) tensor([ 51, 109], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▍                                                                                                                                   | 1443/5000 [03:09<07:48,  7.60it/s]torch.Size([493, 2, 187]) tensor([281, 493], dtype=torch.int32) torch.Size([99]) tensor([33, 66], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▍                                                                                                                                   | 1444/5000 [03:10<07:48,  7.60it/s]torch.Size([478, 2, 187]) tensor([478, 184], dtype=torch.int32) torch.Size([54]) tensor([35, 19], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▍                                                                                                                                   | 1445/5000 [03:10<07:47,  7.60it/s]torch.Size([293, 2, 187]) tensor([293, 166], dtype=torch.int32) torch.Size([57]) tensor([41, 16], dtype=torch.int32)
torch.Size([448, 2, 187]) tensor([297, 448], dtype=torch.int32) torch.Size([74]) tensor([26, 48], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▌                                                                                                                                   | 1447/5000 [03:10<07:47,  7.60it/s]torch.Size([697, 2, 187]) tensor([478, 697], dtype=torch.int32) torch.Size([182]) tensor([ 73, 109], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▌                                                                                                                                   | 1448/5000 [03:10<07:47,  7.60it/s]torch.Size([545, 2, 187]) tensor([545, 437], dtype=torch.int32) torch.Size([123]) tensor([64, 59], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▌                                                                                                                                   | 1449/5000 [03:10<07:47,  7.60it/s]torch.Size([969, 2, 187]) tensor([229, 969], dtype=torch.int32) torch.Size([150]) tensor([  8, 142], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▋                                                                                                                                   | 1450/5000 [03:10<07:47,  7.60it/s]torch.Size([1283, 2, 187]) tensor([ 406, 1283], dtype=torch.int32) torch.Size([219]) tensor([ 48, 171], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▋                                                                                                                                   | 1451/5000 [03:11<07:47,  7.59it/s]torch.Size([838, 2, 187]) tensor([838, 252], dtype=torch.int32) torch.Size([188]) tensor([159,  29], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▋                                                                                                                                   | 1452/5000 [03:11<07:47,  7.59it/s]torch.Size([365, 2, 187]) tensor([365, 306], dtype=torch.int32) torch.Size([106]) tensor([67, 39], dtype=torch.int32)
torch.Size([854, 2, 187]) tensor([854, 248], dtype=torch.int32) torch.Size([171]) tensor([129,  42], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▊                                                                                                                                   | 1454/5000 [03:11<07:47,  7.59it/s]torch.Size([951, 2, 187]) tensor([712, 951], dtype=torch.int32) torch.Size([193]) tensor([103,  90], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▊                                                                                                                                   | 1455/5000 [03:11<07:47,  7.59it/s]torch.Size([473, 2, 187]) tensor([473, 175], dtype=torch.int32) torch.Size([28]) tensor([22,  6], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▊                                                                                                                                   | 1456/5000 [03:11<07:46,  7.59it/s]torch.Size([1133, 2, 187]) tensor([ 134, 1133], dtype=torch.int32) torch.Size([149]) tensor([  4, 145], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▉                                                                                                                                   | 1457/5000 [03:11<07:46,  7.59it/s]torch.Size([884, 2, 187]) tensor([884, 561], dtype=torch.int32) torch.Size([190]) tensor([105,  85], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▉                                                                                                                                   | 1458/5000 [03:12<07:46,  7.59it/s]torch.Size([644, 2, 187]) tensor([644, 513], dtype=torch.int32) torch.Size([159]) tensor([95, 64], dtype=torch.int32)
training  :  29%|█████████████████████████████████████████████████████▉                                                                                                                                   | 1459/5000 [03:12<07:46,  7.59it/s]torch.Size([284, 2, 187]) tensor([284, 201], dtype=torch.int32) torch.Size([46]) tensor([28, 18], dtype=torch.int32)
torch.Size([322, 2, 187]) tensor([322, 189], dtype=torch.int32) torch.Size([32]) tensor([30,  2], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████                                                                                                                                   | 1461/5000 [03:12<07:46,  7.59it/s]torch.Size([361, 2, 187]) tensor([212, 361], dtype=torch.int32) torch.Size([77]) tensor([27, 50], dtype=torch.int32)
torch.Size([190, 2, 187]) tensor([173, 190], dtype=torch.int32) torch.Size([29]) tensor([19, 10], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▏                                                                                                                                  | 1463/5000 [03:12<07:45,  7.59it/s]torch.Size([862, 2, 187]) tensor([206, 862], dtype=torch.int32) torch.Size([158]) tensor([ 22, 136], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▏                                                                                                                                  | 1464/5000 [03:12<07:45,  7.59it/s]torch.Size([821, 2, 187]) tensor([138, 821], dtype=torch.int32) torch.Size([100]) tensor([ 5, 95], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▏                                                                                                                                  | 1465/5000 [03:12<07:45,  7.59it/s]torch.Size([626, 2, 187]) tensor([503, 626], dtype=torch.int32) torch.Size([140]) tensor([62, 78], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▏                                                                                                                                  | 1466/5000 [03:13<07:45,  7.59it/s]torch.Size([802, 2, 187]) tensor([124, 802], dtype=torch.int32) torch.Size([72]) tensor([10, 62], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▎                                                                                                                                  | 1467/5000 [03:13<07:45,  7.59it/s]torch.Size([421, 2, 187]) tensor([421, 195], dtype=torch.int32) torch.Size([67]) tensor([43, 24], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▎                                                                                                                                  | 1468/5000 [03:13<07:45,  7.59it/s]torch.Size([319, 2, 187]) tensor([234, 319], dtype=torch.int32) torch.Size([65]) tensor([19, 46], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▎                                                                                                                                  | 1469/5000 [03:13<07:45,  7.59it/s]torch.Size([432, 2, 187]) tensor([395, 432], dtype=torch.int32) torch.Size([71]) tensor([34, 37], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▍                                                                                                                                  | 1470/5000 [03:13<07:44,  7.59it/s]torch.Size([211, 2, 187]) tensor([154, 211], dtype=torch.int32) torch.Size([44]) tensor([17, 27], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▍                                                                                                                                  | 1471/5000 [03:13<07:44,  7.60it/s]torch.Size([208, 2, 187]) tensor([208, 150], dtype=torch.int32) torch.Size([24]) tensor([12, 12], dtype=torch.int32)
torch.Size([817, 2, 187]) tensor([817, 679], dtype=torch.int32) torch.Size([157]) tensor([80, 77], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▌                                                                                                                                  | 1473/5000 [03:13<07:44,  7.60it/s]torch.Size([486, 2, 187]) tensor([381, 486], dtype=torch.int32) torch.Size([111]) tensor([40, 71], dtype=torch.int32)
training  :  29%|██████████████████████████████████████████████████████▌                                                                                                                                  | 1474/5000 [03:14<07:44,  7.60it/s]torch.Size([389, 2, 187]) tensor([133, 389], dtype=torch.int32) torch.Size([50]) tensor([ 6, 44], dtype=torch.int32)
torch.Size([1439, 2, 187]) tensor([ 238, 1439], dtype=torch.int32) torch.Size([220]) tensor([ 21, 199], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▌                                                                                                                                  | 1476/5000 [03:14<07:44,  7.59it/s]torch.Size([395, 2, 187]) tensor([312, 395], dtype=torch.int32) torch.Size([74]) tensor([29, 45], dtype=torch.int32)
torch.Size([525, 2, 187]) tensor([291, 525], dtype=torch.int32) torch.Size([80]) tensor([37, 43], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▋                                                                                                                                  | 1478/5000 [03:14<07:43,  7.59it/s]torch.Size([811, 2, 187]) tensor([811, 501], dtype=torch.int32) torch.Size([186]) tensor([119,  67], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▋                                                                                                                                  | 1479/5000 [03:14<07:43,  7.59it/s]torch.Size([245, 2, 187]) tensor([245, 211], dtype=torch.int32) torch.Size([41]) tensor([26, 15], dtype=torch.int32)
torch.Size([137, 2, 187]) tensor([137, 137], dtype=torch.int32) torch.Size([18]) tensor([15,  3], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▊                                                                                                                                  | 1481/5000 [03:14<07:43,  7.60it/s]torch.Size([1518, 2, 187]) tensor([1518,  140], dtype=torch.int32) torch.Size([224]) tensor([221,   3], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▊                                                                                                                                  | 1482/5000 [03:15<07:43,  7.59it/s]torch.Size([909, 2, 187]) tensor([909, 495], dtype=torch.int32) torch.Size([129]) tensor([87, 42], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▊                                                                                                                                  | 1483/5000 [03:15<07:43,  7.59it/s]torch.Size([411, 2, 187]) tensor([199, 411], dtype=torch.int32) torch.Size([65]) tensor([15, 50], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▉                                                                                                                                  | 1484/5000 [03:15<07:43,  7.59it/s]torch.Size([406, 2, 187]) tensor([406, 154], dtype=torch.int32) torch.Size([53]) tensor([37, 16], dtype=torch.int32)
training  :  30%|██████████████████████████████████████████████████████▉                                                                                                                                  | 1485/5000 [03:15<07:43,  7.59it/s]torch.Size([274, 2, 187]) tensor([274, 169], dtype=torch.int32) torch.Size([36]) tensor([20, 16], dtype=torch.int32)
torch.Size([504, 2, 187]) tensor([192, 504], dtype=torch.int32) torch.Size([79]) tensor([ 2, 77], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████                                                                                                                                  | 1487/5000 [03:15<07:42,  7.59it/s]torch.Size([814, 2, 187]) tensor([130, 814], dtype=torch.int32) torch.Size([126]) tensor([ 16, 110], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████                                                                                                                                  | 1488/5000 [03:15<07:42,  7.59it/s]torch.Size([717, 2, 187]) tensor([356, 717], dtype=torch.int32) torch.Size([146]) tensor([50, 96], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████                                                                                                                                  | 1489/5000 [03:16<07:42,  7.59it/s]torch.Size([229, 2, 187]) tensor([229, 225], dtype=torch.int32) torch.Size([56]) tensor([25, 31], dtype=torch.int32)
torch.Size([317, 2, 187]) tensor([317, 238], dtype=torch.int32) torch.Size([44]) tensor([15, 29], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▏                                                                                                                                 | 1491/5000 [03:16<07:41,  7.60it/s]torch.Size([769, 2, 187]) tensor([769, 241], dtype=torch.int32) torch.Size([105]) tensor([67, 38], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▏                                                                                                                                 | 1492/5000 [03:16<07:41,  7.60it/s]torch.Size([1017, 2, 187]) tensor([1017,  545], dtype=torch.int32) torch.Size([209]) tensor([147,  62], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▏                                                                                                                                 | 1493/5000 [03:16<07:41,  7.59it/s]torch.Size([238, 2, 187]) tensor([166, 238], dtype=torch.int32) torch.Size([29]) tensor([ 8, 21], dtype=torch.int32)
torch.Size([535, 2, 187]) tensor([410, 535], dtype=torch.int32) torch.Size([86]) tensor([42, 44], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▎                                                                                                                                 | 1495/5000 [03:16<07:41,  7.60it/s]torch.Size([775, 2, 187]) tensor([775, 319], dtype=torch.int32) torch.Size([151]) tensor([105,  46], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▎                                                                                                                                 | 1496/5000 [03:16<07:41,  7.60it/s]torch.Size([661, 2, 187]) tensor([661, 125], dtype=torch.int32) torch.Size([51]) tensor([48,  3], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▍                                                                                                                                 | 1497/5000 [03:17<07:41,  7.60it/s]torch.Size([175, 2, 187]) tensor([175, 133], dtype=torch.int32) torch.Size([28]) tensor([16, 12], dtype=torch.int32)
torch.Size([561, 2, 187]) tensor([169, 561], dtype=torch.int32) torch.Size([76]) tensor([ 3, 73], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▍                                                                                                                                 | 1499/5000 [03:17<07:40,  7.60it/s]torch.Size([1178, 2, 187]) tensor([1178,  251], dtype=torch.int32) torch.Size([218]) tensor([184,  34], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▌                                                                                                                                 | 1500/5000 [03:17<07:40,  7.59it/s]torch.Size([331, 2, 187]) tensor([331, 326], dtype=torch.int32) torch.Size([71]) tensor([46, 25], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▌                                                                                                                                 | 1501/5000 [03:17<07:40,  7.60it/s]torch.Size([363, 2, 187]) tensor([152, 363], dtype=torch.int32) torch.Size([41]) tensor([ 2, 39], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▌                                                                                                                                 | 1502/5000 [03:17<07:40,  7.60it/s]torch.Size([192, 2, 187]) tensor([153, 192], dtype=torch.int32) torch.Size([7]) tensor([5, 2], dtype=torch.int32)
torch.Size([1082, 2, 187]) tensor([1082,  177], dtype=torch.int32) torch.Size([129]) tensor([114,  15], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▋                                                                                                                                 | 1504/5000 [03:18<07:40,  7.60it/s]torch.Size([1212, 2, 187]) tensor([ 157, 1212], dtype=torch.int32) torch.Size([208]) tensor([ 14, 194], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▋                                                                                                                                 | 1505/5000 [03:18<07:40,  7.59it/s]torch.Size([255, 2, 187]) tensor([255, 231], dtype=torch.int32) torch.Size([32]) tensor([26,  6], dtype=torch.int32)
torch.Size([220, 2, 187]) tensor([140, 220], dtype=torch.int32) torch.Size([31]) tensor([ 8, 23], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▊                                                                                                                                 | 1507/5000 [03:18<07:39,  7.60it/s]torch.Size([1034, 2, 187]) tensor([ 785, 1034], dtype=torch.int32) torch.Size([228]) tensor([117, 111], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▊                                                                                                                                 | 1508/5000 [03:18<07:39,  7.59it/s]torch.Size([865, 2, 187]) tensor([614, 865], dtype=torch.int32) torch.Size([185]) tensor([ 64, 121], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▊                                                                                                                                 | 1509/5000 [03:18<07:39,  7.59it/s]torch.Size([745, 2, 187]) tensor([146, 745], dtype=torch.int32) torch.Size([134]) tensor([  5, 129], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▊                                                                                                                                 | 1510/5000 [03:18<07:39,  7.59it/s]torch.Size([802, 2, 187]) tensor([802, 127], dtype=torch.int32) torch.Size([97]) tensor([94,  3], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▉                                                                                                                                 | 1511/5000 [03:19<07:39,  7.59it/s]torch.Size([193, 2, 187]) tensor([193, 187], dtype=torch.int32) torch.Size([31]) tensor([ 2, 29], dtype=torch.int32)
torch.Size([281, 2, 187]) tensor([211, 281], dtype=torch.int32) torch.Size([52]) tensor([17, 35], dtype=torch.int32)
training  :  30%|███████████████████████████████████████████████████████▉                                                                                                                                 | 1513/5000 [03:19<07:39,  7.59it/s]torch.Size([656, 2, 187]) tensor([611, 656], dtype=torch.int32) torch.Size([121]) tensor([69, 52], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████                                                                                                                                 | 1514/5000 [03:19<07:39,  7.59it/s]torch.Size([218, 2, 187]) tensor([218, 176], dtype=torch.int32) torch.Size([24]) tensor([16,  8], dtype=torch.int32)
torch.Size([894, 2, 187]) tensor([894, 288], dtype=torch.int32) torch.Size([169]) tensor([131,  38], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████                                                                                                                                 | 1516/5000 [03:19<07:38,  7.59it/s]torch.Size([838, 2, 187]) tensor([838, 167], dtype=torch.int32) torch.Size([110]) tensor([105,   5], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▏                                                                                                                                | 1517/5000 [03:19<07:38,  7.59it/s]torch.Size([455, 2, 187]) tensor([143, 455], dtype=torch.int32) torch.Size([45]) tensor([ 6, 39], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▏                                                                                                                                | 1518/5000 [03:19<07:38,  7.59it/s]torch.Size([653, 2, 187]) tensor([653, 368], dtype=torch.int32) torch.Size([103]) tensor([64, 39], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▏                                                                                                                                | 1519/5000 [03:20<07:38,  7.59it/s]torch.Size([284, 2, 187]) tensor([284, 181], dtype=torch.int32) torch.Size([41]) tensor([22, 19], dtype=torch.int32)
torch.Size([140, 2, 187]) tensor([140, 122], dtype=torch.int32) torch.Size([7]) tensor([2, 5], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▎                                                                                                                                | 1521/5000 [03:20<07:37,  7.60it/s]torch.Size([1004, 2, 187]) tensor([ 172, 1004], dtype=torch.int32) torch.Size([181]) tensor([ 20, 161], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▎                                                                                                                                | 1522/5000 [03:20<07:37,  7.60it/s]torch.Size([364, 2, 187]) tensor([364, 139], dtype=torch.int32) torch.Size([41]) tensor([33,  8], dtype=torch.int32)
torch.Size([831, 2, 187]) tensor([218, 831], dtype=torch.int32) torch.Size([145]) tensor([ 13, 132], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▍                                                                                                                                | 1524/5000 [03:20<07:37,  7.60it/s]torch.Size([467, 2, 187]) tensor([467, 224], dtype=torch.int32) torch.Size([78]) tensor([56, 22], dtype=torch.int32)
training  :  30%|████████████████████████████████████████████████████████▍                                                                                                                                | 1525/5000 [03:20<07:37,  7.60it/s]torch.Size([417, 2, 187]) tensor([176, 417], dtype=torch.int32) torch.Size([77]) tensor([12, 65], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▍                                                                                                                                | 1526/5000 [03:20<07:37,  7.60it/s]torch.Size([340, 2, 187]) tensor([340, 199], dtype=torch.int32) torch.Size([59]) tensor([28, 31], dtype=torch.int32)
torch.Size([1068, 2, 187]) tensor([ 315, 1068], dtype=torch.int32) torch.Size([167]) tensor([ 53, 114], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▌                                                                                                                                | 1528/5000 [03:21<07:37,  7.60it/s]torch.Size([497, 2, 187]) tensor([295, 497], dtype=torch.int32) torch.Size([101]) tensor([45, 56], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▌                                                                                                                                | 1529/5000 [03:21<07:36,  7.60it/s]torch.Size([1046, 2, 187]) tensor([1046,  192], dtype=torch.int32) torch.Size([124]) tensor([119,   5], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▌                                                                                                                                | 1530/5000 [03:21<07:36,  7.60it/s]torch.Size([1393, 2, 187]) tensor([1393, 1231], dtype=torch.int32) torch.Size([308]) tensor([175, 133], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▋                                                                                                                                | 1531/5000 [03:21<07:37,  7.59it/s]torch.Size([720, 2, 187]) tensor([618, 720], dtype=torch.int32) torch.Size([141]) tensor([59, 82], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▋                                                                                                                                | 1532/5000 [03:21<07:36,  7.59it/s]torch.Size([413, 2, 187]) tensor([413, 199], dtype=torch.int32) torch.Size([69]) tensor([62,  7], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▋                                                                                                                                | 1533/5000 [03:21<07:36,  7.59it/s]torch.Size([268, 2, 187]) tensor([146, 268], dtype=torch.int32) torch.Size([43]) tensor([ 7, 36], dtype=torch.int32)
torch.Size([374, 2, 187]) tensor([374, 215], dtype=torch.int32) torch.Size([64]) tensor([45, 19], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▊                                                                                                                                | 1535/5000 [03:22<07:36,  7.59it/s]torch.Size([359, 2, 187]) tensor([258, 359], dtype=torch.int32) torch.Size([25]) tensor([17,  8], dtype=torch.int32)
torch.Size([432, 2, 187]) tensor([432, 354], dtype=torch.int32) torch.Size([90]) tensor([57, 33], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▊                                                                                                                                | 1537/5000 [03:22<07:35,  7.60it/s]torch.Size([696, 2, 187]) tensor([696, 205], dtype=torch.int32) torch.Size([152]) tensor([122,  30], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▉                                                                                                                                | 1538/5000 [03:22<07:35,  7.59it/s]torch.Size([229, 2, 187]) tensor([159, 229], dtype=torch.int32) torch.Size([27]) tensor([ 8, 19], dtype=torch.int32)
torch.Size([747, 2, 187]) tensor([747, 172], dtype=torch.int32) torch.Size([107]) tensor([97, 10], dtype=torch.int32)
training  :  31%|████████████████████████████████████████████████████████▉                                                                                                                                | 1540/5000 [03:22<07:35,  7.60it/s]torch.Size([645, 2, 187]) tensor([645, 166], dtype=torch.int32) torch.Size([90]) tensor([73, 17], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████                                                                                                                                | 1541/5000 [03:22<07:35,  7.60it/s]torch.Size([948, 2, 187]) tensor([353, 948], dtype=torch.int32) torch.Size([169]) tensor([  8, 161], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████                                                                                                                                | 1542/5000 [03:23<07:35,  7.59it/s]torch.Size([480, 2, 187]) tensor([480, 480], dtype=torch.int32) torch.Size([125]) tensor([82, 43], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████                                                                                                                                | 1543/5000 [03:23<07:35,  7.60it/s]torch.Size([860, 2, 187]) tensor([860, 184], dtype=torch.int32) torch.Size([136]) tensor([131,   5], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▏                                                                                                                               | 1544/5000 [03:23<07:35,  7.59it/s]torch.Size([560, 2, 187]) tensor([560, 239], dtype=torch.int32) torch.Size([79]) tensor([67, 12], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▏                                                                                                                               | 1545/5000 [03:23<07:34,  7.59it/s]torch.Size([1180, 2, 187]) tensor([1180,  156], dtype=torch.int32) torch.Size([190]) tensor([175,  15], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▏                                                                                                                               | 1546/5000 [03:23<07:35,  7.59it/s]torch.Size([906, 2, 187]) tensor([906, 220], dtype=torch.int32) torch.Size([169]) tensor([146,  23], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▏                                                                                                                               | 1547/5000 [03:23<07:35,  7.59it/s]torch.Size([204, 2, 187]) tensor([197, 204], dtype=torch.int32) torch.Size([19]) tensor([ 4, 15], dtype=torch.int32)
torch.Size([504, 2, 187]) tensor([179, 504], dtype=torch.int32) torch.Size([100]) tensor([12, 88], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▎                                                                                                                               | 1549/5000 [03:24<07:34,  7.59it/s]torch.Size([1176, 2, 187]) tensor([ 425, 1176], dtype=torch.int32) torch.Size([159]) tensor([ 29, 130], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▎                                                                                                                               | 1550/5000 [03:24<07:34,  7.59it/s]torch.Size([736, 2, 187]) tensor([736, 395], dtype=torch.int32) torch.Size([151]) tensor([104,  47], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▍                                                                                                                               | 1551/5000 [03:24<07:34,  7.59it/s]torch.Size([704, 2, 187]) tensor([704, 285], dtype=torch.int32) torch.Size([146]) tensor([94, 52], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▍                                                                                                                               | 1552/5000 [03:24<07:34,  7.59it/s]torch.Size([136, 2, 187]) tensor([125, 136], dtype=torch.int32) torch.Size([13]) tensor([5, 8], dtype=torch.int32)
torch.Size([293, 2, 187]) tensor([293, 233], dtype=torch.int32) torch.Size([53]) tensor([44,  9], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▍                                                                                                                               | 1554/5000 [03:24<07:33,  7.59it/s]torch.Size([784, 2, 187]) tensor([195, 784], dtype=torch.int32) torch.Size([124]) tensor([  9, 115], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▌                                                                                                                               | 1555/5000 [03:24<07:33,  7.59it/s]torch.Size([364, 2, 187]) tensor([364, 299], dtype=torch.int32) torch.Size([68]) tensor([26, 42], dtype=torch.int32)
torch.Size([278, 2, 187]) tensor([278, 163], dtype=torch.int32) torch.Size([36]) tensor([14, 22], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▌                                                                                                                               | 1557/5000 [03:25<07:33,  7.59it/s]torch.Size([336, 2, 187]) tensor([145, 336], dtype=torch.int32) torch.Size([43]) tensor([ 3, 40], dtype=torch.int32)
torch.Size([329, 2, 187]) tensor([329, 127], dtype=torch.int32) torch.Size([50]) tensor([47,  3], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▋                                                                                                                               | 1559/5000 [03:25<07:32,  7.60it/s]torch.Size([1439, 2, 187]) tensor([1439,  130], dtype=torch.int32) torch.Size([256]) tensor([248,   8], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▋                                                                                                                               | 1560/5000 [03:25<07:33,  7.59it/s]torch.Size([192, 2, 187]) tensor([127, 192], dtype=torch.int32) torch.Size([21]) tensor([ 9, 12], dtype=torch.int32)
torch.Size([1100, 2, 187]) tensor([ 146, 1100], dtype=torch.int32) torch.Size([143]) tensor([ 10, 133], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▊                                                                                                                               | 1562/5000 [03:25<07:33,  7.59it/s]torch.Size([499, 2, 187]) tensor([499, 220], dtype=torch.int32) torch.Size([67]) tensor([44, 23], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▊                                                                                                                               | 1563/5000 [03:25<07:32,  7.59it/s]torch.Size([1077, 2, 187]) tensor([ 250, 1077], dtype=torch.int32) torch.Size([144]) tensor([ 17, 127], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▊                                                                                                                               | 1564/5000 [03:26<07:32,  7.59it/s]torch.Size([690, 2, 187]) tensor([690, 191], dtype=torch.int32) torch.Size([141]) tensor([132,   9], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▉                                                                                                                               | 1565/5000 [03:26<07:32,  7.59it/s]torch.Size([702, 2, 187]) tensor([702, 620], dtype=torch.int32) torch.Size([190]) tensor([118,  72], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▉                                                                                                                               | 1566/5000 [03:26<07:32,  7.59it/s]torch.Size([1419, 2, 187]) tensor([1419,  173], dtype=torch.int32) torch.Size([193]) tensor([178,  15], dtype=torch.int32)
training  :  31%|█████████████████████████████████████████████████████████▉                                                                                                                               | 1567/5000 [03:26<07:32,  7.58it/s]torch.Size([1210, 2, 187]) tensor([1210,  248], dtype=torch.int32) torch.Size([201]) tensor([169,  32], dtype=torch.int32)
training  :  31%|██████████████████████████████████████████████████████████                                                                                                                               | 1568/5000 [03:26<07:32,  7.58it/s]torch.Size([546, 2, 187]) tensor([546, 127], dtype=torch.int32) torch.Size([68]) tensor([66,  2], dtype=torch.int32)
training  :  31%|██████████████████████████████████████████████████████████                                                                                                                               | 1569/5000 [03:27<07:32,  7.58it/s]torch.Size([504, 2, 187]) tensor([275, 504], dtype=torch.int32) torch.Size([105]) tensor([53, 52], dtype=torch.int32)
training  :  31%|██████████████████████████████████████████████████████████                                                                                                                               | 1570/5000 [03:27<07:32,  7.58it/s]torch.Size([455, 2, 187]) tensor([148, 455], dtype=torch.int32) torch.Size([68]) tensor([ 4, 64], dtype=torch.int32)
training  :  31%|██████████████████████████████████████████████████████████▏                                                                                                                              | 1571/5000 [03:27<07:32,  7.58it/s]torch.Size([281, 2, 187]) tensor([281, 122], dtype=torch.int32) torch.Size([22]) tensor([20,  2], dtype=torch.int32)
torch.Size([892, 2, 187]) tensor([892, 393], dtype=torch.int32) torch.Size([145]) tensor([100,  45], dtype=torch.int32)
training  :  31%|██████████████████████████████████████████████████████████▏                                                                                                                              | 1573/5000 [03:27<07:32,  7.58it/s]torch.Size([248, 2, 187]) tensor([248, 123], dtype=torch.int32) torch.Size([25]) tensor([13, 12], dtype=torch.int32)
torch.Size([1206, 2, 187]) tensor([ 569, 1206], dtype=torch.int32) torch.Size([256]) tensor([ 83, 173], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▎                                                                                                                              | 1575/5000 [03:27<07:31,  7.58it/s]torch.Size([562, 2, 187]) tensor([562, 534], dtype=torch.int32) torch.Size([139]) tensor([61, 78], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▎                                                                                                                              | 1576/5000 [03:27<07:31,  7.58it/s]torch.Size([243, 2, 187]) tensor([243, 205], dtype=torch.int32) torch.Size([55]) tensor([30, 25], dtype=torch.int32)
torch.Size([596, 2, 187]) tensor([596, 197], dtype=torch.int32) torch.Size([62]) tensor([60,  2], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▍                                                                                                                              | 1578/5000 [03:28<07:31,  7.58it/s]torch.Size([204, 2, 187]) tensor([204, 154], dtype=torch.int32) torch.Size([15]) tensor([6, 9], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▍                                                                                                                              | 1579/5000 [03:28<07:31,  7.58it/s]torch.Size([346, 2, 187]) tensor([346, 130], dtype=torch.int32) torch.Size([38]) tensor([33,  5], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▍                                                                                                                              | 1580/5000 [03:28<07:30,  7.58it/s]torch.Size([715, 2, 187]) tensor([715, 242], dtype=torch.int32) torch.Size([125]) tensor([103,  22], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▍                                                                                                                              | 1581/5000 [03:28<07:30,  7.58it/s]torch.Size([851, 2, 187]) tensor([851, 378], dtype=torch.int32) torch.Size([142]) tensor([91, 51], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▌                                                                                                                              | 1582/5000 [03:28<07:30,  7.58it/s]torch.Size([1045, 2, 187]) tensor([1045,  142], dtype=torch.int32) torch.Size([127]) tensor([118,   9], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▌                                                                                                                              | 1583/5000 [03:28<07:30,  7.58it/s]torch.Size([618, 2, 187]) tensor([618, 427], dtype=torch.int32) torch.Size([152]) tensor([119,  33], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▌                                                                                                                              | 1584/5000 [03:28<07:30,  7.58it/s]torch.Size([317, 2, 187]) tensor([183, 317], dtype=torch.int32) torch.Size([50]) tensor([15, 35], dtype=torch.int32)
torch.Size([330, 2, 187]) tensor([330, 121], dtype=torch.int32) torch.Size([39]) tensor([34,  5], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▋                                                                                                                              | 1586/5000 [03:29<07:30,  7.58it/s]torch.Size([785, 2, 187]) tensor([138, 785], dtype=torch.int32) torch.Size([72]) tensor([ 7, 65], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▋                                                                                                                              | 1587/5000 [03:29<07:30,  7.58it/s]torch.Size([373, 2, 187]) tensor([373, 178], dtype=torch.int32) torch.Size([64]) tensor([50, 14], dtype=torch.int32)
torch.Size([1010, 2, 187]) tensor([ 197, 1010], dtype=torch.int32) torch.Size([137]) tensor([ 24, 113], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▊                                                                                                                              | 1589/5000 [03:29<07:29,  7.58it/s]torch.Size([196, 2, 187]) tensor([165, 196], dtype=torch.int32) torch.Size([22]) tensor([ 7, 15], dtype=torch.int32)
torch.Size([1134, 2, 187]) tensor([1134,  134], dtype=torch.int32) torch.Size([169]) tensor([159,  10], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▊                                                                                                                              | 1591/5000 [03:29<07:29,  7.58it/s]torch.Size([743, 2, 187]) tensor([743, 191], dtype=torch.int32) torch.Size([159]) tensor([141,  18], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▉                                                                                                                              | 1592/5000 [03:30<07:29,  7.58it/s]torch.Size([416, 2, 187]) tensor([416, 294], dtype=torch.int32) torch.Size([78]) tensor([30, 48], dtype=torch.int32)
torch.Size([327, 2, 187]) tensor([327, 218], dtype=torch.int32) torch.Size([74]) tensor([37, 37], dtype=torch.int32)
training  :  32%|██████████████████████████████████████████████████████████▉                                                                                                                              | 1594/5000 [03:30<07:29,  7.58it/s]torch.Size([1083, 2, 187]) tensor([1083,  169], dtype=torch.int32) torch.Size([169]) tensor([149,  20], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████                                                                                                                              | 1595/5000 [03:30<07:29,  7.58it/s]torch.Size([712, 2, 187]) tensor([712, 150], dtype=torch.int32) torch.Size([97]) tensor([83, 14], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████                                                                                                                              | 1596/5000 [03:30<07:29,  7.58it/s]torch.Size([552, 2, 187]) tensor([302, 552], dtype=torch.int32) torch.Size([98]) tensor([24, 74], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████                                                                                                                              | 1597/5000 [03:30<07:28,  7.58it/s]torch.Size([929, 2, 187]) tensor([330, 929], dtype=torch.int32) torch.Size([133]) tensor([ 28, 105], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▏                                                                                                                             | 1598/5000 [03:30<07:28,  7.58it/s]torch.Size([879, 2, 187]) tensor([321, 879], dtype=torch.int32) torch.Size([173]) tensor([ 66, 107], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▏                                                                                                                             | 1599/5000 [03:31<07:28,  7.58it/s]torch.Size([1319, 2, 187]) tensor([1143, 1319], dtype=torch.int32) torch.Size([314]) tensor([150, 164], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▏                                                                                                                             | 1600/5000 [03:31<07:28,  7.57it/s]torch.Size([795, 2, 187]) tensor([795, 133], dtype=torch.int32) torch.Size([99]) tensor([94,  5], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▏                                                                                                                             | 1601/5000 [03:31<07:28,  7.57it/s]torch.Size([162, 2, 187]) tensor([162, 141], dtype=torch.int32) torch.Size([19]) tensor([14,  5], dtype=torch.int32)
torch.Size([294, 2, 187]) tensor([220, 294], dtype=torch.int32) torch.Size([44]) tensor([14, 30], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▎                                                                                                                             | 1603/5000 [03:31<07:28,  7.58it/s]torch.Size([231, 2, 187]) tensor([196, 231], dtype=torch.int32) torch.Size([18]) tensor([15,  3], dtype=torch.int32)
torch.Size([623, 2, 187]) tensor([623, 554], dtype=torch.int32) torch.Size([175]) tensor([107,  68], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▍                                                                                                                             | 1605/5000 [03:31<07:27,  7.58it/s]torch.Size([454, 2, 187]) tensor([454, 150], dtype=torch.int32) torch.Size([51]) tensor([46,  5], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▍                                                                                                                             | 1606/5000 [03:31<07:27,  7.58it/s]torch.Size([778, 2, 187]) tensor([778, 413], dtype=torch.int32) torch.Size([129]) tensor([80, 49], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▍                                                                                                                             | 1607/5000 [03:32<07:27,  7.58it/s]torch.Size([197, 2, 187]) tensor([130, 197], dtype=torch.int32) torch.Size([24]) tensor([ 3, 21], dtype=torch.int32)
torch.Size([198, 2, 187]) tensor([198, 144], dtype=torch.int32) torch.Size([47]) tensor([27, 20], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▌                                                                                                                             | 1609/5000 [03:32<07:27,  7.58it/s]torch.Size([339, 2, 187]) tensor([339, 328], dtype=torch.int32) torch.Size([68]) tensor([37, 31], dtype=torch.int32)
torch.Size([319, 2, 187]) tensor([205, 319], dtype=torch.int32) torch.Size([63]) tensor([17, 46], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▌                                                                                                                             | 1611/5000 [03:32<07:26,  7.59it/s]torch.Size([1028, 2, 187]) tensor([ 433, 1028], dtype=torch.int32) torch.Size([212]) tensor([ 67, 145], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▋                                                                                                                             | 1612/5000 [03:32<07:26,  7.58it/s]torch.Size([298, 2, 187]) tensor([298, 236], dtype=torch.int32) torch.Size([54]) tensor([35, 19], dtype=torch.int32)
torch.Size([575, 2, 187]) tensor([272, 575], dtype=torch.int32) torch.Size([107]) tensor([31, 76], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▋                                                                                                                             | 1614/5000 [03:32<07:26,  7.59it/s]torch.Size([350, 2, 187]) tensor([350, 206], dtype=torch.int32) torch.Size([74]) tensor([47, 27], dtype=torch.int32)
torch.Size([1264, 2, 187]) tensor([1264,  708], dtype=torch.int32) torch.Size([253]) tensor([134, 119], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▊                                                                                                                             | 1616/5000 [03:33<07:26,  7.58it/s]torch.Size([786, 2, 187]) tensor([196, 786], dtype=torch.int32) torch.Size([134]) tensor([ 28, 106], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▊                                                                                                                             | 1617/5000 [03:33<07:26,  7.58it/s]torch.Size([327, 2, 187]) tensor([171, 327], dtype=torch.int32) torch.Size([70]) tensor([21, 49], dtype=torch.int32)
torch.Size([498, 2, 187]) tensor([390, 498], dtype=torch.int32) torch.Size([77]) tensor([34, 43], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▉                                                                                                                             | 1619/5000 [03:33<07:25,  7.58it/s]torch.Size([697, 2, 187]) tensor([366, 697], dtype=torch.int32) torch.Size([141]) tensor([ 35, 106], dtype=torch.int32)
training  :  32%|███████████████████████████████████████████████████████████▉                                                                                                                             | 1620/5000 [03:33<07:25,  7.58it/s]torch.Size([229, 2, 187]) tensor([229, 176], dtype=torch.int32) torch.Size([69]) tensor([25, 44], dtype=torch.int32)
THCudaCheck FAIL file=/home/jbaik/setup/pytorch/pytorch/aten/src/THC/generic/THCTensorCopy.cpp line=70 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/model.py", line 79, in train_epoch
    loss = self.loss(ys_hat, frame_lens, ys, label_lens)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/nn/modules/module.py", line 468, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 102, in forward
    torch.is_grad_enabled())
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/__init__.py", line 22, in forward
    want_gradient)
RuntimeError: warp_ctc error: cuda memcpy or memset failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 24, in <module>
    densenet_ctc.train(argv)
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/train.py", line 84, in train
    model.train_epoch(data_loaders["train"])
  File "/d1/jbaik/ics-asr/asr/densenet_ctc/model.py", line 84, in train_epoch
    print(ys_hat, frame_lens, ys, label_lens)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/tensor.py", line 57, in __repr__
    return torch._tensor_str._str(self)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_tensor_str.py", line 251, in _str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/_tensor_str.py", line 81, in __init__
    copy = torch.empty(tensor.size(), dtype=torch.float64).copy_(tensor).view(tensor.nelement())
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /home/jbaik/setup/pytorch/pytorch/aten/src/THC/generic/THCTensorCopy.cpp:70

I've printed activations.shape, act_lens, labels.shape, and label_lens, in that order. At the moment of the error, the activations have size [229, 2, 187] with lengths [229, 176], and the labels have size [69] with lengths [25, 44]. All of these tensors are small. My guess is that some memory in the C++ code is not being freed correctly, so memory management eventually fails as the loop keeps running. Do you have any ideas on how to debug a Python C++ extension?

t-vi commented 6 years ago

You can use GDB or valgrind if it is CPU memory (I think setting the environment variable DEBUG=1 works for PyTorch itself). PyTorch can report GPU memory consumption pretty well, or you could use the NVIDIA-provided tools. Does it work if you save the tensors and pass them in from a fresh process? Also, if you have a memory leak, you should be able to see it in nvidia-smi.
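A leak of the kind suspected here would show up as a steady climb in the memory reading from one iteration to the next. The following is a minimal sketch of such a check and is not part of warp-ctc or PyTorch: `MemoryWatch` and its parameters are hypothetical names, and the function that reads the memory figure is supplied by the caller (on a GPU, `torch.cuda.memory_allocated` is one candidate).

```python
class MemoryWatch:
    """Track a memory reading across training iterations (hypothetical helper).

    `read_bytes` is any zero-argument callable returning the number of bytes
    currently in use. It is an assumption of this sketch that the caller
    passes something suitable, e.g. torch.cuda.memory_allocated.
    """

    def __init__(self, read_bytes, warmup=10):
        self.read_bytes = read_bytes
        self.warmup = warmup      # iterations to wait before fixing a baseline
        self.baseline = None      # reading taken at the end of warm-up
        self.last = None          # most recent reading
        self.peak = 0             # largest reading seen so far
        self.steps = 0

    def step(self):
        """Take one reading; call this once per training iteration."""
        used = self.read_bytes()
        self.steps += 1
        self.last = used
        self.peak = max(self.peak, used)
        if self.steps == self.warmup:
            self.baseline = used
        return used

    def growth(self):
        """Bytes gained since the end of warm-up, or None while warming up."""
        if self.baseline is None:
            return None
        return self.last - self.baseline
```

Call `step()` once per iteration and log `growth()` every few hundred steps. Because the batches in this log vary a lot in length, single readings will fluctuate; only a sustained upward trend points at a leak.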

Three things I found out:

One minor thing that is certainly wrong with your example is that you have blank (0) labels: they are part of the vocab for the activations, but I don't think you should have them in your targets.
I noticed that there is a Baidu-originated CUDA 9 patch for warp-ctc itself that does some things (the cmake stuff is uninteresting, but the C/CUDA code might be relevant). Critically, it sets the mask to 0 rather than 0xffffffff for the new _sync operations. The handling of small numbers might also be relevant for correct results. https://github.com/baidu-research/warp-ctc/pull/118/files
My reason for suspecting the sizes when it gives an "unknown error" is that the only place I found that return code was in create_metadata_and_choose_config. I haven't looked at the metadata creation in detail, though.
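The points about blank labels and sizes can be checked on the Python side before the batch ever reaches the extension. Below is a rough sketch under stated assumptions: `check_ctc_batch` is a hypothetical helper, it takes plain ints and lists (for example from `tensor.tolist()`), and it assumes the labels of all batch items are concatenated into one flat sequence, as the printed shapes in the log suggest. The default `blank=0` matches the `blank_label=0` seen in the thread.

```python
def check_ctc_batch(num_frames, batch_size, act_lens, labels, label_lens, blank=0):
    """Return a list of problems found in one CTC batch (empty if none).

    num_frames, batch_size: first two dimensions of the activations.
    act_lens, label_lens:   one length per batch item.
    labels:                 all target labels, concatenated.
    """
    problems = []
    if len(act_lens) != batch_size or len(label_lens) != batch_size:
        problems.append("length tensors must have one entry per batch item")
    if any(n <= 0 or n > num_frames for n in act_lens):
        problems.append("an activation length is outside 1..num_frames")
    if sum(label_lens) != len(labels):
        problems.append("sum(label_lens) != number of concatenated labels")
    if blank in labels:
        problems.append("targets contain the blank label")
    # Necessary (not sufficient) condition: CTC cannot align a label
    # sequence that is longer than its input sequence.
    if any(l > t for t, l in zip(act_lens, label_lens)):
        problems.append("a label sequence is longer than its input")
    return problems
```

For the failing batch above, the call would be `check_ctc_batch(229, 2, [229, 176], labels, [25, 44])` with `labels` being the 69 target values. The sizes alone pass every check, so any problem reported would come from the label contents.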

jinserk commented 6 years ago

I found this webpage: https://github.com/apache/incubator-mxnet/issues/7445

They discussed the cuDNN 7 CTC function and, interestingly, said that WarpCTC limits the label length to 639.
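Whether that limit matters for a given dataset is easy to test. Here is a small sketch, assuming the figure of 639 quoted in the linked discussion is accurate (it has not been verified against the warp-ctc source here); `labels_over_limit` and `ASSUMED_MAX_LABEL_LEN` are hypothetical names.

```python
# Figure quoted in the linked MXNet issue; treat it as an unverified assumption.
ASSUMED_MAX_LABEL_LEN = 639


def labels_over_limit(label_lens, limit=ASSUMED_MAX_LABEL_LEN):
    """Return (index, length) pairs for label sequences longer than `limit`."""
    return [(i, n) for i, n in enumerate(label_lens) if n > limit]
```

The label lengths of the failing batch, [25, 44], are far below that figure, which suggests this limit is not what triggers the error in this thread.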

t-vi commented 6 years ago

I think you'd hit the numerical stability issues pretty hard, too. But I understand CuDNN has an even shorter limit?

jinserk commented 6 years ago

I'm not sure. But one issue I'm facing is that, when I use the CPU, the CTC loss always throws a segmentation fault. There is obviously some memory management issue, but I couldn't find it. Running GDB with the following command gives:

$ DEBUG=1 gdb --args /home/jbaik/.pyenv/versions/3.6.5/bin/python train.py densenet_ctc --use-cuda --batch-size 2
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/jbaik/.pyenv/versions/3.6.5/bin/python3.6...done.
(gdb) r
Starting program: /home/jbaik/.pyenv/versions/3.6.5/bin/python train.py densenet_ctc --use-cuda --batch-size 2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/numpy/core/../.libs/libgfortran-ed201abd.so.3.0.0
[New Thread 0x7fffed166700 (LWP 26137)]
[New Thread 0x7fffec965700 (LWP 26138)]
[New Thread 0x7fffea164700 (LWP 26139)]
[New Thread 0x7fffe7963700 (LWP 26140)]
[New Thread 0x7fffe5162700 (LWP 26141)]
[New Thread 0x7fffe2961700 (LWP 26142)]
[New Thread 0x7fffe0160700 (LWP 26143)]
[New Thread 0x7fffdd95f700 (LWP 26144)]
[New Thread 0x7fffdb15e700 (LWP 26145)]
[New Thread 0x7fffd895d700 (LWP 26146)]
[New Thread 0x7fffd615c700 (LWP 26147)]
[New Thread 0x7fffd395b700 (LWP 26148)]
[New Thread 0x7fffd115a700 (LWP 26149)]
[New Thread 0x7fffce959700 (LWP 26150)]
[New Thread 0x7fffcc158700 (LWP 26151)]
[New Thread 0x7fffc9957700 (LWP 26152)]
[New Thread 0x7fffc7156700 (LWP 26153)]
[New Thread 0x7fffc4955700 (LWP 26154)]
[New Thread 0x7fffc2154700 (LWP 26155)]
[New Thread 0x7fffbf953700 (LWP 26156)]
[New Thread 0x7fffbd152700 (LWP 26157)]
[New Thread 0x7fffba951700 (LWP 26158)]
[New Thread 0x7fffb8150700 (LWP 26159)]
[New Thread 0x7fffb594f700 (LWP 26160)]
[New Thread 0x7fffb314e700 (LWP 26161)]
[New Thread 0x7fffb094d700 (LWP 26162)]
[New Thread 0x7fffae14c700 (LWP 26163)]
[New Thread 0x7fffab94b700 (LWP 26164)]
[New Thread 0x7fffa914a700 (LWP 26165)]
[New Thread 0x7fffa6949700 (LWP 26166)]
[New Thread 0x7fffa4148700 (LWP 26167)]
[New Thread 0x7fffa1947700 (LWP 26168)]
[New Thread 0x7fff9f146700 (LWP 26169)]
[New Thread 0x7fff9c945700 (LWP 26170)]
[New Thread 0x7fff9a144700 (LWP 26171)]
[New Thread 0x7fff97943700 (LWP 26172)]
[New Thread 0x7fff95142700 (LWP 26173)]
[New Thread 0x7fff92941700 (LWP 26174)]
[New Thread 0x7fff90140700 (LWP 26175)]
[Thread 0x7fffe5162700 (LWP 26141) exited]
[Thread 0x7fff92941700 (LWP 26174) exited]
[Thread 0x7fffa6949700 (LWP 26166) exited]
[Thread 0x7fffdd95f700 (LWP 26144) exited]
[Thread 0x7fffb314e700 (LWP 26161) exited]
[Thread 0x7fff97943700 (LWP 26172) exited]
[Thread 0x7fffcc158700 (LWP 26151) exited]
[Thread 0x7fff95142700 (LWP 26173) exited]
[Thread 0x7fffc2154700 (LWP 26155) exited]
[Thread 0x7fff90140700 (LWP 26175) exited]
[Thread 0x7fffa914a700 (LWP 26165) exited]
[Thread 0x7fff9a144700 (LWP 26171) exited]
[Thread 0x7fffe2961700 (LWP 26142) exited]
[Thread 0x7fff9c945700 (LWP 26170) exited]
[Thread 0x7fffa1947700 (LWP 26168) exited]
[Thread 0x7fff9f146700 (LWP 26169) exited]
[Thread 0x7fffce959700 (LWP 26150) exited]
[Thread 0x7fffa4148700 (LWP 26167) exited]
[Thread 0x7fffae14c700 (LWP 26163) exited]
[Thread 0x7fffab94b700 (LWP 26164) exited]
[Thread 0x7fffba951700 (LWP 26158) exited]
[Thread 0x7fffb094d700 (LWP 26162) exited]
[Thread 0x7fffd615c700 (LWP 26147) exited]
[Thread 0x7fffb594f700 (LWP 26160) exited]
[Thread 0x7fffd395b700 (LWP 26148) exited]
[Thread 0x7fffb8150700 (LWP 26159) exited]
[Thread 0x7fffd895d700 (LWP 26146) exited]
[Thread 0x7fffbd152700 (LWP 26157) exited]
[Thread 0x7fffed166700 (LWP 26137) exited]
[Thread 0x7fffbf953700 (LWP 26156) exited]
[Thread 0x7fffc7156700 (LWP 26153) exited]
[Thread 0x7fffc4955700 (LWP 26154) exited]
[Thread 0x7fffe7963700 (LWP 26140) exited]
[Thread 0x7fffc9957700 (LWP 26152) exited]
[Thread 0x7fffec965700 (LWP 26138) exited]
[Thread 0x7fffd115a700 (LWP 26149) exited]
[Thread 0x7fffdb15e700 (LWP 26145) exited]
[Thread 0x7fffe0160700 (LWP 26143) exited]
[Thread 0x7fffea164700 (LWP 26139) exited]
Detaching after fork from child process 26176.
Missing separate debuginfo for /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/Pillow-5.1.0-py3.6-linux-x86_64.egg/PIL/.libs/libz-a147dcb0.so.1.2.3
Detaching after fork from child process 26290.
Detaching after fork from child process 26291.
warning: the debug information found in "/usr/lib/debug//lib64/libsox.so.2.0.1.debug" does not match "/lib64/libsox.so.2" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug/usr/lib64/libsox.so.2.0.1.debug" does not match "/lib64/libsox.so.2" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug//usr/lib64/libsox.so.2.0.1.debug" does not match "/lib64/libsox.so.2" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug/usr/lib64//libsox.so.2.0.1.debug" does not match "/lib64/libsox.so.2" (CRC mismatch).

Detaching after fork from child process 26691.
begins logging to file: /d1/jbaik/ics-asr/logs/train.log
2018-06-15 11:09:13,058 [INFO] PyTorch version: 0.5.0a0+d769074
2018-06-15 11:09:13,058 [INFO] Training started with command: train.py densenet_ctc --use-cuda --batch-size 2
2018-06-15 11:09:13,058 [INFO] args: data_path=data/aspire num_workers=4 num_epochs=1000 batch_size=2 init_lr=1e-05 use_cuda=True seed=None log_dir=./logs model_prefix=dense_aspire continue_from=None
2018-06-15 11:09:13,059 [INFO] using cuda
[New Thread 0x7fff90140700 (LWP 26729)]
[New Thread 0x7fff92941700 (LWP 26730)]
[New Thread 0x7fff95142700 (LWP 26731)]
2018-06-15 11:09:14,979 [INFO] loading dataset manifest /d1/jbaik/ics-asr/data/aspire/train.csv ...
2018-06-15 11:09:20,665 [INFO] 10000 entries, 4051335 frames are loaded.
2018-06-15 11:09:21,208 [INFO] loading dataset manifest /d1/jbaik/ics-asr/data/aspire/dev.csv ...
2018-06-15 11:09:21,216 [INFO] 100 entries, 27800 frames are loaded.
[New Thread 0x7fff97943700 (LWP 26984)]
Detaching after fork from child process 26985.
Detaching after fork from child process 26986.
Detaching after fork from child process 26987.
Detaching after fork from child process 26988.
[New Thread 0x7fff3cede700 (LWP 26989)]
[New Thread 0x7fff311a1700 (LWP 26990)]
[New Thread 0x7fff309a0700 (LWP 26991)]
[New Thread 0x7fff1dfff700 (LWP 26992)]
[New Thread 0x7fff1d7fe700 (LWP 26993)]
training  :   0%|                                                                                                                                                                                                    | 0/5000 [00:00<?, ?it/s]
Program received signal SIGSEGV, Segmentation fault.
torch::autograd::VariableType::checked_cast_variable (t=..., name=name@entry=0x7fff83abf299 "self", pos=pos@entry=0) at torch/csrc/autograd/generated/VariableType.cpp:225
225   if (!t.defined()) {
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-222.el7.x86_64 gsm-1.0.13-11.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64 libcom_err-1.42.9-12.el7_5.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64 libgfortran-4.8.5-28.el7_5.1.x86_64 libgomp-4.8.5-28.el7_5.1.x86_64 libpng-1.5.13-7.el7_2.x86_64 libquadmath-4.8.5-28.el7_5.1.x86_64 libselinux-2.5-12.el7.x86_64 libstdc++-4.8.5-28.el7_5.1.x86_64 libtool-ltdl-2.4.2-22.el7_3.x86_64 libuuid-2.23.2-52.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64 openblas-0.2.20-6.el7.x86_64 openssl-libs-1.0.2k-12.el7.x86_64 pcre-8.32-17.el7.x86_64 sox-14.4.1-6.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  torch::autograd::VariableType::checked_cast_variable (t=..., name=name@entry=0x7fff83abf299 "self", pos=pos@entry=0) at torch/csrc/autograd/generated/VariableType.cpp:225
#1  0x00007fff8338d6f9 in torch::autograd::VariableType::unpack (t=..., name=name@entry=0x7fff83abf299 "self", pos=pos@entry=0) at torch/csrc/autograd/generated/VariableType.cpp:235
#2  0x00007fff83445e8f in torch::autograd::VariableType::softmax (this=0x7fffffffbbf0, self=..., dim=1457826560) at torch/csrc/autograd/generated/VariableType.cpp:26840
#3  0x00007fff435dcde1 in size (dim=1, this=0x7fffffffbbf0) at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/ATen/TensorMethods.h:1179
#4  ctc (activations=..., input_lengths=..., labels=..., label_lengths=..., blank_label=0, want_gradients=true) at src/_warpctc.cpp:54
#5  0x00007fff435e9f4a in call_impl<std::tuple<at::Tensor, at::Tensor>, std::tuple<at::Tensor, at::Tensor> (*&)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool), 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, pybind11::detail::void_type> (
    f=<optimized out>, this=0x7fffffffbc40) at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:1866
#6  call<std::tuple<at::Tensor, at::Tensor>, pybind11::detail::void_type, std::tuple<at::Tensor, at::Tensor> (*&)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool)> (f=<optimized out>, this=0x7fffffffbc40)
    at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:1843
#7  operator() (call=..., __closure=0x0) at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:155
#8  _ZZN8pybind1112cpp_function10initializeIRPFSt5tupleIIN2at6TensorES4_EES4_S4_S4_S4_ibES5_IS4_S4_S4_S4_ibEINS_4nameENS_5scopeENS_7siblingEA4_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESQ_ (call=...)
    at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:132
#9  0x00007fff435e9f4a in cpp_function<std::tuple<at::Tensor, at::Tensor>, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool, pybind11::name, pybind11::scope, pybind11::sibling, char [4]> (f=<optimized out>, this=<optimized out>)
   from /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/warpctc-0.0.0-py3.6-linux-x86_64.egg/warpctc/_warpctc.cpython-36m-x86_64-linux-gnu.so
#10 def<std::tuple<at::Tensor, at::Tensor> (*)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool), char [4]> (f=<optimized out>, name_=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>, 
    this=<optimized out>) at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:806
#11 pybind11_init__warpctc (m=...) at src/_warpctc.cpp:137
#12 void pybind11::cpp_function::initialize<std::tuple<at::Tensor, at::Tensor> (*&)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool), std::tuple<at::Tensor, at::Tensor>, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(std::tuple<at::Tensor, at::Tensor> (*&)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool), std::tuple<at::Tensor, at::Tensor> (*)(at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, bool), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) () at src/_warpctc.cpp:135
#13 0x00007fff435e7ba1 in pybind11::cpp_function::dispatcher (self=<optimized out>, args_in=0x7ffeb1a73468, kwargs_in=0x0) at /home/jbaik/.pyenv/versions/3.6.5/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:619
#14 0x00007ffff79698ad in _PyCFunction_FastCallDict (func_obj=func_obj@entry=0x7fffb7adb900, args=args@entry=0x56e4bd58, nargs=<optimized out>, kwargs=kwargs@entry=0x0) at Objects/methodobject.c:231
#15 0x00007ffff7969b35 in _PyCFunction_FastCallKeywords (func=func@entry=0x7fffb7adb900, stack=stack@entry=0x56e4bd58, nargs=<optimized out>, kwnames=kwnames@entry=0x0) at Objects/methodobject.c:294
#16 0x00007ffff7a02afa in call_function (pp_stack=pp_stack@entry=0x7fffffffc100, oparg=oparg@entry=6, kwnames=kwnames@entry=0x0) at Python/ceval.c:4824
#17 0x00007ffff7a074a6 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#18 0x00007ffff7a026fe in _PyEval_EvalCodeWithName (_co=0x7fffb02ec8a0, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7ffee151a1e0, argcount=10, kwnames=kwnames@entry=0x0, kwargs=kwargs@entry=0x0, 
    kwcount=kwcount@entry=0, kwstep=kwstep@entry=2, defs=defs@entry=0x7fffb02f11c0, defcount=defcount@entry=5, kwdefs=kwdefs@entry=0x0, closure=closure@entry=0x0, name=name@entry=0x0, qualname=qualname@entry=0x0) at Python/ceval.c:4153
#19 0x00007ffff7a02d2d in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7ffee151a1e0, argcount=<optimized out>, kws=kws@entry=0x0, kwcount=kwcount@entry=0, 
    defs=defs@entry=0x7fffb02f11c0, defcount=defcount@entry=5, kwdefs=0x0, closure=0x0) at Python/ceval.c:4174
#20 0x00007ffff7942c96 in function_call (func=0x7fffb02ee8c8, arg=0x7ffee151a1c8, kw=0x0) at Objects/funcobject.c:604
#21 0x00007ffff79108aa in PyObject_Call (func=0x7fffb02ee8c8, args=<optimized out>, kwargs=<optimized out>) at Objects/abstract.c:2261
#22 0x00007fff83375408 in THPFunction_apply (cls=0x6d7098, inputs=0x7ffefad71228) at torch/csrc/autograd/python_function.cpp:738
#23 0x00007ffff796980e in _PyCFunction_FastCallDict (func_obj=func_obj@entry=0x7ffed97c2630, args=args@entry=0x56e4bb00, nargs=<optimized out>, kwargs=kwargs@entry=0x0) at Objects/methodobject.c:234
#24 0x00007ffff7969b35 in _PyCFunction_FastCallKeywords (func=func@entry=0x7ffed97c2630, stack=stack@entry=0x56e4bb00, nargs=<optimized out>, kwnames=kwnames@entry=0x0) at Objects/methodobject.c:294
#25 0x00007ffff7a02afa in call_function (pp_stack=pp_stack@entry=0x7fffffffc720, oparg=oparg@entry=9, kwnames=kwnames@entry=0x0) at Python/ceval.c:4824
#26 0x00007ffff7a074a6 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#27 0x00007ffff7a01d90 in _PyFunction_FastCall (co=<optimized out>, args=<optimized out>, nargs=5, globals=<optimized out>) at Python/ceval.c:4906
#28 0x00007ffff7a0b4a6 in _PyFunction_FastCallDict (func=func@entry=0x7fffb02eeae8, args=args@entry=0x7fffffffc940, nargs=5, kwargs=kwargs@entry=0x7ffed97c29d8) at Python/ceval.c:5008
#29 0x00007ffff7910afe in _PyObject_FastCallDict (func=func@entry=0x7fffb02eeae8, args=args@entry=0x7fffffffc940, nargs=nargs@entry=5, kwargs=kwargs@entry=0x7ffed97c29d8) at Objects/abstract.c:2310
#30 0x00007ffff7910bee in _PyObject_Call_Prepend (func=0x7fffb02eeae8, obj=0x7fffb01a3c88, args=0x7ffefb0c5ef8, kwargs=0x7ffed97c29d8) at Objects/abstract.c:2373
#31 0x00007ffff79108aa in PyObject_Call (func=0x7ffee151ef08, args=<optimized out>, kwargs=<optimized out>) at Objects/abstract.c:2261
#32 0x00007ffff7a06dff in do_call_core (kwdict=0x7ffed97c29d8, callargs=0x7ffefb0c5ef8, func=0x7ffee151ef08) at Python/ceval.c:5093
#33 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3391
#34 0x00007ffff7a026fe in _PyEval_EvalCodeWithName (_co=_co@entry=0x7fffe71a50c0, globals=globals@entry=0x7fffe719fa68, locals=locals@entry=0x0, args=args@entry=0x7fffffffcd40, argcount=argcount@entry=5, kwnames=kwnames@entry=0x0, 
    kwargs=kwargs@entry=0x0, kwcount=kwcount@entry=0, kwstep=kwstep@entry=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x7ffff7f86170, qualname=0x7fffe71a28b0) at Python/ceval.c:4153
#35 0x00007ffff7a0b345 in _PyFunction_FastCallDict (func=func@entry=0x7fffe4ae18c8, args=args@entry=0x7fffffffcd40, nargs=5, kwargs=kwargs@entry=0x0) at Python/ceval.c:5057
#36 0x00007ffff7910afe in _PyObject_FastCallDict (func=func@entry=0x7fffe4ae18c8, args=args@entry=0x7fffffffcd40, nargs=nargs@entry=5, kwargs=kwargs@entry=0x0) at Objects/abstract.c:2310
#37 0x00007ffff7910bee in _PyObject_Call_Prepend (func=0x7fffe4ae18c8, obj=0x7fffb01a3c88, args=0x7ffefb0c5f48, kwargs=0x0) at Objects/abstract.c:2373
#38 0x00007ffff79108aa in PyObject_Call (func=0x7fffbcab09c8, args=<optimized out>, kwargs=<optimized out>) at Objects/abstract.c:2261
#39 0x00007ffff7986701 in slot_tp_call (self=self@entry=0x7fffb01a3c88, args=args@entry=0x7ffefb0c5f48, kwds=kwds@entry=0x0) at Objects/typeobject.c:6194
#40 0x00007ffff7910a0b in _PyObject_FastCallDict (func=0x7fffb01a3c88, args=<optimized out>, nargs=<optimized out>, kwargs=0x0) at Objects/abstract.c:2331
#41 0x00007ffff7a02868 in call_function (pp_stack=pp_stack@entry=0x7fffffffcfe0, oparg=oparg@entry=4, kwnames=kwnames@entry=0x0) at Python/ceval.c:4848
#42 0x00007ffff7a074a6 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#43 0x00007ffff7a01d90 in _PyFunction_FastCall (co=<optimized out>, args=<optimized out>, nargs=2, globals=<optimized out>) at Python/ceval.c:4906
#44 0x00007ffff7a02cb4 in fast_function (kwnames=0x0, nargs=<optimized out>, stack=<optimized out>, func=0x7fffb02f9400) at Python/ceval.c:4941
#45 call_function (pp_stack=pp_stack@entry=0x7fffffffd220, oparg=oparg@entry=1, kwnames=kwnames@entry=0x0) at Python/ceval.c:4845
#46 0x00007ffff7a074a6 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#47 0x00007ffff7a026fe in _PyEval_EvalCodeWithName (_co=0x7fffb02c1e40, globals=globals@entry=0x7fffbf1760d8, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x0, kwargs=0x65b450, kwcount=0, kwstep=kwstep@entry=1, 
    defs=0x0, defcount=defcount@entry=0, kwdefs=kwdefs@entry=0x0, closure=0x0, name=name@entry=0x7ffff7e036c0, qualname=0x7ffff7e036c0) at Python/ceval.c:4153
#48 0x00007ffff7a02a12 in fast_function (kwnames=0x0, nargs=<optimized out>, stack=<optimized out>, func=0x7fffb02ee730) at Python/ceval.c:4965
#49 call_function (pp_stack=pp_stack@entry=0x7fffffffd4d0, oparg=oparg@entry=1, kwnames=kwnames@entry=0x0) at Python/ceval.c:4845
#50 0x00007ffff7a074a6 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#51 0x00007ffff7a026fe in _PyEval_EvalCodeWithName (_co=_co@entry=0x7ffff7ef7ae0, globals=globals@entry=0x7ffff7f3a240, locals=locals@entry=0x7ffff7f3a240, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0, 
    kwargs=kwargs@entry=0x0, kwcount=kwcount@entry=0, kwstep=kwstep@entry=2, defs=defs@entry=0x0, defcount=defcount@entry=0, kwdefs=kwdefs@entry=0x0, closure=closure@entry=0x0, name=name@entry=0x0, qualname=qualname@entry=0x0)
---Type <return> to continue, or q <return> to quit---

The backtrace is well beyond my knowledge. I'm just guessing that some operation in the warpctc module is not compatible with the new PyTorch, or something else.

t-vi commented 6 years ago

Hmm. I'm decidedly not seeing segfaults. Is that the same script you use? Did you use the exact same compiler between PyTorch and warp-ctc? Mixing versions gets you funny results. (I think you need to pass DEBUG=1 during compilation.)
