liuhu-bigeye / enctc.crnn

Project for Connectionist Temporal Classification with Maximum Entropy Regularization.
MIT License

`en_ctc` can not work fine with pytorch 1.0 #1

Closed the-butterfly closed 5 years ago

the-butterfly commented 5 years ago

When using PyTorch 1.0, the loss function may fail, as in the following:

from seg_ctc_ent_log_fb import seg_ctc_ent_cost as seg_ctc_ent_cost
import torch

txt1 = torch.randint(1,5825,(640,),dtype=torch.int32)
preds = torch.rand(35,64,5825).cuda()
len1 = torch.tensor([10]*64, dtype=torch.int32)
psize1 = torch.tensor([preds.size(0)]*64, dtype=torch.int32)
H, cost = seg_ctc_ent_cost(preds, txt1, psize1, len1, uni_rate=1.5)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-11-732ba78f19ab> in <module>()
----> 1 H, cost = seg_ctc_ent_cost(preds, txt1, psize1, len1, uni_rate=1.5)

~/Workspace/Remote/TestChars/pytorch_ctc/seg_ctc_ent_log_fb.py in seg_ctc_ent_cost(out, targets, sizes, target_sizes, uni_rate)
     57         H, costs = loss_func(pred.cpu(), sizes.data.type(longX), target, target_sizes.data.type(longX), uniform_mask)
     58     else:
---> 59         H, costs = loss_func(pred, sizes.data.type(longX), target, target_sizes.data.type(longX), uniform_mask)
     60     return H.sum(), costs.sum()
     61 

~/Workspace/Remote/TestChars/pytorch_ctc/seg_ctc_ent_log_fb.py in seg_ctc_ent_loss_log(pred, pred_len, token, token_len, uniform_mask, blank)
    164                                            + betas[2:t+1, :, :, -1][:, te_b, 1+te_u].clone(),
    165                                            uniform_mask[-t+1:][:, te_b],
--> 166                                            dim=0).clone() if t >= 2 else eps_nan
    167             alphas_t_ent[te_b, 1+te_u] = log_sum_exp(
    168                                             log_sum_exp_axis(pred_blank[1:t][:, te_b] + \

~/Workspace/Remote/TestChars/pytorch_ctc/m_ctc.py in log_sum_exp_axis(a, uniform_mask, dim)
     99     eps_nan = -1e8
    100     eps = 1e-26
--> 101     _max = T.max(a, dim=dim)[0]
    102 
    103     if not uniform_mask is None:

RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity

But it works fine in PyTorch v0.4.1. Can anyone help? @liuhu-bigeye @jin-s13

the-butterfly commented 5 years ago

emmm…… The problem is due to the changed behavior of torch.nonzero() in PyTorch 1.0. In seg_ctc_ent_log_fb.py, line 119, replace the check with:

# if len(token_equals.size()) == 2:   previous
if token_equals.size(0) != 0:           # compatible for pytorch 1.0