Open cathoderaymission opened 5 years ago
vocab = ["A", "B", "C", "D", " "]
decoder = CTCBeamDecoder(vocab, beam_width=5, blank_id=vocab.index(' '), log_probs_input=True)
decoder.decode(out)

py3/lib/python3.7/site-packages/ctcdecode/__init__.py in decode(self, probs, seq_lens)
     38         ctc_decode.paddle_beam_decode(probs, seq_lens, self._labels, self._num_labels, self._beam_width, self._num_processes,
     39                                       self._cutoff_prob, self.cutoff_top_n, self._blank_id, self._log_probs,
---> 40                                       output, timesteps, scores, out_seq_len)
     41
     42         return output, scores, timesteps, out_seq_len

RuntimeError: Invalid UTF-8
Here out is a tensor of shape [batch, seq, probs], e.g. torch.Size([400, 300, 5]).
I've tried smaller beam widths and using one sample instead of an entire batch, and I still can't get this to work.
Nothing in the code really provides much of an indication as to why I'm getting this error.
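For reference, here is a minimal sketch of the basic consistency checks one might run on these inputs before calling decode. The helper check_decoder_inputs is hypothetical and not part of ctcdecode; it only assumes that the last dimension of out should match the vocabulary size, that blank_id should be a valid index into vocab, and that each label should be encodable as UTF-8. It is not a diagnosis of the error.

```python
def check_decoder_inputs(vocab, out_shape, blank_id):
    """Hypothetical helper (not part of ctcdecode): return a list of problems found."""
    problems = []

    # The post describes `out` as [batch, seq, probs].
    if len(out_shape) != 3:
        problems.append(f"expected 3 dims [batch, seq, probs], got {len(out_shape)}")
    elif out_shape[-1] != len(vocab):
        problems.append(
            f"last dim is {out_shape[-1]} but vocab has {len(vocab)} labels"
        )

    # Assumption: blank_id must index into vocab.
    if not 0 <= blank_id < len(vocab):
        problems.append(f"blank_id {blank_id} is outside vocab of size {len(vocab)}")

    # Assumption: every label must be encodable as UTF-8.
    for i, label in enumerate(vocab):
        try:
            label.encode("utf-8")
        except UnicodeEncodeError:
            problems.append(f"label at index {i} is not valid UTF-8")

    return problems


vocab = ["A", "B", "C", "D", " "]
problems = check_decoder_inputs(vocab, (400, 300, 5), vocab.index(" "))
print(problems)  # prints []
```

The values from the snippet above pass all three checks, so none of them explains the error.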