ottokart / punctuator2

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text
http://bark.phon.ioc.ee/punctuator
MIT License
659 stars, 195 forks

can not execute error_calculator.py #26

Closed kadir-gunel closed 6 years ago

kadir-gunel commented 6 years ago

Hello,

I am getting an error when I try to run the error_calculator.py script:

Traceback (most recent call last):
  File "error_calculator.py", line 147, in <module>
    compute_error([target_path], [predicted_path])
  File "error_calculator.py", line 38, in compute_error
    target_stream = target.read().split()
  File "/usr/lib/python2.7/codecs.py", line 686, in read
    return self.reader.read(size)
  File "/usr/lib/python2.7/codecs.py", line 492, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

I tried to change the code from:

with codecs.open(target_path, 'r', encoding='utf-8') as target, codecs.open(predicted_path, 'r', encoding='utf-8') as predicted:

to

with codecs.open(target_path, 'r', encoding='utf-8') as target, codecs.open(predicted_path, 'r', encoding='utf-8', errors='ignore') as predicted:

but this time I get:

    " ".join(predicted_stream[p_i-2:p_i+2]))
AssertionError: <exception str() failed>

Any suggestions?
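The UnicodeDecodeError above says the very first byte of the file being read is 0x80, which can never start a UTF-8 sequence. One way to narrow this down is to look at the raw leading bytes of each input before handing it to the script. The sketch below is only a diagnostic aid, not part of punctuator2; the function name `leading_bytes_are_utf8` is hypothetical. As one possibility among several, a file that starts with 0x80 may be binary rather than text (Python pickles written with protocol 2 or later start with that byte).

```python
import codecs
import io


def leading_bytes_are_utf8(path, n=16):
    """Return (head, ok): the first n raw bytes and whether they are valid UTF-8.

    Hypothetical helper for diagnosis only; it does not modify the file.
    """
    with io.open(path, 'rb') as f:
        head = f.read(n)
    # An incremental decoder with final=False tolerates a multi-byte
    # character that was cut off at the end of the n-byte window,
    # but still raises on bytes that are invalid anywhere in UTF-8.
    decoder = codecs.getincrementaldecoder('utf-8')(errors='strict')
    try:
        decoder.decode(head, final=False)
        ok = True
    except UnicodeDecodeError:
        ok = False
    return head, ok
```

Calling it on both the target and the predicted path and printing `head` shows which file is the problem and what it actually begins with. If a file turns out not to be text, passing errors='ignore' would only hide that and could leave the two token streams out of step.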

ashutosh486 commented 3 years ago

Hello, I am getting this error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-19-047b7b92f34b> in <module>
      1 target_path = 'data/gt.txt'
      2 predicted_path = 'result/pred.txt'
----> 3 compute_error([target_path], [predicted_path])

<ipython-input-18-be6259310758> in compute_error(target_paths, predicted_paths)
     81                         target_stream[t_i], t_i, predicted_stream[p_i], p_i,
     82                         " ".join(target_stream[t_i-2:t_i+2]),
---> 83                         " ".join(predicted_stream[p_i-2:p_i+2]))
     84 
     85                 t_i += 1

AssertionError: File: data/gt.txt 
Error: tape (5) != tape. (5) 
Target context: start of tape labeled 
Predicted context: start of tape. labeled

I am providing two text files: one with the ground truth and the other with a prediction from punctuate. How can I resolve this error?
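The assertion message shows the two streams disagreeing on a word: the target has "tape" while the prediction has "tape." with the period attached. A plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the script compares whitespace-separated tokens and expects punctuation to appear as separate tokens (such as ".PERIOD") rather than glued to the preceding word. Under that assumption, a minimal sketch for converting readable text into that token form could look like this. The function name and the mapping are illustrative and should be checked against the punctuation vocabulary used in your copy of the repository.

```python
# Assumed mapping from readable marks to punctuator2-style tokens.
# Verify these names against the repository before relying on them.
MARK_TO_TOKEN = {
    ',': ',COMMA',
    '.': '.PERIOD',
    '?': '?QUESTIONMARK',
    '!': '!EXCLAMATIONMARK',
    ':': ':COLON',
    ';': ';SEMICOLON',
}


def split_trailing_punctuation(text):
    """Split a trailing punctuation mark off each word into its own token.

    Hypothetical helper: only the last character of a word is examined,
    so abbreviations such as "e.g." would need separate handling.
    """
    tokens = []
    for word in text.split():
        if len(word) > 1 and word[-1] in MARK_TO_TOKEN:
            tokens.append(word[:-1])                 # the bare word
            tokens.append(MARK_TO_TOKEN[word[-1]])   # the punctuation token
        else:
            tokens.append(word)
    return tokens


print(" ".join(split_trailing_punctuation("start of tape. labeled")))
```

Whatever conversion is used, both files need the same tokenization, so the ground truth file would have to go through the same step if it also contains readable punctuation.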

hs79hs commented 3 years ago

I got the same error. Why was this closed without any suggestions? Did you find a solution, @kadir-gunel? Thanks.

kadir-gunel commented 3 years ago

Hello @hs79hs, I opened the issue more than 2 years ago and, unfortunately, I don't recall anything.