Closed limpbrains closed 11 years ago
Strange, I'm able to shift it without encoding error.
srt shift 20 russian.srt
Can you paste the whole command you typed ?
Well, a month without a reply, so I'm closing this issue.
Feel free to reopen it if you still have a problem.
Hi, sorry for the late response.
srt shift 40s 33.srt
Traceback (most recent call last):
File "/usr/local/bin/srt", line 9, in
python -V
Python 2.7.2+
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 11.10
Release: 11.10
Codename: oneiric
Hmm, very strange... so it always happens, whatever the subtitle file?
And how did you install it? Because /data/share/_films/Game of Thrones_S02E02/src/
is a very strange location...
I've only tried a few files, all Russian, UTF-8. Installed from git:
pip install -e git+https://github.com/byroot/pysrt.git#egg=pysrt
Ok, I still can't reproduce but now I'm almost sure that it's a BOM issue...
I will ask a friend on ubuntu to test that
Did you try the version released on PyPI?
pip install --upgrade pysrt
I confirm it is a BOM issue. I've successfully edited a file without a BOM, created with Notepad++. I've also tried the following command, but it failed with the same error:
srt -e utf_8_sig ...
Pysrt is supposed to handle BOM correctly...
And the file you gave me is in cp1252, so why does it have a UTF-8 BOM? Can you send me another file?
I'm having the same issue. The file is here: https://docs.google.com/open?id=0B2q9iBGZdj6qN29uUzBBQXNJM2c
I finally found the issue: chardet returned "UTF-8" and the encodings module was only aware of "utf-8".
My bad ...
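For anyone hitting a similar mismatch: one way to avoid depending on the exact spelling a detector returns is to resolve the name through the standard `codecs.lookup` before using it. This is only a minimal Python 3 sketch, not the actual pysrt fix; `normalize_encoding` and its `default` fallback are hypothetical names introduced for illustration.

```python
import codecs


def normalize_encoding(name, default="utf-8"):
    """Return the canonical codec name for `name`, or `default` if unknown.

    Hypothetical helper, not part of pysrt.
    """
    if not name:
        # Detectors may return None when they cannot guess an encoding.
        return default
    try:
        # codecs.lookup is case-insensitive and resolves aliases.
        return codecs.lookup(name).name
    except LookupError:
        return default
```

With this, "UTF-8" and "utf-8" resolve to the same codec name, so later comparisons or lookups no longer depend on the detector's capitalization.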
Is this fixed in 0.4.4? Because I still have this error.
I think so. Do you still have the issue with this same file and pysrt 0.4.4?
Oh shit ... confirmed, I'll fix that right now.
Oh, I just forgot to release ...
0.4.5 released with the fix.
Thanks, that was fast :)
I'm still having an error :cry:
I added a print statement to see what's in lines here and I got this:
[u'\ufeff1\r\n', u'00:00:01,677 --> 00:00:04,145\r\n', u'Alors, sur quel genre de croisi\xe8re\r\n', u'allez-vous embarquer ?\r\n']
Of course int(u'\ufeff1\r\n') fails.
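The failure can be reproduced without pysrt. A small Python 3 sketch (the byte string is made-up sample data, and `parses_as_index` is a hypothetical helper): decoding with plain `utf-8` keeps the BOM as the character U+FEFF at the start of the first line, while the `utf-8-sig` codec drops it.

```python
import codecs

# Made-up sample: a UTF-8 BOM followed by the start of a subtitle file.
raw = codecs.BOM_UTF8 + b"1\r\n00:00:01,677 --> 00:00:04,145\r\n"

# Plain utf-8 keeps the BOM as U+FEFF; utf-8-sig strips it.
first_plain = raw.decode("utf-8").splitlines(True)[0]
first_sig = raw.decode("utf-8-sig").splitlines(True)[0]


def parses_as_index(line):
    """Return True if `line` can be parsed as a subtitle index."""
    try:
        int(line)
        return True
    except ValueError:
        # U+FEFF is not whitespace, so int() does not skip over it.
        return False


print(repr(first_plain), parses_as_index(first_plain))
print(repr(first_sig), parses_as_index(first_sig))
```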
File can be downloaded on Addic7ed
Sample code to reproduce the error:
from charade.universaldetector import UniversalDetector
import codecs
import pysrt

def is_valid_subtitle(path):
    u = UniversalDetector()
    for line in open(path, 'rb'):
        u.feed(line)
    u.close()
    encoding = u.result['encoding']
    source_file = codecs.open(path, 'rU', encoding=encoding, errors='replace')
    try:
        for _ in pysrt.SubRipFile.stream(source_file, error_handling=pysrt.SubRipFile.ERROR_RAISE):
            pass
    except pysrt.Error as e:
        if e.args[0] < 50:  # Error occurs within the 50 first lines
            return False
    # except UnicodeEncodeError:  # Workaround for https://github.com/byroot/pysrt/issues/12
    #     pass
    return True
Oh! It makes sense now. If you open the file yourself, pysrt does not strip the BOM.
Anyway chardet is integrated inside pysrt now.
Try something like:
def is_valid_subtitle(path):
    source_file = pysrt.SubRipFile._open_unicode_file(path)
    try:
        for _ in pysrt.SubRipFile.stream(source_file, error_handling=pysrt.SubRipFile.ERROR_RAISE):
            pass
    except pysrt.Error as e:
        if e.args[0] < 50:  # Error occurs within the 50 first lines
            return False
    # except UnicodeEncodeError:  # Workaround for https://github.com/byroot/pysrt/issues/12
    #     pass
    return True
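If you would rather keep opening the file yourself instead of relying on a private pysrt method, a stdlib-only alternative might look like the sketch below (Python 3). It assumes that a leading UTF-8 BOM should win over whatever encoding the detector guessed, which matches the cp1252-with-BOM file discussed earlier; `open_skipping_bom` is a hypothetical name, not a pysrt function.

```python
import codecs


def open_skipping_bom(path, encoding="utf-8"):
    """Open `path` as text, using utf-8-sig when the file starts with a UTF-8 BOM.

    Hypothetical helper, not part of pysrt. `encoding` is the detected or
    assumed encoding, used only when no BOM is found.
    """
    with open(path, "rb") as f:
        has_bom = f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
    if has_bom:
        # Assumption: a UTF-8 BOM is more trustworthy than the detector's guess.
        encoding = "utf-8-sig"
    return open(path, "r", encoding=encoding, errors="replace")
```

The returned file object can then be passed to pysrt.SubRipFile.stream in place of source_file, and the first line no longer starts with u'\ufeff'.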
I can't run srt with this file: http://dl.dropbox.com/u/1788271/Bones.S07E01.HDTVRip.srt
It is cp1251. I have the following error: