Closed (felagund closed this issue 10 years ago)
I read on the internet that str() should not be used when dealing with unicode, and that .encode() and .decode() should be used instead. This poses two problems:
1) Python 3 compatibility
2) SrtItem does not know about encoding of its .text attribute as far as I can see
> SrtItem does not know about encoding of its .text attribute as far as I can see
Yes, it's by design. SubRipFile deals with all the encoding mess, but SubRipItem is fully unicode and abstracted from encoding details. So SubRipItem.__str__() in Python 2 does not make sense. Actually, before the merge of the py2 and py3 code bases, that method did not exist.
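To make that split concrete, here is a minimal sketch of the idea, not pysrt's actual code. `Item`, `read_items`, and `write_items` are hypothetical names: the file layer decodes bytes once on the way in and encodes once on the way out, while the item only ever holds unicode text.

```python
import io


class Item(object):
    """Hypothetical stand-in for SubRipItem: holds text, knows no encoding."""

    def __init__(self, text):
        self.text = text  # always unicode text, never encoded bytes


def read_items(raw_bytes, encoding="utf-8"):
    # Decode once at the boundary; everything downstream is unicode.
    return [Item(line) for line in raw_bytes.decode(encoding).splitlines()]


def write_items(items, stream, encoding="utf-8"):
    # Encode once at the boundary, when the text leaves the program.
    for item in items:
        stream.write((item.text + u"\n").encode(encoding))


# Read Latin-1 bytes, write them back out as UTF-8.
source = u"caf\u00e9\nna\u00efve\n".encode("latin-1")
items = read_items(source, encoding="latin-1")
out = io.BytesIO()
write_items(items, out, encoding="utf-8")
```

The items never see either encoding, which is why they cannot produce a byte string on their own.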
I guess we have no other choice than to have two different implementations, and to define them conditionally based on the Python version, something like:
    class SubRipItem:
        if is_py3:
            def __str__(self):
                pass  # actual implementation
        else:
            def __unicode__(self):
                pass  # same implementation

            def __str__(self):
                raise NotImplementedError
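A runnable version of that idea could look roughly like the sketch below. It is only an illustration under assumptions: `Item` and `_render` are made-up names, and `is_py3` is computed locally here rather than imported from the project.

```python
import sys

is_py3 = sys.version_info[0] == 3  # assumed flag, mirroring the snippet above


class Item(object):
    """Hypothetical stand-in for SubRipItem."""

    def __init__(self, index, text):
        self.index = index
        self.text = text

    def _render(self):
        # Shared implementation: always returns unicode text.
        return u"%d\n%s\n" % (self.index, self.text)

    if is_py3:
        def __str__(self):
            return self._render()
    else:
        def __unicode__(self):
            return self._render()

        def __str__(self):
            # No encoding is known here, so refuse to produce bytes.
            raise NotImplementedError


item = Item(1, u"Hello")
# unicode() only exists on Python 2; the branch is never evaluated on Python 3.
rendered = str(item) if is_py3 else unicode(item)  # noqa: F821
```

Keeping the real work in one helper avoids duplicating the implementation across the two branches.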
Wouldn't it be better if __str__(self) pointed to unicode(self)? Only then we would need to know which encoding to use, which we don't by design, if I understand it correctly.
Then, shouldn't

    self.position = str(position)
    self.text = str(text)

in the __init__ method of SubRipItem be

    self.position = unicode(position)
    self.text = unicode(text)

instead?
And in the write_into method of SubRipFile, we will need to call either unicode(item) or str(item) based on whether we are using Python 2 or 3, right?
> we will need to call either unicode(item) or str(item) based on whether we are using Python 2 or 3, right?
It's already the case: https://github.com/byroot/pysrt/blob/master/pysrt/compat.py#L18
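For readers who don't want to follow the link, the general pattern is to pick the text type once in a compatibility module and use that single name everywhere. The sketch below is not a copy of the linked file; `text_type`, `Item`, and `serialize` are made-up names used only for illustration.

```python
import sys

is_py2 = sys.version_info[0] == 2

if is_py2:
    text_type = unicode  # noqa: F821 (only defined on Python 2)
else:
    text_type = str


class Item(object):
    """Hypothetical stand-in for SubRipItem."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    __unicode__ = __str__  # lets unicode(item) work on Python 2


def serialize(item):
    # Callers use one name and get unicode text on both versions.
    return text_type(item)


result = serialize(Item(u"Bonjour"))
```

With an alias like this, the writing code does not need its own version check.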
Running
Gives me: