poppopjmp / shedskin

Automatically exported from code.google.com/p/shedskin

.decode and .encode methods for strings missing? #62

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

I ran a simple test after having problems in my code:
# -*- coding: cp850 -*-
r=raw_input('Anna ääkköstekstiä') 
r=r.decode('cp850')
print r
r=r.encode('latin1')
print r

What is the expected output? What do you see instead?
With Python:
Anna ääkköstekstiä
Äärimmäisen työlästä
Äärimmäisen työlästä
├ä├ñrimm├ñisen ty├Âl├ñst├ñ

with shedskin:
*WARNING* kokeilu.py:3: class 'str' has no method 'decode'
*WARNING* kokeilu.py:5: class 'str' has no method 'encode'

What version of the product are you using? On what operating system?
I used shedskin 0.3.1 on Windows XP

Please provide any additional information below.

Original issue reported on code.google.com by tony.vei...@gmail.com on 27 Mar 2010 at 8:22


GoogleCodeExporter commented 8 years ago
thanks for reporting. unfortunately, I'm not sure how difficult it would be to
support this, nor do I think I would be working on it myself.. but I'm guessing
most functionality can be simply mapped to standard C libraries. so if anyone
would like to give this a try..

btw, please note that windows support has been dropped as of shed skin 0.4..

Original comment by mark.duf...@gmail.com on 28 Mar 2010 at 11:05

GoogleCodeExporter commented 8 years ago
this issue was marked as an 'easy task' in the wiki section.

Original comment by mark.duf...@gmail.com on 14 Nov 2010 at 12:00

GoogleCodeExporter commented 8 years ago
I ran into this trying to convert "httplib.py" to Shed Skin.  Doing anything 
useful on the web requires codecs. 

Original comment by na...@animats.com on 18 Nov 2010 at 7:38

GoogleCodeExporter commented 8 years ago
So python's encoding ability is very intense. I've been trying to decipher it 
for the past few hours and it requires shedskin'ing 120 files, so I'm gonna let 
my computer just try each one and I'll look through the output later.

I don't think it'll directly require anything dynamic, but I've already seen 
use of *args so that'll be fun...

Original comment by fahh...@gmail.com on 17 Dec 2010 at 4:01

GoogleCodeExporter commented 8 years ago
woah. I was actually hoping this could be done using some standard C library 
calls..

Original comment by mark.duf...@gmail.com on 17 Dec 2010 at 4:14

GoogleCodeExporter commented 8 years ago
So python's encoding ability is very intense. 

I don't think it'll directly require anything dynamic, but I've already seen 
use of *args, inheriting from tuple (and other 'builtin' classes), __getattr__, 
etc.

Inheriting from builtin classes, from tuple to CodecInfo (a class that will be 
built-in once this is added) is necessary as the encoding code works through 
that mechanism. Why is inheriting from builtin classes not allowed while newly 
defined classes are allowed, especially since I could just copy-paste the 
builtin class's code into the target file and be allowed to subclass?

Original comment by fahh...@gmail.com on 17 Dec 2010 at 4:15

GoogleCodeExporter commented 8 years ago
yeah, the standard libraries are a 'fun' source of overengineered solutions..

it doesn't sound like it would be easy to get all of this to work, and perhaps 
a lot of this is not needed in your typical case.. can't we support the most 
important functionality by just relaying stuff to some common C library..?

Original comment by mark.duf...@gmail.com on 17 Dec 2010 at 4:20

GoogleCodeExporter commented 8 years ago
I don't know of many C++ libraries by heart and couldn't find any for 
encoding/decoding strings through a Google search. However, the actual code is 
pretty simple for many of the encoders (cp*.py) which basically have 
encoding/decoding 'tables' which are really just tuples. The fancier ones, like 
bzip2 and base64 are just function mappings to the appropriate functions in 
their modules:

import base64
import codecs

def base64_encode(input, errors='strict'):
    # returns (encoded data, number of input items consumed)
    output = base64.encodestring(input)
    return (output, len(input))

class Codec(codecs.Codec):
    def encode(self, input, errors='strict'):
        return base64_encode(input, errors)

Inheriting from codecs.Codec isn't necessary since the class is really just an 
interface, but some of the other classes inherit from IncrementalEncoder for 
its __init__. I could just copy-paste the __init__ into all the others, or 
just cut-paste IncrementalEncoder itself. I'm still not sure why inheriting 
from builtins isn't allowed...
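
To make the 'tables are really just tuples' point concrete, here is a rough sketch of a table-driven codec. It is written for Python 3 and is not the actual cp850.py code: the names DECODING_TABLE, ENCODING_TABLE, table_decode and table_encode are made up for illustration, and the table is derived from Python's own cp850 codec rather than written out by hand. It uses no *args, no __getattr__ and no inheritance from builtins, but whether Shed Skin would compile it as-is has not been checked.

```python
# Python 3 sketch of a table-driven single-byte codec (illustrative only).
# Assumption: the 256-entry table is built from Python's own cp850 codec
# instead of being listed literally, as the stdlib cp*.py modules do.
DECODING_TABLE = tuple(bytes(range(256)).decode("cp850"))

# Reverse mapping: character -> byte value.
ENCODING_TABLE = {ch: i for i, ch in enumerate(DECODING_TABLE)}


def table_decode(data, table=DECODING_TABLE):
    """Map each byte value to the character stored at that index."""
    return "".join(table[b] for b in data)


def table_encode(text, table=ENCODING_TABLE):
    """Map each character back to its byte value; fail on unknown characters."""
    try:
        return bytes(table[ch] for ch in text)
    except KeyError as exc:
        raise ValueError("character not in table: %r" % exc.args[0])


if __name__ == "__main__":
    raw = b"\x84\x84kk\x94set"          # cp850 bytes for some non-ASCII text
    text = table_decode(raw)
    print(text == raw.decode("cp850"))  # compare against the built-in codec
    print(table_encode(text) == raw)    # round trip back to the same bytes
```

The design point is that decoding is one index lookup per byte and encoding is one dictionary lookup per character, so a per-codepage table is all the state a codec like this needs.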

Original comment by fahh...@gmail.com on 17 Dec 2010 at 5:11

GoogleCodeExporter commented 8 years ago
I don't think it's a good idea to inherit from builtins such as list or dict. 
it will complicate already complicated things, and moreover make it harder to 
optimize for lists and dicts.. so similar in a way to multiple inheritance and 
dynamic typing, it feels like a natural restriction for shedskin to not allow 
this in the general case. we can make an exception though for cases where it 
makes sense, and is unlikely to cause problems, and shedskin currently does so 
for 'object' and descendants of 'Exception' already, and we could probably add 
something of a 'Codec' class to this.
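
As a rough illustration of what a restricted 'Codec' class could look like without inheriting from tuple or other builtins, here is a sketch in Python 3. The names Codec, Base64Codec and CodecEntry are hypothetical and are not the stdlib's codecs.Codec or codecs.CodecInfo; the (output, consumed) return shape just follows the base64 snippet quoted earlier in the thread. This has not been tested against Shed Skin.

```python
import base64


class Codec:
    """Plain interface class; it inherits only from object."""

    def encode(self, data):
        raise NotImplementedError

    def decode(self, data):
        raise NotImplementedError


class Base64Codec(Codec):
    """Relays to the base64 module, returning (output, items consumed)."""

    def encode(self, data):
        return (base64.b64encode(data), len(data))

    def decode(self, data):
        return (base64.b64decode(data), len(data))


class CodecEntry:
    """Hypothetical stand-in for a registry record.

    Keeps its fields as ordinary attributes instead of subclassing tuple.
    """

    def __init__(self, name, codec):
        self.name = name
        self.codec = codec


if __name__ == "__main__":
    entry = CodecEntry("base64", Base64Codec())
    encoded, consumed = entry.codec.encode(b"hi")
    print(entry.name, encoded, consumed)
```

Holding the codec by composition in CodecEntry sidesteps the tuple subclass entirely, at the cost of not being unpackable like the stdlib's record.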

Original comment by mark.duf...@gmail.com on 18 Dec 2010 at 12:03

GoogleCodeExporter commented 8 years ago
This is still an issue in shedskin 0.7.1

Original comment by tur...@gmail.com on 26 Apr 2011 at 4:20