cy99 / shedskin

Automatically exported from code.google.com/p/shedskin

Uncaught KeyError when compiling #22

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I am trying to compile BeautifulSoup and therefore sgmllib.py and then
markupbase.py (the latter 2 from the Python standard lib).

With some modifications I can get markupbase.py to generate C++ code. 
However I couldn't get past this with sgmllib -- not sure what the problem is.

Attached is the diff of all 3 files.   BeautifulSoup is a very useful
library, it's a bit slow, and it's all text processing, so I think it would
make a very good test case for shedskin.  (Great project BTW!)

There are some other bugs I encountered as well (which you can see in the
diffs):

- RuntimeError not supported
- self.foo is not like self.__class__.foo as it is in Python
- *args not supported
- from __future__ import generators not supported (it would be nice to just
ignore this or something, a lot of code has it for compatibility)
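
For readers unfamiliar with the second item, here is a minimal sketch of the CPython behaviour it refers to: `self.foo` falls back to the class attribute until an instance attribute shadows it. The names `Parser` and `verbose` are hypothetical and only illustrate the lookup rule; the thread does not say exactly how Shed Skin 0.0.29 differed.

```python
class Parser(object):
    # Class-level attribute, shared by all instances until shadowed.
    verbose = 0

    def report(self):
        # In CPython, self.verbose falls back to the class attribute
        # when the instance has no attribute of that name.
        return self.verbose, self.__class__.verbose


p = Parser()
before = p.report()   # both lookups find the class attribute
p.verbose = 1         # creates an instance attribute that shadows the class one
after = p.report()    # the instance lookup changes, the class lookup does not
print(before)
print(after)
```

Code in the standard library relies on the fallback in the first call, which is presumably why the attached diff had to change it.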

shedskin-read-only$ ./shedskin sgmllib.py
*** SHED SKIN Python-to-C++ Compiler 0.0.29 ***
Copyright 2005-2008 Mark Dufour; License GNU GPL version 3 (See LICENSE)
(Please send bug reports here: mark.dufour@gmail.com)

Traceback (most recent call last):
  File "/home/andy/svn/shedskin-read-only/ss.py", line 482, in <module>
    main()
  File "/home/andy/svn/shedskin-read-only/ss.py", line 477, in main
    analysis(name)
  File "/home/andy/svn/shedskin-read-only/ss.py", line 81, in analysis
    gx.main_module = parse_module(gx.main_mod, ast)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 1540, in parse_module
    mv.dispatch(mod.ast)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 56, in dispatch
    ASTVisitor.dispatch(self, node, *args)
  File "/usr/lib/python2.5/compiler/visitor.py", line 57, in dispatch
    return meth(node, *args)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 324, in visitModule
    self.visitFunction(func_copy, cl, inherited_from=ancestor)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 484, in visitFunction
    self.visit(node.code, func)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 56, in dispatch
    ASTVisitor.dispatch(self, node, *args)
  File "/usr/lib/python2.5/compiler/visitor.py", line 57, in dispatch
    return meth(node, *args)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 241, in visitStmt
    self.visit(b, func)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 56, in dispatch
    ASTVisitor.dispatch(self, node, *args)
  File "/usr/lib/python2.5/compiler/visitor.py", line 57, in dispatch
    return meth(node, *args)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 1118, in visitAssign
    self.tuple_flow(lvalue, rvalue, func)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 1172, in tuple_flow
    tvar = self.tempvar(lvalue, func)
  File "/home/andy/svn/shedskin-read-only/graph.py", line 790, in tempvar
    varname = self.tempcount[getgx().parent_nodes[node]]
KeyError: AssTuple([AssName('sectName', 'OP_ASSIGN'), AssName('j',
'OP_ASSIGN')])

Original issue reported on code.google.com by andyc...@gmail.com on 14 Sep 2008 at 5:28

GoogleCodeExporter commented 9 years ago
hi andy, 

thanks!

you would be surprised at how fast the CPython builtins are :) especially for
strings, there's not much to optimize - being a dynamic language has no real
effect on string operations. Shed Skin will probably only help when the
bottleneck is in user code (and it often is), not in builtins. for builtins,
matching CPython in speed is a challenge!

so really, I don't think there's a lot of value in compiling htmllib and such,
outside of finding bugs and locating minor features to add (this is very useful
though!!). note that you can always generate extension modules (shedskin -e),
and use arbitrary Python modules and dynamic constructs in the 'main' program.
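
A rough sketch of what such a split could look like, assuming a hypothetical module name `simple_module` and function `count_tags` (neither appears in this thread). Only the `-e` flag is taken from the comment above; the build steps that follow it are not shown here. The `__main__` block calls the function because, as far as I understand Shed Skin, it infers types from actual calls.

```python
# simple_module.py: hypothetical module intended for `shedskin -e`.
# Only this restricted, statically typeable part would be compiled.

def count_tags(text):
    # Count '<' characters as a crude stand-in for counting tags.
    count = 0
    for ch in text:
        if ch == '<':
            count += 1
    return count


if __name__ == '__main__':
    # Call the function once so the argument type (str) is visible
    # to type inference.
    print(count_tags('<a><b>'))
```

The 'main' program would then import `simple_module` like any other extension module and remain free to use unsupported libraries and dynamic constructs.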

it's a bit of a hassle, but I'd like to ask you to please report issues
separately, and to discuss officially unsupported features in the google group
instead. if you like, it would also be very useful to attach a minimized version
of the program at hand for each issue you submit. this can save me a lot of
time..

Original comment by mark.duf...@gmail.com on 15 Sep 2008 at 9:24

GoogleCodeExporter commented 9 years ago
okay, I admit this issue is really about the uncaught KeyError.. I got a bit
distracted by the other issues you mention. would it be possible to minimize the
program, to maybe 10 or 20 lines, while still observing the same crash..?
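
As a possible starting point for such a minimization: the traceback goes through `visitModule` copying a function for an inherited class (`inherited_from=ancestor`) and fails on the tuple assignment to `sectName, j`. The sketch below combines those two ingredients. It is an untested guess, not a confirmed reproduction; the class and method names are hypothetical, and it has not been run through Shed Skin 0.0.29.

```python
class Base:
    def scan_name(self, i):
        # Returns a (name, position) pair, loosely modelled on the
        # kind of call that produces `sectName, j` in the traceback.
        return 'name', i + 1

    def parse(self, i):
        # Tuple-unpacking assignment inside a method that a subclass inherits.
        sectName, j = self.scan_name(i)
        return j


class Derived(Base):
    # No overrides: the compiler would have to copy parse() for this class.
    pass


result = Derived().parse(0)
print(result)
```

If this does not crash the compiler, the next step would be to add back pieces of the original method until it does.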

Original comment by mark.duf...@gmail.com on 15 Sep 2008 at 9:27

GoogleCodeExporter commented 9 years ago
btw, for what code do you observe the RuntimeError problem..?

ignoring __future__ imports might be a good idea, will investigate. 

Original comment by mark.duf...@gmail.com on 15 Sep 2008 at 9:28

GoogleCodeExporter commented 9 years ago
I didn't get a line number from shedskin so it's hard for me to tell exactly
where the problem lies.  Can you not reproduce it from the attached sgmllib.py
file?

The RuntimeError is also in the same code -- see diff.txt.  It complained that
inheritance from RuntimeError is not supported, so I temporarily changed it to
Exception, and it worked.
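
The workaround amounts to the change sketched below. It assumes the class in question is `SGMLParseError`, which Python 2's sgmllib.py declares as a subclass of `RuntimeError`; the diff itself is not reproduced in this thread, and the `fail` helper is hypothetical.

```python
# Form in sgmllib.py (Python 2 standard library), as far as I recall:
#     class SGMLParseError(RuntimeError):
# Temporary workaround: inherit from Exception instead.
class SGMLParseError(Exception):
    """Exception raised for all parse errors."""


def fail(message):
    # Hypothetical helper standing in for the parser's error path.
    raise SGMLParseError(message)


caught = False
try:
    fail("demo parse error")
except SGMLParseError:
    caught = True
print(caught)
```

The change is visible only to callers that catch `RuntimeError` specifically, since the class no longer derives from it.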

And I'm a little confused by your first response.  A lot of the test programs
do a lot of manipulation on integers in lists (e.g. the mandelbrot program), and
they have huge speedups.

You're saying the same thing doesn't apply for strings?  Maybe because you have
to maintain the immutable semantics of the Python program?  I would think that
shedskin would at least speed up the attribute lookups and function calls.

Also, markupbase and sgmllib are pure Python code in the standard library.
They're not really "builtins".  I don't see the distinction between "user code"
and these modules -- it's all Python code.

But in any case, wouldn't it be useful to have, say, HTMLParser and therefore
sgmllib supported in the standard library by shedskin?  I saw you checked in the
generated code for ConfigParser -- this seems like the same idea.  Even if it's
not that much faster, it still seems useful to have it supported.

Original comment by andyc...@gmail.com on 15 Sep 2008 at 1:39

GoogleCodeExporter commented 9 years ago
it can take quite some time to minimize a program, because bugs can be quite
subtle, and there typically is no specific line number that you can point to..
so receiving an already minimized version can save me a lot of time.

a simple way to minimize a program is to keep removing parts, until you cannot
make it smaller without the problem disappearing.
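
The removal loop described above can be sketched as a small greedy minimizer. This is only an illustration: `minimize` and `still_fails` are hypothetical names, and the predicate here is a stand-in that looks for a particular line. In practice it would write the candidate to a file, run the compiler on it, and check that the same KeyError still appears.

```python
def minimize(lines, still_fails):
    """Greedily drop lines for as long as still_fails(lines) stays true."""
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + 1:]
            if still_fails(candidate):
                # The problem survives without this line, so drop it.
                lines = candidate
                changed = True
            else:
                # The line is needed to trigger the problem, so keep it.
                i += 1
    return lines


def still_fails(lines):
    # Stand-in predicate: pretend the crash needs the tuple assignment.
    return any("sectName, j =" in line for line in lines)


program = [
    "import re",
    "def parse(i):",
    "    sectName, j = scan(i)",
    "    return j",
]
minimal = minimize(program, still_fails)
print(minimal)
```

Removing one line at a time is slow for large files and can leave syntactically invalid programs; removing whole functions or classes first usually converges faster.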

for strings and sets, dynamic lookups are only a small part of the whole, as
opposed to integers, where the actual operation takes maybe only 1 cpu cycle..
the mandelbrot test does mostly float operations.

programs like sgmllib probably spend most of their time inside string
operations, so compilation will probably only result in a slowdown.. :)

Original comment by mark.duf...@gmail.com on 16 Sep 2008 at 3:14

GoogleCodeExporter commented 9 years ago
about ConfigParser, yes it won't be faster either. I guess I added support
mostly because it was fun to do, and motivated me to fix some issues. but I do
think ConfigParser can be more useful in small computational projects than e.g.
sgmllib.

in any case, I'd be happy to look into issues found by trying to compile sgmllib
and such.

Original comment by mark.duf...@gmail.com on 16 Sep 2008 at 7:11

GoogleCodeExporter commented 9 years ago
I'd still be very interested in receiving minimized code fragments that crash
the compiler.

note that inheriting from RuntimeError should now be supported (in SVN).

Original comment by mark.duf...@gmail.com on 28 Nov 2008 at 12:52

GoogleCodeExporter commented 9 years ago
issue fixed / no minimized test-case / unsupported feature as per tutorial..

Original comment by mark.duf...@gmail.com on 29 Mar 2009 at 7:20