imprecise result after max iterations

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?
1. [frafra@rocketman shedskin-read-only]$ ./shedskin.py life.py
*** SHED SKIN Python-to-C++ Compiler 0.3 ***
Copyright 2005-2009 Mark Dufour; License GNU GPL version 3 (See LICENSE)

[iterative type analysis..]
  File "./shedskin.py", line 5, in <module>
    shedskin.main()
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/__init__.py",
line 90, in main
    infer.analyze(name)
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
942, in analyze
    iterative_dataflow_analysis()
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
689, in iterative_dataflow_analysis
    propagate()
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
144, in propagate
    cpa(node, worklist)
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
350, in cpa
    cp = cartesian_product(callnode, worklist)
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
297, in cartesian_product
    funcs = possible_functions(node)
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/infer.py", line
188, in possible_functions
    objexpr, ident, direct_call, method_call, constructor, mod_var,
parent_constr = analyze_callfunc(expr, True)
  File "/home/frafra/Scaricati/shedskin-read-only/shedskin/shared.py", line
622, in analyze_callfunc
    traceback.print_stack()
*ERROR* life.py:None: unbound identifier 'None'

What version of the product are you using? On what operating system?
[frafra@rocketman shedskin-read-only]$ uname -a
Linux rocketman 2.6.32.3-21.fc13.x86_64 #1 SMP Mon Jan 11 16:53:56 UTC 2010
x86_64 x86_64 x86_64 GNU/Linux
[frafra@rocketman shedskin-read-only]$ svn up
Alla revisione 1165.

Please provide any additional information below.
I've attached the source. Now, I'm using shedskin because I would like to
run this program on my arm/linux little server, but using Python it goes so
slow :) Using multiprocessing, on a 4x4 table, my Intel Core 2 Duo E7400
needs about 10 seconds, but my arm-kirkwood (1.2ghz) needs 5.5 minutes :D
(~32 times slower).
It will ever introduced in shedskin multiprocessing? :D

Original issue reported on code.google.com by frap...@gmail.com on 19 Jan 2010 at 2:54

Attachments:

life.py

GoogleCodeExporter commented 8 years ago

thanks for reporting!

if you use 'defaultdict(int, ..)' instead of 'defaultdict(None, ..)' in 
'process', it
works (using SVN - I had to fix a minor issue, and after waiting the max nr of
iterations!). I will fix the 'defaultdict(None, ..)' problem for the next 
release of
shedskin.

note that the program actually becomes slower though, in part because 
allocating many
small tuples is not faster in C++, and possibly also because Shedskin dicts are 
a bit
slower than CPython dicts. to make it really fast, you might want to use 
integers for
positions (x*width)+y) and in general avoid memory allocations where you can.

shedskin probably won't ever support multiprocessing directly, but please see 
the
section of the tutorial on how to combine shedskin-generated extension modules 
with
multiprocessing solutions such as parallel python.i

I will keep this issue open until defaultdict(None, ..) works, and shedskin 
doesn't
need the maximum number of iterations (something I'd like to improve in general 
for 0.4).

Original comment by mark.duf...@gmail.com on 19 Jan 2010 at 5:15

Changed state: Accepted

GoogleCodeExporter commented 8 years ago

Thank you mark, you're always friendly :)

I've used None because makes Python a bit faster :) I've just see that Python3.x
wants always want something callable :)

Regarding the defaultdict: I've use it because I found it so much faster than 
use a
"2d list" in Python, checking if a index is value (for me) or not (so,
0<=index<len(mylist)).

Thanks for your attention and your always useful advices :)

Original comment by frap...@gmail.com on 19 Jan 2010 at 6:20

GoogleCodeExporter commented 8 years ago

see attachment for a somewhat shedskin-optimized version. the original life.py 
takes
44.4 seconds on this PC (test.py 4 4), while the compiled attachment takes 
about 10
seconds. I'm sure it can still be made faster, but I will leave you to puzzle 
over
that.. :) some ideas:

-keys() takes about 10% of the total time (plus perhaps 10% extra for putting
pressure on the GC).
-dict.__eq__ takes about 20% of the total time, for checking the history. 
typically
one uses hashing for this, or even zobrish hashing? doing a linear search and
comparing each time will become extremely slow for longer histories in any 
case..

Original comment by mark.duf...@gmail.com on 19 Jan 2010 at 8:03

Attachments:

l2.py

GoogleCodeExporter commented 8 years ago

Original comment by mark.duf...@gmail.com on 19 Jan 2010 at 8:04

Changed title: defaultdict(None, ..)

GoogleCodeExporter commented 8 years ago

Thanks too much!

I'm trying to make a module with shedskin. That's an example (but it doesn't 
work):

from collections import defaultdict
def go(board):
    """ Calculates the next stage """
    new = defaultdict(int, board)
    for row, column in board.keys():
        near = board[row-1, column-1] +\
            board[row-1, column] +\
            board[row-1, column+1] +\
            board[row, column-1] +\
            board[row, column+1] +\
            board[row+1, column-1] +\
            board[row+1, column] +\
            board[row+1, column+1]
        item = board[row, column]
        if near not in (2, 3) and item:
            new[row, column] = 0
        elif near == 3 and not item:
            new[row, column] = 1
    return new

if __name__ == "__main__":
    print go(defaultdict(int))

Now, it says something like "warning! I will not export this cool function :P". 
Do
you know where's my error?

Original comment by dad...@gmail.com on 19 Jan 2010 at 8:11

GoogleCodeExporter commented 8 years ago

defaultdicts are not exported at the moment. please see the tutorial, section
extension modules, limitations.. :-)

Original comment by mark.duf...@gmail.com on 19 Jan 2010 at 8:34

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

Ah, the code you've proposed me have a bug. I've used defaultdict because if I
request a negative row/column or a too hight row/column, it return int() (so 
0). With
l2.py code, it thinks that it could always add a piece, but's not true. It 
doesn't
care about the lenght of the row. So, I should add for every pieces, a 
condition, to
check if it's ok (so give me the value) or not (so give me 0).
defaultdict avoid this problem ('cause is a dict), and turns 0 it something 
doesn't
exists :)

Original comment by frap...@gmail.com on 19 Jan 2010 at 8:42

GoogleCodeExporter commented 8 years ago

I see. but I do think checking if something is between 0 and some maxvalue is 
faster
in C++ than using dicts..

Original comment by mark.duf...@gmail.com on 19 Jan 2010 at 9:09

GoogleCodeExporter commented 8 years ago

Yes, there's no doubt :) I'll try some hashing method to make it faster in the
history and to manage it better in c++. Thanks :)

Original comment by frap...@gmail.com on 19 Jan 2010 at 9:18

GoogleCodeExporter commented 8 years ago

see attachment for a somewhat better optimized version.. it becomes about 50 
times
faster here (from 48 to 1 second, using -O3 -msse2 -fomit-frame-pointer in 
FLAGS).
profiling shows 75% is spent in 'near', so a smarter datastructure for updating 
the
board can probably help a lot more (for example, by only visiting positions 
that can
possibly change for the next step).

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 11:24

Attachments:

l3.py

GoogleCodeExporter commented 8 years ago

Wow! It's only about three time slower of my C version (-O2 flag) :)
near() is so expensive. I've see that we could use a border of zeros, in order 
to not
use the if. It's a bit more expensive the management of the board, but the whole
program runs faster. So, I think that the emulation of a 2d "real borard" 
throught a
1d board with border, could be so much faster with shedskin. Another important 
point:
with big tables (as you already said), it would be better to have a smarter 
update
function.
Basically, I'll try to change the generator and the board management :)

Thanks :)

Original comment by frap...@gmail.com on 21 Jan 2010 at 2:25

GoogleCodeExporter commented 8 years ago

I've done some minor edit to use it with multiprocessing, but I can't compile 
it as
library (with -e), while if I try to compile it normally, it works. I don't 
know why.

Original comment by frap...@gmail.com on 21 Jan 2010 at 2:49

Attachments:

l3.py

GoogleCodeExporter commented 8 years ago

I also get some compiler warnings, but it seems to work fine here:

>>> import l3e
>>> l3e
<module 'l3e' from 'l3e.so'>
>>> b = l3e.Board(3,4)
>>> b
0000
0000
0000
>>> l3e.discover(b)
0000
0000
0000
>>>

do note that this is probably going to slow down the compiled version a lot 
(why not
keep them global?):

self.NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 
0), (1, 1)]
self.POSITIONS = range(self.ROWS*self.COLUMNS)

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 3:10

GoogleCodeExporter commented 8 years ago

Yes, they slow down everything, but it's only a test, in order to use it with
multiprocess.Pool. Here, on Fedora rawhide, x86_64 doesn't work.

Original comment by frap...@gmail.com on 21 Jan 2010 at 3:51

Attachments:

log.txt

GoogleCodeExporter commented 8 years ago

can't say I've seen that before, on either x86 or x86-64.. this is an 
interesting read:

http://agiletesting.blogspot.com/2007/10/compiling-modpython-on-rhel-64-bit.html

do you have a libpython .so file in /usr/local/lib/python2.6/config/? did you
manually install python?

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 5:57

GoogleCodeExporter commented 8 years ago

I fixed the warnings in SVN, but probably not the error.

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 6:13

GoogleCodeExporter commented 8 years ago

oh and do you have python-dev installed?

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 6:16

GoogleCodeExporter commented 8 years ago

I've manually installed it. I've not libpython.so file in this directory.

Original comment by frap...@gmail.com on 21 Jan 2010 at 6:18

GoogleCodeExporter commented 8 years ago

alright, I'm happy to hear that. if I understand the link correctly, rebuilding 
it
with ./configure --enable-shared should fix the problem.

Original comment by mark.duf...@gmail.com on 21 Jan 2010 at 6:52

GoogleCodeExporter commented 8 years ago

attachment precomputes the neighbours of each square, for a good further speedup
(only about 35% left in 'near'). note that shedskin -bw also helps, and I made 
tuple
hashing a bit faster in SVN.

I also made a version that maintained a 'neighbour count' for each square, but 
that
didn't seem to help much on 4x4 or 4x5.. :/

Original comment by mark.duf...@gmail.com on 22 Jan 2010 at 5:51

Attachments:

l5.py

GoogleCodeExporter commented 8 years ago

did you try --enable-shared yet?

btw, I'm sure you know this one:

http://davidbau.com/archives/2006/07/26/python_curses_life.html

Original comment by mark.duf...@gmail.com on 23 Jan 2010 at 8:47

GoogleCodeExporter commented 8 years ago

I 'modernized' the defaultdict implementation in SVN, so for example
defaultdict(None, ..) now works.. unfortunately the original life.py now still 
needs
the max number of iterations, but even worse it doesn't have a correct estimate 
of
all the types at that point anymore. I will still look into this for 0.4, and
probably add the program to the example set. in any case, thanks for triggering 
me
into cleaning up defaultdict.. there were a lot of other things I had to clean 
up to
make this possible, so I'm quite happy with the result.

Original comment by mark.duf...@gmail.com on 23 Jan 2010 at 11:07

GoogleCodeExporter commented 8 years ago

Original comment by mark.duf...@gmail.com on 23 Jan 2010 at 11:09

Changed title: imprecise result after max iterations

GoogleCodeExporter commented 8 years ago

With --enable-shared all works, thanks :)
Yes, I know the curses_life example ;)
For me, l5 version needs 0.5 seconds (with shedskin) :) Using the new pure 
Python,
I'm at 2 seconds (so, abount 4 times slower, not bad), but I use 
hash(imap(...)) for
hashing, and I get fewer solutions :( With a correct hashing it needs 10 second,
without hashing, it's a bit faster. Hashing seems really important :)

Original comment by frap...@gmail.com on 23 Jan 2010 at 1:12

Attachments:

libpylife.py

GoogleCodeExporter commented 8 years ago

note that I use __eq__ so that it doesn't matter if there are hashing 
collisions.. is
there anything wrong with using classes btw..? ;)

Original comment by mark.duf...@gmail.com on 23 Jan 2010 at 2:18

GoogleCodeExporter commented 8 years ago

No, classes are usefull :) But I'm looking for a "classes-based" solution 
without
performance regression (because it looks slower).

Original comment by frap...@gmail.com on 23 Jan 2010 at 2:53

GoogleCodeExporter commented 8 years ago

note that classes are often slow in CPython (because of dynamic lookups), but 
really
fast after compilation to C++. an extreme example is examples/kaboodle in SVN, 
which
becomes about 350 times faster.

Original comment by mark.duf...@gmail.com on 23 Jan 2010 at 3:00

GoogleCodeExporter commented 8 years ago

I just improved things so defaultdicts can be passed to/from generated extension
modules. on return to cpython, they are still converted to regular non-default 
dicts
though (and shedskin warns about this), so it's not perfect yet. I will have 
another
look at this when it becomes easier to support.

Original comment by mark.duf...@gmail.com on 6 Mar 2010 at 4:53

GoogleCodeExporter commented 8 years ago

I will close this issue when examples/life.py compiles correctly.. hopefully 
for 0.4,
RSN.

Original comment by mark.duf...@gmail.com on 18 Mar 2010 at 5:16

GoogleCodeExporter commented 8 years ago

Original comment by mark.duf...@gmail.com on 18 Mar 2010 at 5:16

Changed state: Started

GoogleCodeExporter commented 8 years ago

alright, it took a while, but examples/life.py now works fine! the solution was 
to
randomize TI a bit, to lower the chance of it getting 'stuck' at any point. I'm 
quite
happy with the improvement, because it is quite general and only took 1 extra 
line in
the end! so thanks again for triggering this. now I'm ready to release 0.4.. :-)

Original comment by mark.duf...@gmail.com on 24 Mar 2010 at 4:11

Changed state: Fixed

cy99 / shedskin

imprecise result after max iterations #54