I'm really not a big fan of type declarations in any form (I actually started
shedskin to avoid them.. :P), and judging from the examples/ dir, I think
usually it's not that bad. though I'd be interested to see what you end up
with..
for speed, I would much prefer to use a profiler or storing analysis results
between compilation sessions to aid type inference (see the latest thread in
the discussion group).
Original comment by mark.duf...@gmail.com
on 19 Aug 2010 at 8:25
I'm sure it's not bad in the examples directory, but examples are generally by
their nature small. This would help more when things are large.
It's not like I don't have to do extra type declarations--it's just that this
way I could do small, targeted declarations where they'd do the most good,
instead of constructing artificial main functions of
type-correct-but-generally-garbage pseudo-code down at the bottom of the file.
I construct code that, while wrong, provides enough signals that type inference
can figure out what's what. That's just a big, ugly, inefficient type
declaration.
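To illustrate the kind of pseudo-main described above, here is a minimal sketch. The module and the function names `scale` and `total_length` are hypothetical and not taken from any real project; the point is only that the calls under `__main__` exist to feed type inference, not to do useful work.

```python
# Hypothetical library module meant to be compiled as an extension module.

def scale(points, factor):
    # points: list of (float, float) tuples; factor: float
    return [(x * factor, y * factor) for (x, y) in points]

def total_length(points):
    # Sum of Manhattan distances between consecutive points.
    total = 0.0
    for i in range(1, len(points)):
        dx = abs(points[i][0] - points[i - 1][0])
        dy = abs(points[i][1] - points[i - 1][1])
        total += dx + dy
    return total

if __name__ == '__main__':
    # Pseudo-main: the values are meaningless. Each call is here only so
    # that type inference sees one concrete call per function.
    pts = scale([(0.0, 0.0)], 1.0)
    total_length(pts)
```

Every function that is only called from the CPython side needs such a call, so the block at the bottom grows with the size of the module.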
Anyway, this is just one possibility, though I'll bet it'd be much simpler to
get in than that live type analysis you were posting about. Granted, the live
type analysis will probably be more fun to code.
But wouldn't live type analysis only help once a program was fully
shedskin-compatible? I'd like to have the type hints to help me in my porting.
Currently if I haven't quite covered things in my pseudo-main, sometimes it's
hard to figure out what I've messed up. This would make it easier to localize
the problem case, like the optional type restrictions in Haskell. Without
them, you might look all day for the real problem, since the system can't tell
where the true error is, only where it was when it hit a contradiction.
Original comment by uran...@gmail.com
on 19 Aug 2010 at 8:42
Original comment by mark.duf...@gmail.com
on 21 Aug 2010 at 8:15
well, the examples are over 10,000 lines in total (sloccount). I guess the
problem mostly occurs when generating extension modules, where the logic of
calling things occurs later, on the CPython side. a simple workaround could be
to move enough logic to the extension module, for type inference not to need
the ugly 'type models'. like loading a scene from disk and starting the actual
raytracing.
I agree that looking at assert statements and such could avoid type models in
some cases (though not all - what if a function also accepts a list-of-int
argument? or an int? '1' is shorter than 'assert isinstance(arg, int)'), but
again I'm not really interested in going down this path at this point. it just
doesn't look like it's worth the effort and added complexity.
I'm not sure I understand your point about the live analysis, but of course a
tool that can annotate python source code with types could be very useful also
without shedskin ;)
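For comparison, a rough sketch of the two styles discussed above. The functions `double` and `double_all` are hypothetical, and the assert lines are plain Python here: nothing in this thread suggests that shedskin treats them as type declarations.

```python
def double(arg):
    # Assert-style hint for a single int argument.
    assert isinstance(arg, int)
    return arg * 2

def double_all(args):
    # The list-of-int case is clumsier to state as an assertion.
    assert isinstance(args, list) and all(isinstance(a, int) for a in args)
    return [a * 2 for a in args]

# 'Type model' style: one call with literals conveys the same information.
double(1)
double_all([1])
```

The literal calls are shorter than the assertions, which is the trade-off raised above.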
Original comment by mark.duf...@gmail.com
on 28 Aug 2010 at 10:37
Currently, many of the programs I write and use shedskin on have several lines
that are basically type declarations, but get compiled into C++ anyway. For
instance, shedskin was having trouble understanding what a list comprehension
outputs, so I had to add a line before it (it would put ERROR in the .ss.py):
positions = [(0,0)]
positions = [(0,0) for i in range(length)]
This would normally be fine with me, but it gets translated into the C++:
positions = (new list<tuple2<__ss_int, __ss_int> *>(1, (new tuple2<__ss_int, __ss_int>(2, 0, 0))));
positions = list_comp_0(length);
And that's on the inside of a for loop that could repeat millions of times,
causing a horrendous memory leak (why doesn't it free() it?) and speed
degradation.
If I could somehow actually declare the type of positions I'd have a slightly
faster output and shedskin wouldn't need any templates to figure out
positions's type. I believe this is a sufficient example to prove type
declaration can be useful in any code; mine in particular is only 47 lines.
Original comment by fahh...@gmail.com
on 3 Oct 2010 at 6:06
thanks for the feedback. you have probably run into a bug, because shedskin
shouldn't need any such type declarations.. please consider opening an issue
for your program(s), so I can have a look!
note that in the shedskin example programs, there are about 50 programs, at a
total of over 10,000 lines (sloccount), that work without any form of such type
hints, except for a few lines where it really is unavoidable for type inference
to work (for example, when we build an extension module, we cannot do without a
'fake main').
btw, free() is not necessary for shedskin generated code, because it uses the
Boehm GC, which automatically frees memory when it becomes unreachable.
Original comment by mark.duf...@gmail.com
on 3 Oct 2010 at 8:44
Original comment by mark.duf...@gmail.com
on 5 Oct 2010 at 2:06
It may be that they all work, but a large problem I have is that if shedskin
uses too much memory it starts swapping and my computer dies. I do all my Linux
dev on a desktop with only 512MB RAM which is normally sufficient with Linux
tools, but shedskin, unless I've done a lot of work to limit the templates
available, easily goes up to 50% or more, at which point my computer locks up
and only through plenty of Ctrl-C presses and a few minutes do I get out of
Shedskin. Adding type hints would provide the ability to decrease the time
taken inferring some things.
One simple example is this:
I have a module I've already shedskin'd with a function
LJ_Minimum_Image_forces, and then I replaced the .py file with an
inference-only stub that contains as few lines as I could manage:
class F_kls():
    def __init__(self, x=[0.0],y=[0.0],z=[0.0]):
        self.x=x
        self.y=y
        self.z=z

def LJ_Minimum_Image_forces(posx=[0.,0.],posy=[0.,0.],posz=[0.,0.],lcfg=2,r_lcfg=range(2),L=5.0):
    # calculate force at t
    Fx=Fy=Fz = [0. for i in r_lcfg]
    v_lattice = 0
    return (F_kls(Fx,Fy,Fz),v_lattice)

if __name__=='__main__':
    F,V = LJ_Minimum_Image_forces([0.,0.],[0.,0.],[0.,0.],2,range(2),5.0)
    Fx,Fy,Fz = F.x,F.y,F.z
Shedskin can handle this file as well as the original, but the moment I call
this function in another file, Shedskin blows up.
If I was able to simply say LJ_Minimum_Image_forces takes
(list(float),list(float),list(float),int,list(int),float) and returns
(F_kls,float), I would be able to compile other modules that use this.
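Written with Python 3 function annotations, such a declaration might look like the sketch below. This is purely illustrative: nothing in this thread indicates that shedskin reads annotations, the use of the `typing` module is an assumption, and the body is only a stand-in for the real function.

```python
from typing import List, Tuple

class F_kls:
    def __init__(self, x: List[float], y: List[float], z: List[float]) -> None:
        self.x = x
        self.y = y
        self.z = z

def LJ_Minimum_Image_forces(posx: List[float], posy: List[float],
                            posz: List[float], lcfg: int,
                            r_lcfg: List[int], L: float) -> Tuple[F_kls, float]:
    # Stub body: one zero force component per lattice index.
    Fx = [0.0 for _ in r_lcfg]
    Fy = [0.0 for _ in r_lcfg]
    Fz = [0.0 for _ in r_lcfg]
    v_lattice = 0.0
    return (F_kls(Fx, Fy, Fz), v_lattice)
```

With the signature stated up front, no default arguments or fake call would be needed just to convey the argument and return types.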
In general, shedskin re-infers modules unnecessarily. If it could accept an
annotated file (like the files it outputs), it could also save time by not
re-processing files when shedskin'ing other files.
I have narrowed the problem down to that single call using #{ #} around my code
but I can't tell how much it actually blows up because my hard drive lights up
a constant red for as long as 10 minutes after the first Ctrl-C.
Original comment by fahh...@gmail.com
on 16 Oct 2010 at 4:00
thanks for the feedback. shedskin 0.6 should take _much_ less memory than 0.5
in many cases.. do you still see the problem with 0.6..? if so, could you
please send me a complete program that 0.6 has trouble with? thanks! :)
Original comment by mark.duf...@gmail.com
on 18 Oct 2010 at 2:16
issue 105 is a duplicate of this issue, with a suggestion to use some new
python 3 pep dealing with type declarations.
Original comment by mark.duf...@gmail.com
on 27 Oct 2010 at 1:24
I think we can close this one.. we ended up compiling pylot with almost no
type declarations (uranium), and since 0.6, shedskin takes only a few hundred
MB of memory for even the largest examples (fahhem).. please reopen if you
disagree.
Original comment by mark.duf...@gmail.com
on 27 Feb 2011 at 12:35
Original issue reported on code.google.com by
uran...@gmail.com
on 19 Aug 2010 at 5:01