poppopjmp / shedskin

Automatically exported from code.google.com/p/shedskin
0 stars 0 forks source link

new set features in 2.6/2.7 #138

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
isdisjoint method support for set

Original issue reported on code.google.com by jason.mi...@gmail.com on 15 Jun 2011 at 8:23

GoogleCodeExporter commented 8 years ago
thanks a lot for mentioning! also added as an easy task for now.

in general, we may want to check if there are any other new things in 2.6/2.7 
that are currently unsupported, for example by looking here:

http://docs.python.org/library/stdtypes.html

Original comment by mark.duf...@gmail.com on 15 Jun 2011 at 9:58

GoogleCodeExporter commented 8 years ago
patch attached.

Original comment by bpederse on 23 Jun 2011 at 9:06

Attachments:

GoogleCodeExporter commented 8 years ago
as mentioned in 137, this one is pushed. leaving the issue open for a bit 
longer, because I'd like to check if there's no other new set functionality we 
haven't implemented yet.

Original comment by mark.duf...@gmail.com on 23 Jun 2011 at 9:27

GoogleCodeExporter commented 8 years ago
COOL
Thanks all

Original comment by jason.mi...@gmail.com on 24 Jun 2011 at 2:21

GoogleCodeExporter commented 8 years ago
on quick glance, the only other new feature seems to be that methods like 
'update' now accept multiple iteratable arguments..

Original comment by mark.duf...@gmail.com on 24 Jun 2011 at 9:25

GoogleCodeExporter commented 8 years ago

Original comment by mark.duf...@gmail.com on 24 Jun 2011 at 9:25

GoogleCodeExporter commented 8 years ago
This patch isn't a new set feature, but here I have added the ability to do 
set.intersect(pyiter) and set.difference(pyiter) which was not possible before.
there are some other methods that only support a set as the argument where 
python accepts iterables.

Instead of doing: 
    return intersection(new set<T>(s))
in :intersection(pyiter<T> *s), I used a for loop to avoid creating a set. I 
didn't benchmark, but if you have some sense that just creating a set is 
faster, please use that (commented out line) instead.

Original comment by bpederse on 24 Jun 2011 at 3:03

Attachments:

GoogleCodeExporter commented 8 years ago
thanks! :) I committed the patch, but changed the pyiter<T> * arguments to more 
general U * arguments. this way, FOR_IN works much faster, because it doesn't 
have to use the general python iteration protocol (__iter__(), next(), next().. 
StopIteration), and is specialized by the C++ compiler for whatever type of 
argument is passed to the method (for a list that means a simple counter, i=0, 
i++, i++.. :-)).

Original comment by mark.duf...@gmail.com on 25 Jun 2011 at 3:44

GoogleCodeExporter commented 8 years ago

Original comment by mark.duf...@gmail.com on 8 Jul 2011 at 9:24

GoogleCodeExporter commented 8 years ago
as a test, I just added support for two args to set.update:

https://gitorious.org/shedskin/mainline/commit/cc664dfddc4a0dc47118fe379f3b54263
895d114

this should be easily generalized to other methods and fixed small numbers of 
arguments..

arbitrary numbers of arguments cannot really be handled nicely I'm afraid. I 
think this can best wait until we switch to variadic templates in c++0x. 

Original comment by mark.duf...@gmail.com on 12 Jul 2011 at 10:07

GoogleCodeExporter commented 8 years ago
brent, are you interested/do you have time to work on this further for 0.9..?

I'm thinking about using variadic templates in 0.9, when the installed GCC 
supports this. this is useful in other areas as well (printing, list/tuple/etc 
constructors..).

Original comment by mark.duf...@gmail.com on 30 Jul 2011 at 12:27

GoogleCodeExporter commented 8 years ago
sure, i can hack on it a bit today. 
you mean to change more methods to use the schema you mention in comment 8?
which ones would be good to look at?

Original comment by bpederse on 30 Jul 2011 at 12:57

GoogleCodeExporter commented 8 years ago
it looks like: set([2, 3]).intersection([2, 3])
is not supported even though:

template<class T> template <class U> set<T> *set<T>::intersection(U *iter) {

is present. I can't parse that line, what types does it accept?
Error:

 error: no matching function for call to ‘__shedskin__::set<int>::intersection(__shedskin__::list<int>*)’
/usr/local/lib/python2.6/dist-packages/shedskin/lib/builtin.hpp:3626: note: 
candidates are: __shedskin__::set<T>* 
__shedskin__::set<T>::intersection(__shedskin__::set<T>*) [with T = int]
^Cmake: *** [settime] Interrupt

Original comment by bpederse on 30 Jul 2011 at 1:34

GoogleCodeExporter commented 8 years ago
I guess there are quite a few methods that now accept multiple arguments.. 
please have a look at the set documentation to see which ones they are.

yeah, using this schema, or using variadic templates perhaps. though support 
for up to 3 arguments would be very nice already.

the problem you mention was present in 0.8, but not in GIT.. :-)

when one adds a 'template<class X>' to a function/class in C++, the compiler 
generates separate versions for each actual type that is 'put into it'. so 
suppose we call 'intersection' with a list in one place, and with a set in 
another place, the C++ compiler will generate two separate versions of 
'intersection'. so to anwser your question, any type :-)

Original comment by mark.duf...@gmail.com on 30 Jul 2011 at 2:06

GoogleCodeExporter commented 8 years ago
oops, was getting an older version on the system path.
I added 3 args to update and started 2 for intersection, but can't see why it's 
not being used.
Diff attached.

Original comment by bpederse on 30 Jul 2011 at 5:20

Attachments:

GoogleCodeExporter commented 8 years ago
looks good! you will also have to change lib/builtin.py, to 'model' the extra 
args. for update this now looks as follows:

    def update(self, *b):
        self.unit = iter(b).next()

because of the '*', shedskin understands that 'update' acccepts multiple args, 
and makes the type of 'b' the combination of the types of all arguments. so 
iter(b).next() has the types of all the elements of all the args. 

the 'unit' var is used to model the type of the contents of the set. but you 
don't have to worry about this too much, adding a star should usually be enough.

Original comment by mark.duf...@gmail.com on 30 Jul 2011 at 5:31

GoogleCodeExporter commented 8 years ago
patch committed, thanks! ;)

https://gitorious.org/shedskin/mainline/commit/94a976a450f3eff0866acce439f745c8d
e07da97

Original comment by mark.duf...@gmail.com on 1 Aug 2011 at 3:57

GoogleCodeExporter commented 8 years ago
are you interested in adding a bit more.. :-)

Original comment by mark.duf...@gmail.com on 6 Aug 2011 at 11:22

GoogleCodeExporter commented 8 years ago
do you mean more of the 2 or 3-ary methods? i can have a look tomorrow. i think 
those are rarely used though...

Original comment by bpederse on 7 Aug 2011 at 8:49

GoogleCodeExporter commented 8 years ago
yeah.. unless things get really obscure, if it's simple to implement support 
for these, I guess why not.. :-)

Original comment by mark.duf...@gmail.com on 7 Aug 2011 at 8:56

GoogleCodeExporter commented 8 years ago
you'd be surprised how many programs do 1 little rare thing.. :-)

just checking, it looks like these all accept multiple args since 2.6:

update
union
intersection
intersection_update
difference
difference_update

so, it looks like there are only 4 methods left.. :) 

Original comment by mark.duf...@gmail.com on 13 Aug 2011 at 3:25

GoogleCodeExporter commented 8 years ago
it is getting a bit close to 0.9, so I'm planning to implement the rest this 
weekend.. ;-) let me know if you have time and interest to do it within the 
next few days instead..

Original comment by mark.duf...@gmail.com on 26 Aug 2011 at 7:55

GoogleCodeExporter commented 8 years ago
here's one iteration, I'll add the 3 arg case for difference_update. if I send 
anything more, it'll be by late saturday. after that, you're on your own. :)

I added some __eq__ stuff for testing. 
I am using this for testing:

    a = set(range(4))
    a.difference_update(range(2), range(3))
    assert a == set([3])
    #assert a == [3]

why does the last line fail even though I implement __eq__(pyiter *) ?

it's also unclear to me why I have to add the unused int to the signatures, 
that may have broken something else?

Original comment by bpederse on 26 Aug 2011 at 9:31

Attachments:

GoogleCodeExporter commented 8 years ago
thanks! I committed your patch, but without the __eq__ additions.. it doesn't 
really make sense to compare sets with other types of objects I think, since it 
will always result in False.

the unused ints look correct. we need them, because at code generation time we 
don't know how the variable number of arguments is handled, so we have to add a 
counter, even if it's not used right now.

Original comment by mark.duf...@gmail.com on 27 Aug 2011 at 12:10

GoogleCodeExporter commented 8 years ago
I believe all of these are added (using the most naive approach) for up to 3 
args in this patch. 
Note that I commented out the ss_union with signature:

    template<class T> set<T> *set<T>::__ss_union(int, pyiter<T> *s)

in favor of:

    template<class T> template<class U> set<T> *set<T>::__ss_union(int, U *other)

I'm not sure of the necessity of both, so wanted to point it out.

Original comment by bpederse on 27 Aug 2011 at 1:50

Attachments:

GoogleCodeExporter commented 8 years ago
thanks a lot, pushed! :-) 

yeah, the templated version of 'union' seems better!

closing this issue..

Original comment by mark.duf...@gmail.com on 27 Aug 2011 at 4:09