sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.41k stars 475 forks source link

Improvements to SetPartitions #15143

Closed tscrim closed 11 years ago

tscrim commented 11 years ago

This ticket does the following:

Apply: attachment: trac_15143-set_partitions-folded-ts.2.patch

Depends on #14234

CC: @sagetrac-sage-combinat @zabrocki

Component: combinatorics

Author: Travis Scrimshaw

Reviewer: Darij Grinberg, Mike Zabrocki

Merged: sage-5.13.beta0

Issue created by migration from https://trac.sagemath.org/ticket/15143

darijgr commented 11 years ago
comment:2

Here is a started review. Before I continue, I'd like you to address some questions:

tscrim commented 11 years ago
comment:3

Thanks for looking at this patch.

Replying to @darijgr:

Here is a started review. Before I continue, I'd like you to address some questions:

  • Your __mul__ method gives the same result as the existing inf method, but seems to be faster at arriving there. Better to reimplement the latter as an alias for the former? If you do so, the docstrings should probably be merged.

I'll take a look at it (although both of these methods were prior implementations).

  • I don't understand the docstring of ordered_set_partition_action, nor do I understand the definition given in [LM2011] (but I have not really tried). Maybe you can improve it still? What are A_{i_1}, A_{i_2} etc.?

I'll expand on the docstring.

  • I have replaced the docstring and the implementation of is_atomic to something that matches the definition of "atomic" in the context of multiplicative generators of NCSym. I hope this is what you really meant. First, the empty set partition is no longer atomic; second, [[1, 4], [2], [3]] is now atomic despite max(2) < min(3) because there is a max(1, 4) to spoil the game.

I was worried I oversimplified it. Thanks for fixing it.

Let's see if we can close a small can of worms. Well, since 1 usually isn't considered as a generator of a unital algebra, it's okay that the empty set partition is not atomic. I'm assuming that similarly 1 is not considered a primitive of a Hopf algebra. However by the definition directly, it would be considered atomic since there is no non-trivial split, but I'll change it to state explicitly we don't consider it atomic. This is similar to how we don't consider 1 as being prime.

  • I have removed the part about set compositions in the docstring of __mul__ because the method only appears on the SetPartition class.

  • I see you have changed the parent of all set partitions to not include the base set. I am not saying this is wrong; just letting you know that I cannot fully test the consequences of this change and so my review will be somewhat incomplete even when it is finished.

I made sure to put in some tests for things like equality and for the is_less_than and is_strict_refinement methods when comparing objects with different parents. Doing it this way will net a (very) small speedup in creation since we don't have to find the right parent by taking the union of sets (I didn't measure). It also made me feel more secure in my implementation of NCSym.

I'll post the new version of the patch tomorrow.

darijgr commented 11 years ago
comment:4

Oops, I didn't realize __mul__ was there before you. Anyway, I'll take a look at it if you don't; it sounds like a cheap speedup one can get from just throwing away old code.

The Bergeron-Zabrocki paper you linked me to (thank you!) says "A is called atomic if it is non-empty and not splittable"; note the "non-empty" here. But apparently some other sources don't require non-emptiness. As far as I am concerned, the "non-empty" convention makes more sense; these are supposed to be a free monoid generating set of the monoid of all set partitions, so there is no reason the unity of the monoid should be there.

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago
comment:6

I started testing parts of this patch and found a minor bug that the partition type isn't checked.

sage: SetPartition([[1,2],[4]]) in SetPartitions([1,2,4],[3])
True

The other parts that I did test are looking good.

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago
comment:7

Also, I've found that SetPartition accepts the following bad input. Maybe it should raise an error.

sage: SetPartition([[1,2],[2,3]])
{{1, 2}, {2, 3}}
darijgr commented 11 years ago
comment:8

I understand this as a tradeoff with speed -- maybe add a check variable which is set to True by default, and go over all existing code that involves set partitions (hopefully not that much, and some of it will have to be modified anyway since #15143 breaks some doctests by changing output order) to make all calls to SetPartition call it with False?

Partition algebras are HUGE; I definitely wouldn't want to see them slowed down by input sanitizing.

tscrim commented 11 years ago
comment:9

Attachment: trac_15143-set_partitions-ts.patch.gz

Here's the new version with better containment checks and doctest corrects to partition algebras.

For patchbot:

Apply: trac_15143-set_partitions-ts.patch

tscrim commented 11 years ago

Dependencies: #14234

darijgr commented 11 years ago
comment:10

Thanks for the fixes. In other news, I am now sufficiently convinced of the superiority of __mul__ that I'm replacing inf by an alias to __mul__:

sage: %timeit SetPartition([[1,3],[2,6,5]]).__mul__(SetPartition([[1,2],[3],[6]]))                                               
1000 loops, best of 3: 1.61 ms per loop
sage: %timeit SetPartition([[1,3],[2,6,5]]).inf(SetPartition([[1,2],[3],[6]]))    
100 loops, best of 3: 1.96 ms per loop
sage: %timeit SetPartition([[1],[2,3]]).__mul__(SetPartition([[1,2],[3]]))        
1000 loops, best of 3: 1.3 ms per loop
sage: %timeit SetPartition([[1],[2,3]]).inf(SetPartition([[1,2],[3]]))    
1000 loops, best of 3: 1.52 ms per loop
sage: %timeit SetPartition([[1],[2],[3]]).__mul__(SetPartition([[1],[2],[3]]))
1000 loops, best of 3: 1.92 ms per loop
sage: %timeit SetPartition([[1],[2],[3]]).inf(SetPartition([[1],[2],[3]]))    
100 loops, best of 3: 2.37 ms per loop
sage: %timeit SetPartition([[1],[2],[3],[4],[5],[6],[7]]).__mul__(SetPartition([[1],[2],[4],[5],[8],[9],[3]]))
100 loops, best of 3: 5.61 ms per loop
sage: %timeit SetPartition([[1],[2],[3],[4],[5],[6],[7]]).inf(SetPartition([[1],[2],[4],[5],[8],[9],[3]]))    
100 loops, best of 3: 9.28 ms per loop
sage: %timeit SetPartition([[1,3,37]]).__mul__(SetPartition([[13,3,7]]))                                      
1000 loops, best of 3: 662 us per loop
sage: %timeit SetPartition([[1,3,37]]).inf(SetPartition([[13,3,7]]))    
1000 loops, best of 3: 744 us per loop

I think the check variable does nothing when one generates a set partition. I'd be happy if you could see if there is something gained by omitting the check, and if so, making the variable work. But I must admit you have sped up the generation of a set partition by a factor of about 2 (compared to pre-#15143) already, so the net effect is a positive one even if the check is left in.

I still don't understand the doc of ordered_set_partition_action. What is unclear to me is this:

        `A_{ \{ i_1, i_2, \ldots, i_k \} }` denote the sub-partition
        `\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\}`.

Other than this method, I'll call this review done. Unfortunately, I'm out of leeway now as far as free time is concerned, and someone else will likely have to review the NCSym patch.

darijgr commented 11 years ago

review patch for the new version

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago

Attachment: trac_15143-rev1-dg.patch.gz

Attachment: trac_15143-docchanges-mz.patch.gz

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago
comment:11

I have some additional doc changes.

I have one additional recommendation. I think you should use the name size instead of weight. There was no method before this patch for computing the size of a set partition, but it could be done indirectly by A.to_partition().size() or A.to_permutation().size(). So why call the "size" in this combinatorial object the "weight" of a set partition?

If you are going to call anything the "weight" of the set partition I would think it would be the corresponding integer partition (which you called the "shape").

tscrim commented 11 years ago
comment:12

Hey Darij,

Hopefully this example will clarify, consider A = {{1, 4}, {3}, {2,5}}, then A_{1,3} = {{1, 4}, {2, 5}} and A_{1} = {{1, 4}} (I've omiting the \ and subscript {} for readability). The check is currently extraneous, but I'll pass it along to the constructor.

Hey Mike,

Alright, and I'll add some more documentation specifying that we are using the terminology for partitions for size and length.

I'll post a new version later tonight.

Thanks,

Travis

darijgr commented 11 years ago

rest of the review. apply after my rev1 and mike's doc patch

darijgr commented 11 years ago
comment:13

Attachment: trac_15143-rev2-dg.patch.gz

Hi Travis,

thanks for the example; I didn't expect that A_i would really mean the i-th part of A (with A being unordered, this sounded a bit arbitrary to me). I think I've made the doc for this method a lot more readably now (there was some index confusion with both A and s having the same length q, and a condition r < |S| which I think is completely out of place). Please also look at the pipe method I've added (mainly because its docstring clarifies the notation used in other methods). If you agree with Mike's and my changes, this is a positive_review.

Best regards, Darij

tscrim commented 11 years ago

Reviewer: Darij Grinberg, Mike Zabrocki

tscrim commented 11 years ago
comment:14

Hey Darij and Mike,

Here's a patch with all of the changes folded in, along with:

If you're happy with my tweaks, then it's positive_review time.

Thanks,

Travis

tscrim commented 11 years ago

Description changed:

--- 
+++ 
@@ -6,3 +6,4 @@
 - Implements a fully consistent string representation.
 - Adds multiple methods, including accessing the base set for the set partition.

+Apply: [attachment: trac_15143-set_partitions-folded-ts.patch](https://github.com/sagemath/sage-prod/files/10658388/trac_15143-set_partitions-folded-ts.patch.gz)
tscrim commented 11 years ago
comment:15

For patchbot:

Apply: trac_15143-set_partitions-folded-ts.patch

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago
comment:16

I think that this looks great and I am ready to put a positive review but I am getting a doc test failure "File "combinat/diagram_algebras.py", line 193"

darijgr commented 11 years ago
comment:17

Hi Travis,

You don't want s to be an "ordered set partition (i.e., set composition) of [r] where r\leq k"; you want it to be an ordered set partition of a SUBSET of [k] (which may and may not be an initial interval). See this doctest:

t = OrderedSetPartition([[2], [3,4]])

where it is not an initial interval. Generally, your r's should be k's.

I'm fine with the rest of your changes. Once you've fixed the ordered set partition action docstring, you can set this to positive_review as far as I am concerned. Sorry that I can't help with zabrocki's doctest failure as I'm not near sage right now.

Best regards (also thanks for the ordered set partition review), Darij

tscrim commented 11 years ago
comment:18

Attachment: trac_15143-set_partitions-folded-ts.patch.gz

Hey Darij,

Yes, good point. Fixed.

Hey Mike,

It seems like a failure due to dictionary display. I'm returning a sorted list, so it should be fixed now.

d4d9e38a-6e64-40d7-a7f7-bd828eb9e0db commented 11 years ago
comment:19

Looks great to me and all tests pass.

tscrim commented 11 years ago
comment:20

Thank you both!

fchapoton commented 11 years ago
comment:21

for patchbots

apply trac_15143-set_partitions-folded-ts.patch

jdemeyer commented 11 years ago
comment:23

Please add a proper commit message. In particular, imported patch trac_15143-docchanges-mz.patch shouldn't be there.

jdemeyer commented 11 years ago
comment:24

Also, the pdf documentation doesn't build:

! LaTeX Error: Bad math environment delimiter.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...

l.48086 where $s = \(
                     s_1, s_2, \ldots, s_q\)$. Here, the pipe symbol
darijgr commented 11 years ago

Attachment: trac_15143-set_partitions-folded-ts.2.patch.gz

docstring fixed, and with proper commit msg

darijgr commented 11 years ago
comment:25

Fixed!

for the patchbot:

apply trac_15143-set_partitions-folded-ts.2.patch

darijgr commented 11 years ago

Description changed:

--- 
+++ 
@@ -6,4 +6,4 @@
 - Implements a fully consistent string representation.
 - Adds multiple methods, including accessing the base set for the set partition.

-Apply: [attachment: trac_15143-set_partitions-folded-ts.patch](https://github.com/sagemath/sage-prod/files/10658388/trac_15143-set_partitions-folded-ts.patch.gz)
+Apply: [attachment: trac_15143-set_partitions-folded-ts.2.patch](https://github.com/sagemath/sage-prod/files/10658389/trac_15143-set_partitions-folded-ts.2.patch.gz)
tscrim commented 11 years ago
comment:27

Thanks Darij.

jdemeyer commented 11 years ago

Merged: sage-5.13.beta0