sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.39k stars 473 forks source link

Speed Up Poset Generation #14110

Open 01f5781a-be80-4f39-b75f-3542e9378080 opened 11 years ago

01f5781a-be80-4f39-b75f-3542e9378080 commented 11 years ago

Currently (sage 5.6), Sage generates the posets on n elements by generating all digraphs with n vertices and checking which of those give posets. A paper of Brinkmann and McKay gives an algorithm for generating posets which they used to generate posets up to 16. Sage uses a related algorithm of McKay to generate the digraphs.

B. D. McKay and G. Brinkmann, Posets on up to 16 points, Order, 19 (2002) 147-179

https://doi.org/10.1023/A:1016543307592

CC: @kevindilks @sagetrac-rowland @sagetrac-ahmorales @jm58660 @jasongrout @nathanncohen @dimpase

Component: combinatorics

Keywords: posets

Reviewer: Thierry Monteil

Issue created by migration from https://trac.sagemath.org/ticket/14110

fchapoton commented 11 years ago

Description changed:

--- 
+++ 
@@ -1 +1,3 @@
 Currently (sage 5.6), Sage generates the posets on n elements by generating all digraphs with n vertices and checking which of those give posets. A paper of Brinkmann and McKay gives an algorithm for generating posets which they used to generate posets up to 16. Sage uses a related algorithm of McKay to generate the digraphs.
+
+B. D. McKay and G. Brinkmann, Posets on up to 16 points, Order, 19 (2002) 147-179
fchapoton commented 11 years ago

Changed keywords from none to posets

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:7

Brinkmann was kind and send me C-code that he used. However, on sage-devel list Nathann Cohen said that at least Nauty is not available as GPL and can not be directly added to Sage. As an optional package it should be fine.

To glue C code to Sage or to read the paper and make same code in [c|p]ython?

tscrim commented 10 years ago
comment:8

Since you have (working) specialized C-code, I'd just glue it to Sage using cython (guaranteed to be faster than a pure python implementation). Anyways, that's my opinion.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:9

Replying to @tscrim:

Since you have (working) specialized C-code, I'd just glue it to Sage using cython (guaranteed to be faster than a pure python implementation). Anyways, that's my opinion.

For that I first have to learn how to do glueing... Somewhere in the net should be simple example of making python data structure from C code.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:10

As a quick test: len(Posets(6).list()) took about 10 seconds, reading binary output from Brinkmann's code for posets of size 7 took about 12 seconds. Maybe going to cython will make n=8 to took about same time.

E: About 90% of time is used in making posets from digraphs.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago

Attachment: poset_package.zip

File from Brinkmann. Not GPL!

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:11

I attached the code I got from Brinkmann. It contains a slightly modified version of nauty; hence copyright is free "- - with the exception of sale for profit or application with nontrivial military significance." So it must be made to an optional package.

It compiles with gcc -O4 -o /tmp/posets posets.c nauty_poset.c nautil_poset.c. Code to use it in non-optimal way seems to be quite easy:

import subprocess
def generate_posets(n):  # Add check for the argument etc.
    cmd=["/tmp/posets"]
    args=[str(n), 'o']
    sp=subprocess.Popen(cmd+args, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
    reader=iter(lambda: sp.stdout.read(2+2*n), '')
    while True:
        bin=reader.next()[2:]
        L=[]
        for i in range(0, n):
            t=ord(bin[1+2*i])*256+ord(bin[2*i])
            for j in range(0, n):
                if (t << j) & 32768:
                    L.append((i,j))
        G=DiGraph()
        G.add_vertices(range(0, n))
        G.add_edges(L)
        yield Poset(G, facade = True)

There is much to make it even faster: for example to check linear extension and use FinitePoset directly, or even add to static sparse graphs an initializer that takes plain list as argument.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:12

First test of packing files to .spkg is now attached to this ticket. Needs to add license information etc. Also I will do a test file and one line test script to check output.

However, now somebody could test if also 32-bit Linux and Mac OS X are able to install this.

Line cmd=["/tmp/posets"] must be changed to cmd=[sage.env.SAGE_LOCAL+"/bin/genposets"].

f29946bc-ee7b-48cd-9abc-3445948c551d commented 10 years ago
comment:13

What should be the interface for this? If the package is installed, should Posets(n) use it automatically? What if we later got algorithm for generating for example all lattices or all modular lattices? Maybe this could be something like

Posets.PosetGenerator(n, properties='all')

And then later we could have something like properties='lattice'. Same arguments could be available also on Posets.RandomPoset(). Plain Posets(n) could be modified to use this, if the package is available.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:14

I asked from Brendan McKay "Nauty.h contains copyright with 'exception of sale for profit or application with nontrivial military significance.' Does this apply to whole package?" and he answered: "It applies to all of the nauty files, but for incorporation into Sage you can ignore it.

Another thing, somebody using Mac should check if this compiles. And answer my last question.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:15

Note that nauty is already an optional Sage package, in old-spkg-style format : http://sagemath.org/packages/optional/

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:16

Replying to @sagetrac-tmonteil:

Note that nauty is already an optional Sage package

True, but I think that it does not contain a code for generating posets.

Or do you mean to append this poset generation to nauty package?

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:17

Note that nauty is already an optional Sage package

True, but I think that it does not contain a code for generating posets.

Indeed. Do you think is is possible to rely on the existing nauty package for the nauty-part, and only add the poset-specific stuff ? Maybe it does not worth the work if the nauty part is too modified and is a small part of posetgen.

Also, do you think upstream could officially distribute its poset code (within nauty or in a separate tarball) ?

Or do you mean to append this poset generation to nauty package?

Definitely not, i think we should keep upstream tarball's integrity as much as possible (in particular moving nauty to the new spkg layout could be nice).

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:18

Replying to @sagetrac-tmonteil:

Do you think is is possible to rely on the existing nauty package for the nauty-part, and only add the poset-specific stuff ? Maybe it does not worth the work if the nauty part is too modified and is a small part of posetgen.

nauty_poset.c starts with "modified by Gunnar Brinkmann for use with posets." Hence I guess it is inpractical to merge.

Whole code for poset generation is about 7000 lines, so there is not much space to save.

Also, do you think upstream could officially distribute its poset code (within nauty or in a separate tarball) ?

I haven't asked that. But if they are not going to maintain poset code, then what difference would that do?

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:19

Replying to @jm58660:

nauty_poset.c starts with "modified by Gunnar Brinkmann for use with posets." Hence I guess it is inpractical to merge.

Whole code for poset generation is about 7000 lines, so there is not much space to save.

OK, i was just asking to avoid code duplication (especially if was have to patch it).

Also, do you think upstream could officially distribute its poset code (within nauty or in a separate tarball) ?

I haven't asked that. But if they are not going to maintain poset code, then what difference would that do?

Upstream will be that trac ticket. And each modified version (e.g. to compile on particular architecture) will become the next upstream without clear distinction.

Having a well identified upstream source whose checksum can be checked by anyone is preferable. If i come with a modified code that is attributed to someone else, there is no way to check my modifications (disclaimer: i am not insinuating anything with respect to the current package !). The new layout upstream-unmodified-checkable-source + downstream-readable-patches makes things more clear.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Reviewer: Thierry Monteil

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:20

Replying to @jm58660:

However, now somebody could test if also 32-bit Linux and Mac OS X are able to install this.

This works for me (in the sense : i can compile and run the program, i did not check the results) on Linux 32 bits (Debian wheezy).

However, i can not run the tests because of the #!/usr/bin/sh at the beginning of spkg-check (note that you used #!/bin/sh in spkg-install). Perhaps could you use #!/usr/bin/env sh instead.

Also, it is unlikely this test will work since spkg-install creates a binary named posetgen and spkg-check calls a binary named genposets.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago

Attachment: posetgen.spkg.gz

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:21

I changed package file. Hope it works now.

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago
comment:22

Just an update on [comment:15]: since then, nauty has become a new-style package (#18425).

jdemeyer commented 9 years ago
comment:23

Replying to @jm58660:

I asked from Brendan McKay "Nauty.h contains copyright with 'exception of sale for profit or application with nontrivial military significance.' Does this apply to whole package?" and he answered: "It applies to all of the nauty files, but for incorporation into Sage you can ignore it.

Any idea what "It applies to all of the nauty files, but for incorporation into Sage you can ignore it" actually means? The problem is that it really needs to be GPL v3 to be standard in Sage. And once it's GPL, it's GPL for everybody, not just for Sage.

The "sale for profit" isn't an unlikely scenario given William Stein's plans to make money from SMC.

Final remark: it's allowed by the GPL to run Sage and nauty as separate programs (using Python's subprocess for example), but not to link nauty as a library.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:24

Replying to @jdemeyer:

Any idea what "It applies to all of the nauty files, but for incorporation into Sage you can ignore it" actually means? The problem is that it really needs to be GPL v3 to be standard in Sage. And once it's GPL, it's GPL for everybody, not just for Sage.

The "sale for profit" isn't an unlikely scenario given William Stein's plans to make money from SMC.

And even restriction to non-military applications is a problem. Who can be sure that there will never be any military application for posets?

Do we have someone who is better to discuss with McKay about these license issues?

jdemeyer commented 9 years ago
comment:25

Replying to @jm58660:

I asked from Brendan McKay "Nauty.h contains copyright with 'exception of sale for profit or application with nontrivial military significance.' Does this apply to whole package?" and he answered: "It applies to all of the nauty files, but for incorporation into Sage you can ignore it.

Can you give the complete contents of this email exchange with McKay?

jdemeyer commented 9 years ago
comment:26

Replying to @jm58660:

Do we have someone who is better to discuss with McKay about these license issues?

There is not really that much to discuss. Either McKay wants to relicense nauty under GPL v3 or he does not want that. Only in the former case can we include it as standard Sage package.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:27

Replying to @jdemeyer:

Replying to @jm58660:

I asked from Brendan McKay "Nauty.h contains copyright with 'exception of sale for profit or application with nontrivial military significance.' Does this apply to whole package?" and he answered: "It applies to all of the nauty files, but for incorporation into Sage you can ignore it.

Can you give the complete contents of this email exchange with McKay?

I already gave... (Except a trivial ping-message and a start "Sorry, I forgot to answer.")

jdemeyer commented 9 years ago
comment:28

The sentence "It applies to all of the nauty files, but for incorporation into Sage you can ignore it." really is the whole answer? :-)

Given that he is OK with Nauty being in Sage and given that people might use Sage commercially or for military applications, this should logically mean that he would agree to license Nauty under the GPL v3. Somebody just needs to ask...

f29946bc-ee7b-48cd-9abc-3445948c551d commented 9 years ago
comment:29

Replying to @jdemeyer:

The sentence "It applies to all of the nauty files, but for incorporation into Sage you can ignore it." really is the whole answer? :-)

Yes.

Given that he is OK with Nauty being in Sage and given that people might use Sage commercially or for military applications, this should logically mean that he would agree to license Nauty under the GPL v3. Somebody just needs to ask...

And what is more: Sage is free, so I can take any part of Sage and use it as a part of some other (free) program that is used to prepare against another shelling of Mainila. Free software can be splitted, not only combined.

I am quite sure that he did not think about that. Not in a way or another.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago
comment:30

About license: Dima has asked McKay, who said that it is ok for him, IF Brinkmann do not complain. So we still wait to see if this can be added as a standard package.

dimpase commented 8 years ago
comment:31

Replying to @jm58660:

About license: Dima has asked McKay, who said that it is ok for him, IF Brinkmann do not complain. So we still wait to see if this can be added as a standard package.

Jeroen, isn't Brinkmann your colleague at UGent? Could you perhaps buy him a biertje so that he releases his related code under GPL, too?

jdemeyer commented 8 years ago
comment:32

Replying to @dimpase:

Replying to @jm58660:

About license: Dima has asked McKay, who said that it is ok for him, IF Brinkmann do not complain. So we still wait to see if this can be added as a standard package.

Jeroen, isn't Brinkmann your colleague at UGent?

Well, he's in a different department, but I could ask him, yes.

dimpase commented 8 years ago
comment:33

Replying to @jdemeyer:

Replying to @dimpase:

Replying to @jm58660:

About license: Dima has asked McKay, who said that it is ok for him, IF Brinkmann do not complain. So we still wait to see if this can be added as a standard package.

Jeroen, isn't Brinkmann your colleague at UGent?

Well, he's in a different department, but I could ask him, yes.

I forwarded you my latest email to him.

f29946bc-ee7b-48cd-9abc-3445948c551d commented 8 years ago
comment:34

We can proceed to make posetgen a standard package, see https://groups.google.com/d/msg/sage-devel/VuNMvS9jyH8/fcUhKmraBAAJ

fchapoton commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,5 @@
 Currently (sage 5.6), Sage generates the posets on n elements by generating all digraphs with n vertices and checking which of those give posets. A paper of Brinkmann and McKay gives an algorithm for generating posets which they used to generate posets up to 16. Sage uses a related algorithm of McKay to generate the digraphs.

 B. D. McKay and G. Brinkmann, Posets on up to 16 points, Order, 19 (2002) 147-179
+
+https://doi.org/10.1023/A:1016543307592
slel commented 3 years ago
comment:36

Recently active topic on Math Overflow.