Closed malb closed 9 years ago
Hello,
seems to be done in sympy
(which is shipped with Sage)
sage: from sympy.logic.utilities.dimacs import load
sage: load('1 2 \n 3')
And(Or(cnf_1, cnf_2), cnf_3)
But I do not know how sympy
fits with the current SAT stuff in Sage.
Vincent
it's very easy to load/parse: just turn each line into a tuple and feed those to the SatSolver class with add_clause. Something like
for line in dimacs_file:
if line.startswith("c"):
continue # comment
if line.startswith("p"):
continue # header
line = line.split(" "))
clause = map(int,[e for e in line if e])
clause = clause[:-1] # strip trailing zero
solver.add_clause(clause)
This is perhaps off-topic. I have read the dimacs.py
file and i am surprised that the class maintains the data as an opened file (see self._tail
). I am thinking of the case, where the data is a string given by propcalc.formula("a|b").satformat()
, it looks weird to me that the class has to create a file on the disk for that. Is there a reason for that design choice that i could be missing ? Why not maintaining as a list of tuples of integers (corresponding to a list of clauses), and opening a file only when writing and loading data to/from a file ?
Hi, the idea was to avoid having the clauses in memory twice: once in the solver and once (in a bloated format) in Python. The rationale behind this was that in my applications these SAT clauses can be rather big (think 1-2GB).
Author: Travis Scholl
Branch: u/tscholl2/read_dimacs_16924
I added a read
method as described in the comments here. I split it into two methods, a read
method with the usual read(filename)
signature and also a load(file_object)
method which actually reads the lines from the file. This way it's easy to use other kinds of buffers besides files. Also it allows easy testing using fake files from StringIO. The import uses the six
package so the example is compatible with both python 2 and 3. It should read line by line so it shouldn't store the entire files in memory as it sounds like people use somewhat larger files for this.
New commits:
d9274c6 | added satsolver method to read in DIMACS files |
I think file objects are a rare thing in the Sage "API", i.e. as far as I know we don't usually use them. I'm not sure there should be two methods, just one with a filename. Also, aren't "read" and "load" a bit too generic as names given that these are methods on solver instances?
Thanks for the comment.
I wasn't sure about the kind of use case this class gets. You mentioned in a comment that you sometimes use files on the order of gigabytes. I thought it might be useful to have easy access to a method which you only need a stream in case you don't have a disk file (for example a memory mapped file object, a StringIO file object, or some kind of stream from stdin). I don't know if you can open those with fopen
.
However, if you think that most people just read from a file on the disk, then I could easily remove the load
function.
As for the read
name, I used read
because there was already a write
method and it seemed complementary. Do you think parse
is a better name?
You're right read
makes sense given that there is a write
. How about both functions are merged into one? If the parameter is a string it is considered a file name? Otherwise it is considered a file object?
Branch pushed to git repo; I updated commit sha1. New commits:
623cbed | merged read and load methods |
That's a great idea. I merged the methods and uploaded the new version.
Looks good to me.
Reviewer: Martin Albrecht
Docs don't build
Branch pushed to git repo; I updated commit sha1. New commits:
79c7c68 | added an 'r' |
Added an "r" so the doc string is raw text instead of a string so that the "\n"'s in the example don't crash the doc building.
Hello,
The docstring of the function should follow the scheme
def my_function(x,y,z):
r"""
one line description
longer description if needed
EXAMPLES::
sage: my_function(1,2,3)
The following is not helpful in the documentation
See https://github.com/sagemath/sage-prod/issues/16924
It would make sense in an AUTHOR
section. Or within the TESTS
section if it would correspond to some bug fix. Moreover, the syntax is now trac:`16924`
.
In your example, I do not understand what you want to show.
sage: solver.read(file_object)
sage: solver
DIMACS Solver: ''
It seems that nothing happened... Would it be possible to print the clauses?
Vincent
Branch pushed to git repo; I updated commit sha1. New commits:
34d8f00 | fixed docstring and example |
Thanks for the comments!
I changed the doc string as you suggested and changed the example to print out the clauses. It's a lot better for the example.
Please let me know if you notice anything else that doesn't fit in.
I hardly found specification for DIMACS files. In http://people.sc.fsu.edu/~jburkardt/data/cnf/cnf.html they say several things that you are not taking care of:
p cnf NUM_VAR NUM_CLAUSES
(the same format is used for other constraint satisfcation problems)In other words, as far as I understand, this would be valid
c hey
c ho
p cnf 3 2
1 -2 0 3
1 0
Moreover, it is written In some examples of CNF files, the rule that the variables are numbered from 1 to N is not followed. The file might declare that there are 10 variables, for instance, but allow them to be numbered 2 through 11.
It would be nice if you know some link with the exact specification of the format.
Replying to @videlec:
- the comments should only be at the begining
Why enforce that restriction when there is no harm (as far as I can see) in being more liberal?
- there is one line starting with
p cnf NUM_VAR NUM_CLAUSES
(the same format is used for other constraint satisfcation problems)
As far as I know, many parsers these days ignore this (source: I chatted to Máté Soos ages ago, I might be misremembering, though)
- a clause might extend on more than one line and a clause might start on the same line as another one
I did not know about that one.
In other words, as far as I understand, this would be valid
c hey c ho p cnf 3 2 1 -2 0 3 1 0
Replying to @malb:
Replying to @videlec:
I very naively read the web page I mentioned. I am fine with the current state, but I just wanted to be sure that anybody (that uses this format) agrees with this parser. And of course, I an not the required person to judge that.
One option would be to write down in the docstring that the DIMACS format is not well specified.
Vincent
Sorry it took me a while, I'm at a Sage days working on other things. But I asked people here and got directed to a SAT solver competition website http://people.sc.fsu.edu/~jburkardt/data/cnf/cnf.html.
I copied their rules here
The clause about "ending with 0 on the same line" means they don't allow the multi-line clauses. So it sounds like the definition of "DIMACS standards" depends on who you ask.
I will modify the docstring to clearly describe the variation it assumes the file is in.
Should I post links to these pages describing different rules for DIMACS files in the docstring as well?
EDIT: Sorry I realized I copied the wrong link. Here is the corect one: http://www.satcompetition.org/2009/format-benchmarks2009.html
Hi, might as well thrown in the links. Sounds like a good plan.
If, instead of reimplementing it yourself, you prefer to sweat over GLPK's API, then you can try calling "glp_read_cnfsat" [1]. That's a way to delegate the desponsibility of handling all cases :-P
Nathann
Branch pushed to git repo; I updated commit sha1. New commits:
169849b | noted differences in DIMACS file specs in docs |
I added some lines to the documentation for the read method which explains the differences in DIMACS files with links to various "standards". I rebuilt the docs and everything seemed to be formatted alright.
I thought the method Martin wrote is flexible and simple enough that it wasn't worth it to call (and sweat over) GLPK's definition of DIMAC files. Thank you for pointing out another source though Nathann!
Looks good to me.
Changed branch from u/tscholl2/read_dimacs_16924 to 169849b
Sage writes dimacs files (for SAT solvers) but doesn't seem to read them.
CC: @nathanncohen @sagetrac-tmonteil
Component: interfaces: optional
Author: Travis Scholl
Branch/Commit:
169849b
Reviewer: Martin Albrecht
Issue created by migration from https://trac.sagemath.org/ticket/16924