Error when reading a CNF formula from file

ElisabettaV4t1qbit commented 1 year ago

Since version 0.1.7.dev21, reading a formula using pysat.formula.CNF(from_file=<file name>) produces the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 formula = CNF(from_file="example_problem.cnf")

File ~/.pyenv/versions/3.9.1/envs/<my-env>/lib/python3.9/site-packages/pysat/formula.py:432, in CNF.__init__(self, from_file, from_fp, from_string, from_clauses, from_aiger, comment_lead)
    429 self.comments = []
    431 if from_file:
--> 432     self.from_file(from_file, comment_lead, compressed_with='use_ext')
    433 elif from_fp:
    434     self.from_fp(from_fp, comment_lead)

File ~/.pyenv/versions/3.9.1/envs/<my-env>/lib/python3.9/site-packages/pysat/formula.py:483, in CNF.from_file(self, fname, comment_lead, compressed_with)
    450 """
    451     Read a CNF formula from a file in the DIMACS format. A file name is
    452     expected as an argument. A default argument is ``comment_lead`` for
   (...)
    479         >>> cnf2 = CNF(from_file='another-file.cnf')
    480 """
    482 with FileObject(fname, mode='r', compression=compressed_with) as fobj:
--> 483     self.from_fp(fobj.fp, comment_lead)

File ~/.pyenv/versions/3.9.1/envs/<my-env>/lib/python3.9/site-packages/pysat/formula.py:523, in CNF.from_fp(self, file_pointer, comment_lead)
    520         elif not line.startswith('p cnf '):
    521             self.comments.append(line)
--> 523 self.nv = max(map(lambda cl: max(map(abs, cl)), itertools.chain.from_iterable([[[self.nv]], self.clauses])))

File ~/.pyenv/versions/3.9.1/envs/<my-env>/lib/python3.9/site-packages/pysat/formula.py:523, in CNF.from_fp.<locals>.<lambda>(cl)
    520         elif not line.startswith('p cnf '):
    521             self.comments.append(line)
--> 523 self.nv = max(map(lambda cl: max(map(abs, cl)), itertools.chain.from_iterable([[[self.nv]], self.clauses])))

ValueError: max() arg is an empty sequence

alexeyignatiev commented 1 year ago

@ElisabettaV4t1qbit, thanks for reporting. At this stage, I can't confirm the issue because formula parsing works fine on my machines. Can you upload your formula so that I see what fails and why?

ElisabettaV4t1qbit commented 1 year ago

Thank you for your answer. I added to my PR the file example_problem.cnf that generates the error I reported above by running:

from pysat.formula import CNF
formula = CNF(from_file="example_problem.cnf")

I have MacOS Ventura 13.5.2, and I am installing python-sat in a virtual environment based on python 3.9.1. Please let me know if you need further information.

alexeyignatiev commented 1 year ago

Thank you, I have checked the file, @ElisabettaV4t1qbit. It does not seem to respect the format, i.e. it is not surprising that the parser fails.

ElisabettaV4t1qbit commented 1 year ago

Are you referring to the last 2 lines in my file?

%
0

After I removed them, the last versions of the package (python-sat==0.1.8.dev9) actually worked. But those lines are present in almost all the instances of one of the most famous and used (to the best of my knowledge) SAT problems library: SATLIB - Benchmark problems. Wouldn't be better if the code allowed more flexibility in the formula format?

alexeyignatiev commented 1 year ago

Well, as far as I can see (I've manually checked a number of benchmark families available there), you are talking about uf* benchmarks only. According to the description, they were generated in 1991, i.e. before the format was proposed (it dates back to 1993). To be honest, I am not entirely convinced that it is better and simpler to support files in a wrong format rather than update the affected files.

alexeyignatiev commented 1 year ago

@ElisabettaV4t1qbit, here is a trivial Perl script that fixes the issue in the files (a Python equivalent is also straightforward):

#!/usr/bin/env perl

while(<>) {
    if (m/^c/) { print; }
    elsif (m/\%/) { last; }
    else { print; }
}

As I mentioned above, I do believe it is a much better practice to fix CNF instances written in a wrong format rather than update the toolkit to support those incorrect formulas.

ElisabettaV4t1qbit commented 1 year ago

Thank you very much for you suggestion!

v-sivak commented 10 months ago

I am a first-timer to SAT problems, and ran into exact same issue with that same dataset. Thank you for clarifying and documenting it here!

pysathq / pysat

Error when reading a CNF formula from file #146