Figuring out whether a sentence is in CNF as described by the KCM, where variables can't occur more than once in a clause
Figuring out whether a sentence can be fed into a SAT solver
These are at odds with each other. Near-CNF sentences with duplicate variables in a clause are safe for SAT solvers, but because we use is_CNF() internally, they're not used for those.
I'm working on caching sentence properties, and marking sentences as having properties in advance. The sentences produced by nnf.tseitin can safely be marked as CNF, but the sentences produced by dimacs.load (in CNF mode) can't be.
So a looser alternative to the current is_CNF would be useful, if only for internal purposes. (It would also be faster even without caching.)
The reason for the strict definition in the KCM seems to be to make DNF a subset of DNNF and keep CNF symmetrical with DNF. But outside the KCM, the looser definition of CNF (and DNF) seems to be the common one, e.g. Wikipedia, MathWorld.
I think people would expect is_CNF to use the looser definition after all, at least by default. Could the looser behavior give problems?
Would it be a good idea to give is_CNF and is_DNF a strict=False argument so you need to opt in to the KCM definition?
I think that's viable. If there are places in the code that require the stricter KCM interpretation, then it could use the strict=True directive to ensure that things are as they should be.
The
is_CNF
method has (at least) two uses:These are at odds with each other. Near-CNF sentences with duplicate variables in a clause are safe for SAT solvers, but because we use
is_CNF()
internally, they're not used for those.I'm working on caching sentence properties, and marking sentences as having properties in advance. The sentences produced by
nnf.tseitin
can safely be marked as CNF, but the sentences produced bydimacs.load
(in CNF mode) can't be.So a looser alternative to the current
is_CNF
would be useful, if only for internal purposes. (It would also be faster even without caching.)The reason for the strict definition in the KCM seems to be to make DNF a subset of DNNF and keep CNF symmetrical with DNF. But outside the KCM, the looser definition of CNF (and DNF) seems to be the common one, e.g. Wikipedia, MathWorld.
I think people would expect
is_CNF
to use the looser definition after all, at least by default. Could the looser behavior give problems?Would it be a good idea to give
is_CNF
andis_DNF
astrict=False
argument so you need to opt in to the KCM definition?