is_CNF semantics - Githubissues

The is_CNF method has (at least) two uses:

Figuring out whether a sentence is in CNF as described by the KCM, where variables can't occur more than once in a clause
Figuring out whether a sentence can be fed into a SAT solver

These are at odds with each other. Near-CNF sentences with duplicate variables in a clause are safe for SAT solvers, but because we use is_CNF() internally, they're not used for those.

I'm working on caching sentence properties, and marking sentences as having properties in advance. The sentences produced by nnf.tseitin can safely be marked as CNF, but the sentences produced by dimacs.load (in CNF mode) can't be.

So a looser alternative to the current is_CNF would be useful, if only for internal purposes. (It would also be faster even without caching.)

The reason for the strict definition in the KCM seems to be to make DNF a subset of DNNF and keep CNF symmetrical with DNF. But outside the KCM, the looser definition of CNF (and DNF) seems to be the common one, e.g. Wikipedia, MathWorld.

I think people would expect is_CNF to use the looser definition after all, at least by default. Could the looser behavior give problems?

Would it be a good idea to give is_CNF and is_DNF a strict=False argument so you need to opt in to the KCM definition?

QuMuLab / python-nnf

is_CNF semantics #14