marcthurley / sharpSAT

The #SAT solver sharpSAT
MIT License
55 stars 24 forks source link

Non user-unfriendly code in CNF parser #14

Open msoos opened 2 years ago

msoos commented 2 years ago

Hi,

As you know, I love sharpSAT, however, I recently bumped into this issue, and it's uinfriendly to users, so I suggest changing. Basically, the line here:

https://github.com/marcthurley/sharpSAT/blob/edfbde3424ce17d72d7f8d8f5b8681f2247f4932/src/instance.cpp#L331

makes sure that only nCls clauses are parsed (not precisely, see below). This is an issue, because in case the header says fewer clauses than actually in the file (i.e. incorrect header), sharpSAT will not parse more than nCls clauses, and will likely return a larger count than if it had parsed all clauses. Of course, the CNF is incorrect. The header should be correct, but the user may have made an honest mistake, and sharpSAT could warn them about it. But unfortunately, sharpSAT does not warn, and skips the rest of the CNF :( Many modern SAT solvers either ignore the number of clauses, or exit with error (the kissat and CaDiCaL way is to exit with error).

An insidious bug related to this is that sharpSAT will actually parse more clauses than claimed in the header, in case the clause is of the form a -a ... -- in these cases, skip_clause will be set:

https://github.com/marcthurley/sharpSAT/blob/edfbde3424ce17d72d7f8d8f5b8681f2247f4932/src/instance.cpp#L344

and the clauses_added count will not be increased:

https://github.com/marcthurley/sharpSAT/blob/edfbde3424ce17d72d7f8d8f5b8681f2247f4932/src/instance.cpp#L354

and so sharpSAT will parse more clauses than in the header. Of course, the CNF is incorrect either way, however, this makes it less likely that the user will understand what's happening :disappointed:

I am currently changing GANAK to exit with an error message in these cases. We have been chasing our tails regarding an issue related to a file with incorrect header with GANAK: https://github.com/meelgroup/ganak/issues/21 :sob:

Thanks again for sharpSAT, it's amazing!

Mate

haz commented 2 years ago

Shouldn't there be some sort of quick CNF validator utility that any/all dimacs-based software can front-load? A few lines of properly crafted regex would probably do it...

msoos commented 2 years ago

Yes, it would do it. Also, nobody would use it. You can tell people all day long about how they ought to buy and install a seatbelt in their cars, unless you include it in the car, almost nobody will even install it, much less use it.

The question is not whether it's possible to check this. The question is whether it's user-friendly not to include it out-of-the-box.

I work in IT security, and read a lot about safety critical systems. All scientific literature on the subject points to systems that can be accidentally misused without warning will eventually be accidentally misused, and if that can lead to a catastrophe, catastrophe will happen. It's not "I think" or "I believe", it's just pure science. I suggest reading e.g. "The Field Guide to Understanding Human Error" as an introductory book, and then if you want to go down the deep end of safety science, read some of the books recommended by the author.

haz commented 2 years ago

Meh, alternative is a custom fix on every bit of software that parses dimacs CNF. I'm a big fan of write once, and attempt to use everywhere. If a project isn't going to adopt a freely offered pre-processor to double check the input files, I'd reckon they'd be reluctant to custom build something.

Either way, congrats on understanding/squashing what looks like a brutal bug (without segfaults, what's one to do!). I'm just a long-time lurker when it comes to sharpSAT, so will leave it to @marcthurley to call the shots over here ;).

msoos commented 2 years ago

The fix is less than 5 lines of C that will exit on incorrect CNF header. It's a lot shorter and easier than all the complicated philosophy that we wrote here. Adds nearly zero complexity or overhead. Anyway, it's trivial to fix, the question is, whether we want to put a fence on the balcony at the cost of 2 conditional jumps, or keep the notion that everyone will surely look where they are stepping when out there :)

haz commented 2 years ago

(~5 lines) x (k SAT-based projects that allow buggy DIMACS)

🤷

Over in the planning community, we've started to rally around a central mechanism for fetching / running planning technology. I could see something similar being very useful for SAT-based tools. Such an environment could through the input checker in front of every CNF-based solver without ever having to integrate on them. Calling a SAT solver? Make sure it's first being called on well-formatted input. Want to call it regardless of the checks? Just add the -f option.

As for shartSAT, not sure if @marcthurley 's even around to see these conversations. Been ~3years since the last commit...