SWI-Prolog / swipl-devel

SWI-Prolog Main development repository
http://www.swi-prolog.org
Other
960 stars 176 forks source link

Stack overflow in term_factorized/2 #1162

Open Jean-Luc-Picard-2021 opened 1 year ago

Jean-Luc-Picard-2021 commented 1 year ago

This is related to this ticket: https://github.com/SWI-Prolog/swipl-devel/issues/1159

An offending test case is:

/* SWI-Prolog 9.1.7 */
?- T = s(s(T, S), _), S = s(S,T), term_factorized(T, _, _).
ERROR: Stack limit (1.0Gb) exceeded

I feel a separate ticket is justified in that a solution to term_factorized/3 might be to circumvent the use of compare/3, as long as there is no compare/3 that satisfies laws necessary to make it work as a basis for rbtrees,

JanWielemaker commented 1 year ago

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/history-dependent-semi-lex-compare-3/6431/59

Jean-Luc-Picard-2021 commented 1 year ago

A solution that doesn't use rbtrees is this here, replicating the basic idea of the term_factorized/3 algorithm:

commons.p.log

JanWielemaker commented 1 year ago

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/fixing-the-compare-3-mirror-anomaly-in-swi-prolog/6438/4

JanWielemaker commented 1 year ago

So far I've not seen a workable solution. Giving up on algorithms that exploit the total ordering of Prolog terms is not really acceptable. Seems that other systems have a cyclic compare that is a little harder to trick. The fundamental problem doesn't go away though.

I don't know the actual problem you are facing. Possibly '$factorize_term'/3 with the same signature can help. It won't loop. It makes explicit sharing and cycles in the current representation of the term visible, i.e., it doesn't factorize based on ==/2, but on finding the very same term instance in memory.

Using a hash table rather than rbtrees comes to mind as well, but the current term hash algorithms give different hashes for e.g., X = f(X) and X = f(f(X)).

JanWielemaker commented 1 year ago

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/cheap-compare-for-cyclic-terms-injective-collation-keys/6427/5

Jean-Luc-Picard-2021 commented 1 year ago

Giving up on algorithms that exploit the total ordering of Prolog terms is not really acceptable.

Maybe people are not aware, that @ridgeworks proposal of representation based comparison is a total order. It only violates lexical ordering, but total order laws will obviously not be violated.

What are total order laws (not violated by all terms):

What is lexical ordering (violated by some cyclic terms):

JanWielemaker commented 1 year ago

I've seen that. I've not followed all details. I have the impression that it still takes quite some effort to turn this into something practical that can compete with compare/3 for performance and be used to replace compare/3 :cry: . If you volunteer to create an efficient implementation in the core system, you are more than welcome. As is, I do not have time for that ...

ridgeworks commented 1 year ago

I have the impression that it still takes quite some effort to turn this into something practical that can compete with compare/3 for performance and be used to replace compare/3 😢 .

Just my perspective:

  1. The current implementation of compare/3is broken with respect to cyclic terms, violating the properties of transitivity and anti-symmetry as discussed. This infects other things like sort and library(rbtrees). I may be wrong but I doubt this has much practical significance. Nevertheless it should at least be documented, as Sicstus appears to do, but the doc changes depend on what, if anything, else is done.
  2. Ideally something should be done to fix system:compare but it's difficult to justify the effort required. If it doesn't adequately support cyclic terms, maybe it should generate an error when it encounters one. (Identity does currently seem to work.)
  3. There appears to be an implementation dependent, i.e., local solution, implemented in Prolog which should satisfy any requirements for this functionality (except perhaps raw performance) should anybody need it. But it wasn't obvious and it took a group effort to sort it out so it would be a shame to lose it. My initial thought was to add a "correct" version of compare/3 (name?) to library(terms) but it's just a thought. It is a fairly thin Prolog wrapper around system:compare so not going to be performance competitive, but probably adequate for most purposes, e.g., with predsort. YMMV.

So I'm not sure what action makes sense given current priorities and skillsets but doing nothing is the worst option (IMO). (I'm happy to contribute but tackling 2. above isn't something I'm comfortable doing.)

JanWielemaker commented 1 year ago

Just my perspective:

I agree. (2) is the way to go, but we need someone with time to do it. Documenting is surely a good step, including a link to the relevant Discourse discussion. Probably we should add the current Prolog code as comment to the C implementation. I'm not sure how useful adding it to the library is. That is also adding confusion and is hopefully just a temporary solution. And yes, ==/2 is (AFAIK) fine.

JanWielemaker commented 1 year ago

P.s. I'm a bit lost in the discussions about correct implementations. Can anyone add links to the key results here?

ridgeworks commented 1 year ago

P.s. I'm a bit lost in the discussions about correct implementations. Can anyone add links to the key results here?

Not surprised, it was a circuitous route to this point. From https://swi-prolog.discourse.group/t/cheap-compare-for-cyclic-terms-injective-collation-keys/6427/6

A non-mathematical perspective: Representation based comparison requires the definition of a mapping predicate from any term to an acyclic equivalent such that the total order of the mapped terms is the same as the order of the original terms. These mapped terms can then be compared using standard ordering, i.e., compare/3 to define the order of the original terms.

The success of this approach depends on the definition of the mapping function. It has to ensure that the standard order of acyclic terms is respected. It also has to address the issues of transitivity and anti-symmetry with cyclic terms that cause issues with the current compare/3. @j4n_bur53’s naish/2 seems to nicely meet both of these requirements ...

You can find the source for naish/2 in rep3.p: https://swi-prolog.discourse.group/t/cheap-compare-for-cyclic-terms-injective-collation-keys/6427/5 It also contains a "correct" alternative to compare/3 named rep_compare. To the best of my knowledge, this represents the best solution to date, i.e., no failures of transitivity or anti-symmetry have been discovered.

Also see https://swi-prolog.discourse.group/t/history-dependent-semi-lex-compare-3/6431/62 for a strategy for implementing naish/2 in C based on copy_term.

While an incorrect compare/3 undoubtedly affects the use of rbtrees with cyclic terms, there may be other issues.

JanWielemaker commented 1 year ago

Thanks. Let me also add a link to Mats Carlson's original findings: