mthom / scryer-prolog

A modern Prolog implementation written mostly in Rust.
BSD 3-Clause "New" or "Revised" License
2.05k stars 121 forks source link

Non-conforming option processing #2015

Closed triska closed 1 year ago

triska commented 1 year ago

I am filing this issue for a discussion I noticed, so that we have a dedicated issue for this problem which is currently mentioned in comments to the commit that introduced a regression in option processing:

https://github.com/mthom/scryer-prolog/commit/42a50474daae81bc396f626faade2b77238417d3

Getting option processing correct is hard. For automated reasoning about programs, it is essential to follow the logic of errors prescribed by the Prolog standard.

We need standard-conforming option processing, also to set a good example for other systems to follow and test against.

pmoura commented 1 year ago

We also need to continue to improve the Prolog standard... before it becomes, in its present form and process, irrelevant and replaced by a better alternative, which is increasingly being discussed among several Prolog implementers.

UWN commented 1 year ago

We also need to continue to improve the Prolog standard

Like SWI?

There are much more important things to look for, think of clpz.

Non-conformity just means that lots of effort is wasted.

The current modification makes Scryer just irregular.

?- read_term(T,[variables([A|As])]).
   error(domain_error(read_option,variables([_41|_42])),read_term/3), unexpected.
?- read_term(T,[variables([A])]).
   waits.

A domain error cannot be issued, when an instance thereof works.

pmoura commented 1 year ago

The current modification makes Scryer just irregular.

?- read_term(T,[variables([A|As])]).
   error(domain_error(read_option,variables([_41|_42])),read_term/3), unexpected.
?- read_term(T,[variables([A])]).
   waits.

A domain error cannot be issued, when an instance thereof works.

Throwing a domain error when a read option argument is bound at call time also results in consistent (and safer) behavior. It could be argued that an uninstantiation error would be more appropriated. But that's a secondary point.

triska commented 1 year ago

A domain error is inappropriate in this case, because an instance of the value is within the domain of the option.

Quoting from the standard:

  c) There shall be a Domain Error when the type of an
  argument is correct but the value is outside the domain
  for which the procedure is defined. 
pmoura commented 1 year ago

Not sure why you’re quoting the standard. Current Scryer behavior is inconsistent as @UWN already illustrated. My proposal (see the link you posted) is to require the options argument to be unbound at call time. In this case, there isn’t any instance of the argument that’s valid.

pmoura commented 1 year ago

Btw, the current Logtalk distribution tests for read options are:

test(lgt_read_term_3_32, error(domain_error(read_option,variables(a)))) :-
    ^^set_text_input('a. '),
    {read_term(_, [variables(a)])}.

test(lgt_read_term_3_33, error(domain_error(read_option,variables([_|a])))) :-
    ^^set_text_input('a. '),
    {read_term(_, [variables([_|a])])}.

test(lgt_read_term_3_34, error(domain_error(read_option,variable_names(a)))) :-
    ^^set_text_input('a. '),
    {read_term(_, [variable_names(a)])}.

test(lgt_read_term_3_35, error(domain_error(read_option,variable_names([_|a])))) :-
    ^^set_text_input('a. '),
    {read_term(_, [variable_names([_|a])])}.

test(lgt_read_term_3_36, error(domain_error(read_option,singletons(a)))) :-
    ^^set_text_input('a. '),
    {read_term(_, [singletons(a)])}.

test(lgt_read_term_3_37, error(domain_error(read_option,singletons([_|a])))) :-
    ^^set_text_input('a. '),
    {read_term(_, [singletons([_|a])])}.

Notice that, in all these tests, the argument of the read options is neither a partial list or a list. A type error could be best here but 8.14.1.3 only specifies a domain error for non-valid options. In all these cases, if the read_term/2-3 predicates implementation doesn't type-checking the options before reading a term, the calls will always fail as the option arguments don't unify with a list (7.10.3). There isn't any usefulness of this behavior; only programming errors.

UWN commented 1 year ago

In all these cases, if the read_term/2-3 predicates implementation doesn't type-checking the options before reading a term, the calls will always fail as the option arguments don't unify with a list (7.10.3). There isn't any usefulness of this behavior; only programming errors.

Not failing in these cases is non-conforming behavior. So you are promoting here non-conforming behavior. In the past all systems conformed here. Now, you suggest some ad hoc modification. It is ad hoc because there is neither a concrete specification (except some test cases) nor a concrete implementation. And you even alluded some different, "simple" behavior.

It means also that a programmer would have to check for that new exception in existing programs. You would consider such programs buggy, but one purpose of standardized behavior is that you can still run existing programs, not only new ones.

Another area where you deny its usefulness is (automated) error diagnosis.

And, apparently, you missed some cases, like

?- read_term(T,[singletons([A,A])]).
a.
   false.

So far at least in that case, all systems agree.

I still hope that Scryer will resort to conforming behavior.

pmoura commented 1 year ago

In all these cases, if the read_term/2-3 predicates implementation doesn't type-checking the options before reading a term, the calls will always fail as the option arguments don't unify with a list (7.10.3). There isn't any usefulness of this behavior; only programming errors.

Not failing in these cases is non-conforming behavior. So you are promoting here non-conforming behavior. In the past all systems conformed here. Now, you suggest some ad hoc modification. It is ad hoc because there is neither a concrete specification (except some test cases) nor a concrete implementation. And you even alluded some different, "simple" behavior.

Tests are not specifications. Those that I posted simply highlight the issue. The simple and safer behavior I'm "promoting" and that I already mentioned twice is to to require unbound read options argument when calling the read_term/2-3 predicates and throw an exception otherwise. That would flag those and other programming errors instead of masking them as failures.

It means also that a programmer would have to check for that new exception in existing programs. You would consider such programs buggy, but one purpose of standardized behavior is that you can still run existing programs, not only new ones.

You mean (and this is just a single example) like the ISO Prolog Core standard changing the meaning of the atom_chars/2 (and number_chars/2) predicates and breaking every existing program that used those predicates?

The specification change I'm proposing it's unlikely to break anything than buggy programs. This comes both from my experience and the experience of the Prolog maintainers with whom I discussed this topic.

Another area where you deny its usefulness is (automated) error diagnosis.

So, a specification change that improves (automatic) detection of programming errors denies the usefulness of (automated) error diagnosis?

And, apparently, you missed some cases, like

?- read_term(T,[singletons([A,A])]).
a.
   false.

No, I didn't missed it. Testing 101: tests can only expose bugs, not prove that they don't exist. But, to the point, the proposed specification change, repeated for the third time above, would result in an error in this case given the bound read option argument.

So far at least in that case, all systems agree.

I still hope that Scryer will resort to conforming behavior.

I still hope that standards can evolve to make Prolog a better language.

UWN commented 1 year ago

Why are your promoting here test cases for domain errors when as you said (and I noted above as well, see my remark you even alluded some different, "simple" behavior) you want again another behavior. So how often do you want that implementations change? That is just contributing to even more instability.

You mean (and this is just a single example) like the ISO Prolog Core standard changing the meaning of the atom_chars/2 (and number_chars/2) predicates and breaking every existing program that used those predicates?

Since we have here lots of readers who are unaware of the history of Prolog, here are the changes to number_chars/2 you refer to. There was not a single system conforming to the state prior to Cor.2. Not one. Now, there are many.

So, a specification change that improves (automatic) detection of programming errors denies the usefulness of (automated) error diagnosis?

A specification change means that automatic detection will have to take into account different behavior on different systems. And here, we had uniform behavior. Recall the wakeup call of Roberto Bagnara in 2000 who was working on program analysis for Prolog, but realized how different systems are which effectively made his work ineffective. Things did improve since then. And I had quite a lot of hope that the new systems are better than the existing ones.

pmoura commented 1 year ago

Why are your promoting here test cases for domain errors when as you said (and I noted above as well, see my remark you even alluded some different, "simple" behavior) you want again another behavior. So how often do you want that implementations change? That is just contributing to even more instability.

You mean (and this is just a single example) like the ISO Prolog Core standard changing the meaning of the atom_chars/2 (and number_chars/2) predicates and breaking every existing program that used those predicates?

Since we have here lots of readers who are unaware of the history of Prolog, here are the changes to number_chars/2 you refer to. There was not a single system conforming to the state prior to Cor.2. Not one. Now, there are many.

The breaking change in the atom_chars/2 (and number_chars/2) predicates in the 1995 first edition of the standard was, as you know it, changing the meaning of char from being a character code to be a single character atom. Since we have here lots of readers who are unaware of the history of Prolog. Your link is about (much later) changes only to the number_chars/2 predicate.

Not criticizing that change (to the meaning of char) per se, just pointing out that in the past ISO made much more significant breaking changes.

So, a specification change that improves (automatic) detection of programming errors denies the usefulness of (automated) error diagnosis?

A specification change means that automatic detection will have to take into account different behavior on different systems.

Not much different in scope and practical consequences from e.g. the fix applied to the specification of the catch/3 control construct years ago (post 1995 edition).

And here, we had uniform behavior. Recall the wakeup call of Roberto Bagnara in 2000 who was working on program analysis for Prolog, but realized how different systems are which effectively made his work ineffective.

Roberto was not the only one doing wakeup calls at the time. In fact, I did in the same before him in the same venue:

@article{pmoura99, author = "Paulo Moura", title = "{Porting Prolog: Notes on porting a Prolog program to 22 Prolog compilers or the relevance of the ISO Prolog standard}", journal = "Association of Logic Programming Newsletter", volume = 12, number = 2, month = may, year = 1999 }

Patricia Hill, the ALP Newsletter editor at the time, forwarded a copy of my article to Roberto (which was cited by him in his article; your link is not the final version of the article). Pity that in that period the newsletters were not archived. There was a followup later by me, however:

https://dtai.cs.kuleuven.be/projects/ALP/newsletter/aug05/index.html

Things did improve since then. And I had quite a lot of hope that the new systems are better than the existing ones.

I know and I share the sentiment (I'm part of a team implementing a new system). I made and continue to make significant contributions for that to happen. This technical discussion aims to just that: point a (yet another) aspect where systems can do better.

triska commented 1 year ago

Thank you a lot!