ProofGeneral / PG

This repo is the new home of Proof General
https://proofgeneral.github.io
GNU General Public License v3.0
492 stars 89 forks source link

bug: newlines misparsed as space in string litterals #773

Closed erikmd closed 2 months ago

erikmd commented 5 months ago

Reported by @aa755 on Coq-Club:

When I enter the following in a new file, I get:

Require Import Coq.Strings.String.

Definition foo: string := " a ". Set Printing All. Compute foo.

(*
= String (Ascii.Ascii false false false false false true false false)
     (String (Ascii.Ascii true false false false false true true false)
        (String (Ascii.Ascii false false false false false true false false) EmptyString))
: string
*)

The last character of foo is a newline but it is parsed as a space (ascii 32 (0x20)). However, when I compile it using coqc, or interactive run it in jscoq (https://coq.vercel.app/), I get the correct answer:

= String
  (Ascii.Ascii false false false false false true false false)
  (String
  (Ascii.Ascii true false false false false true true false)
  (String
  (Ascii.Ascii false true false true false false false false)
  EmptyString))
: string

-- Abhishek

erikmd commented 5 months ago

FYI the culprit is this line:

https://github.com/ProofGeneral/PG/blob/cb23709ad0c9a9ca0ee48b3ee73c29caea243b98/generic/proof-shell.el#L968-L970

the upside is that this behavior is customizable.

FYI, without changing PG, the following minimal-working-example fixes the issue:

(*
 *  M-: (setq proof-shell-strip-crs-from-input nil) RET
 *)

Definition foo': string :=
" a
".
Set Printing All.
Compute foo'.
     (* = String (Ascii.Ascii false false false false false true false false) *)
     (*     (String (Ascii.Ascii true false false false false true true false) *)
     (*        (String (Ascii.Ascii false true false true false false false false) EmptyString)) *)
     (* : string *)

So as a proper fix @aa755, you can just run:

M-: (customize-save-variable 'proof-shell-strip-crs-from-input nil) RET

to automatically save this setting in your init file.

erikmd commented 5 months ago

Cc @ProofGeneral/core I see that the custom proof-shell-strip-crs-from-input is not documented in the userman:

https://proofgeneral.github.io/doc/master/userman/Variable-Index/#Variable-Index

I also see that easycrypt sets this custom to nil by default:

https://github.com/ProofGeneral/PG/blob/cb23709ad0c9a9ca0ee48b3ee73c29caea243b98/easycrypt/easycrypt.el#L143

So two questions:

Q1. should we just change the default to nil for Coq as well? and look for possible issues in the CI and/or from users? (in which case I'd suggest that we don't close this but pin the issue to be more visible for some time), but let me know if you already see (or remember) potential issues with such a change.

Q2. should we document this custom in PG's userman? saying e.g. "nil" is recommended for String-related formalizations. IMHO I'd suggest to address Q2 (documentation) especially if the answer to Q1 is no.

Other thoughts?

Matafou commented 5 months ago

This setting dates back the times when coqtop would duplicate its prompt after each newline input ad de-synch PG. I don't know exactly when this behaviour disappeared from the coqtop -emacs but it seems quite ancient. So probably worth a try indeed imho.

erikmd commented 5 months ago

OK thanks @Matafou !

FYI I did the following test:

Running docker run --rm -it coqorg/coq:8.4 coqtop -emacs

to download and run the oldest coq prebuilt in Docker-Coq, namely Coq 8.4pl6 as of April 2015.

Then proving a trivial lemma while inserting spurious newlines:

Goal
True.
Proof
.
exact
I.

And I didn't get any duplicate <prompt>.

Does this check looks enough to you?

Kind regards

Matafou commented 5 months ago

Yes. Let us test what the ci says about that.

erikmd commented 5 months ago

FYI: @vzaliva just reported a regression on Zulip that seems to indicate that my bugfix #774 was incomplete. Namely, regarding error reporting for commands that don't fit in a single line.

@ProofGeneral/core I'll try to take a closer look on Monday, but feel free to investigate meanwhile if you can/wish.

Matafou commented 4 months ago

I can't have look now but it is strange: replacing each \n by a space should not change the computation of (char) positions of errors...

monnier commented 4 months ago

I can't have look now but it is strange: replacing each \n by a space should not change the computation of (char) positions of errors...

Could it be some CRLF conversion biting us?

Matafou commented 4 months ago

Had a quick look. Here are the elements:

<prompt>Coq < 12 || 0 < </prompt>Eval compute in
  x * trois * trois.
Toplevel input, characters 18-19:
>   x * trois * trois.
>   ^
Error: The reference x was not found in the current environment.

We need to delete the hackish legacy code for error location (which was using the number of spaces preceding the "^^^" and use the the "characters x-y:" information instead. Which is better anyways. I will try something.

Matafou commented 2 months ago

The error location is now parsed as said above (PR #781) we can close this issue again I guess.