Individual line comments disrupt interactive mode

parno commented 6 years ago

The code below has a comment in the midst of a match statement. From the command line, F is perfectly happy. In interactive mode, F reports an incomplete match. Interestingly, the issue disappears if I indent the comment.

In case it's useful, I've included the source file IncompleteSmall.txt and an IDE transcript test.txt (with renamed file extensions -- thanks GitHub).

module IncompleteSmall

type ins =
  | A      : a:int -> ins
  | B      : b:int -> ins

let eval_ins (ins:ins) : int =
  match ins with
  | A a -> 0

// Useful comment here

  | B b -> 1

cpitclaudel commented 6 years ago

Thanks. That's a known issue (design decision? ^^)

It boils down to segmenting F code into individual definitions. The question is: how do you tell what's a unit of F code? The CLI gets to run the F* parser to determine fragment boundaries, but we don't have that luxury in the interactive mode, so the current approach looks for sigelt starters in column 0, or comments, preceded by empty lines (which is why indenting the comment or removing the blank line before it solves the issue)

One solution could be to stop recognizing comments in column zero as block starters, but then you run into this issue:

// Some witty comment here
let a = 1

Now // isn't a block starter anymore, but neither is let a = 1. We can of course start recognizing all lets in column 0 as definition starters, but now we have another issue:

let x =
let y = 1 in
x + x

… this gets split as let x =, and then a separate fragment starting at let y =.

All of this isn't ideal. I've added a segment command to the IDE mode which takes a fragment of code and cuts it up into fragments — I'd love to use that, but the problem is that often when you do a push F is busy, so you can't call it to ask it to segment your file. You could have a second instance of F running just for parsing purposes, but that seems heavy-handed.

Note though that you can use C-u C-c C-RET to process up to the current point as one unit.

Let me know if you have ideas to make this better.

parno commented 6 years ago

Just to make sure I'm following you, it sounds like you're attempting to do a lightweight parse of F* code in order to segment it into definitions. The current heuristics for parsing are fooled by a column 0 comment.

If that's correct, then it sounds like you can either continue to tweak the heuristics or integrate the F parser in some form (perhaps someday we could extract a version of F that only includes the parser?). The former seems like it will always be subject to some mismatch with F*'s version, so maybe the best thing you can do is document the heuristics somewhere and leave it at that.

cpitclaudel commented 6 years ago

Just to make sure I'm following you, it sounds like you're attempting to do a lightweight parse of F* code in order to segment it into definitions

Correct

The current heuristics for parsing are fooled by a column 0 comment.

Yes, when preceded by a blank line

or integrate the F parser in some form (perhaps someday we could extract a version of F that only includes the parser?).

I'd need an Elisp version, though, not OCaml

The former seems like it will always be subject to some mismatch with F*'s version

Ah, good point, yup.

maybe the best thing you can do is document the heuristics somewhere and leave it at that.

OK, will do.

parno commented 6 years ago

I'd need an Elisp version, though, not OCaml

F* already has two backends -- how hard could a third be? :)

cpitclaudel commented 6 years ago

F* already has two backends -- how hard could a third be? :)

Right, and since ELisp has neat and clean semantics, translating will be a breeze.

parno commented 6 years ago

Assuming an ELisp backend might take a few days, should we close this issue out for now?

theolaurent commented 6 years ago

The case of mutually recursive definitions can be quite annoying, and puzzling to the unaware user (just ran into it last week and was planning to open an issue).

// some
let rec foo () : ML unit = bar ()

// thing
and bar () : ML unit = foo ()

BTW, couldn't you just consider lines with only comments equivalent to empty ones?

cpitclaudel commented 6 years ago

The case of mutually recursive definitions can be quite annoying

An easy solution is to write it like this:

// some
let rec foo () : ML unit = bar ()
// 
// thing
and bar () : ML unit = foo ()

or like this:

// some
let rec foo () : ML unit = bar () 
// thing
and bar () : ML unit = foo ()

or like this:

let rec foo () : ML unit =
  // some
  bar () 
and bar () : ML unit = 
  // thing
  foo ()

Having to use workarounds isn't great, of course.

and puzzling to the unaware user

Yup, I think that's the main problem :/

BTW, couldn't you just consider lines with only comments equivalent to empty ones?

It's not that easy :/ Comments before a definition are part of the same logical block as that definition, so the start of a block really is the comment, not the let.

It's always possible to improve the heuristic, of course… can you try adding the following to your .emacs? (@parno, it would be nice if you could give it a try too)

(require 'fstar-mode)

(defun fstar-subp--likely-block-start-p ()
  "Check whether the current match looks like a block start."
  (and
   ;; Skip block starters in comments
   (not (fstar-in-comment-p (match-beginning 1)))
   ;; Skip false positives introduced by considering ‘^//’ a block starter
   (save-excursion
     (goto-char (match-beginning 0))
     (while (forward-comment 1))
     (skip-chars-backward fstar--spaces)
     (looking-at-p fstar-subp-block-sep))))

(defun fstar-subp--block-start-p ()
  "Check whether the current match is a valid block start."
  (and (not (= (match-beginning 1) (point-max)))
       (fstar-subp--likely-block-start-p)))

(defun fstar-subp--block-end-p ()
  "Check whether the current match is a valid block end."
  (and (not (= (match-beginning 0) (point-min)))
       (fstar-subp--likely-block-start-p)))

cpitclaudel commented 6 years ago

Ping. I'd like to push the new heuristic, but having a regular user test it first would be helpful

parno commented 6 years ago

Hi Clement,

I've been trying out the patch above. The good news is that it correctly handles the small example from my original post, as well as the larger version from which it originated. I just bumped into an example though where it's a bit too generous. In the code below, if the cursor is inside lemma_expand_key_256 and I try 'Ctrl-C Ctrl-N', it jumps to the end of the entire code snippet, not just lemma_expand_key_256. Without the patch, it does the expected thing of only going to the end of lemma_expand_key_256. Sorry it's not a very minimal example, but I thought you might be able to see the issue just based on the pattern of comments. If not, I can try to minimize it.

#reset-options "--z3rlimit 10"
// quad32 key expansion is equivalent to nat32 key expansion
let rec lemma_expand_key_256 (key:aes_key_LE AES_256) (size:nat) : Lemma
  (requires size <= 15)
  (ensures (
    let s = key_schedule_to_round_keys size (expand_key AES_256 key 60) in
    (forall (i:nat).{:pattern (expand_key_256 key i)} i < size ==> expand_key_256 key i == s.[i])
  ))
  =
  lemma_expand_append key (4 * size) 60;
  // assert (equal (expand_key AES_256 key (4 * size)) (slice (expand_key AES_256 key 60) 0 (4 * size)));
  if size = 0 then ()
  else
  (
    let i = size - 1 in
    lemma_expand_append key (4 * i) 60;
    lemma_expand_key_256 key i;  // Prove the quantifier above for i < size - 1
    // Prove for i == size - 1
    (if i = 0 || i = 1 then (lemma_expand_key_256_0_corollary key)
     else (
       lemma_expand_key_256_i key i; // ==> 
       // round_key_256 prev0 prev1 i ~= (expand_key AES_256 key (4*i + 4)).[4*i, 4*i + 1, ... 4*i + 3]
       //                             ~= (expand_key AES_256 key (4*i + 4)).[4*size - 4 ... 4*size - 1]
       //                             ~= (expand_key AES_256 key  (4*size)).[4*size - 4 ... 4*size - 1]

       // Definition of expand_key_256:
       // expand_key_256 key i ==
       //    round_key_256 (expand_key_256 key (i - 2)) (expand_key_256 key (i - 1)) i

       let s = key_schedule_to_round_keys size (expand_key AES_256 key 60) in
       // s = let round_keys = key_schedule_to_round_keys (size - 1) w in
       //     let rk = Mkfour w.[4 * size - 4] w.[4 * size - 3] w.[4 * size - 2] w.[4 * size - 1] in
       //     append round_keys (create 1 rk)
       // s.[i] == s.[size - 1] 
       //       ~=  Mkfour w.[4 * size - 4] w.[4 * size - 3] w.[4 * size - 2] w.[4 * size - 1]

       // Following is from: lemma_expand_append key (4 * i) 60;
       //assert (equal (expand_key AES_256 key (4 * i)) (slice (expand_key AES_256 key 60) 0 (4 * i)));

       let w = expand_key AES_256 key 60 in
       assert (w.[4 * size - 4] == (expand_key AES_256 key (4*size)).[4 * size - 4]);
       assert (w.[4 * size - 3] == (expand_key AES_256 key (4*size)).[4 * size - 3]);
       assert (w.[4 * size - 2] == (expand_key AES_256 key (4*size)).[4 * size - 2]);
       assert (w.[4 * size - 1] == (expand_key AES_256 key (4*size)).[4 * size - 1]);

       admit();
       ()
     )
    )     
  )
#reset-options

// Refine round_key_256 to a SIMD computation
let simd_round_key_256 (prev:quad32) (rcon:nat32) : quad32 =
  let r = rot_word_LE (sub_word prev.hi3) *^ rcon in
  let q = prev in
  let q = q *^^ quad32_shl32 q in
  let q = q *^^ quad32_shl32 q in
  let q = q *^^ quad32_shl32 q in
  q *^^ Mkfour r r r r

// SIMD version of round_key_256 is equivalent to scalar round_key_256
let lemma_simd_round_key (prev:quad32) (rcon:nat32) : Lemma
  (True) // (simd_round_key_256 prev rcon == round_key_256_rcon prev rcon)
  =
  commute_rot_word_sub_word prev.hi3;
  Types_i.xor_lemmas ()

cpitclaudel commented 6 years ago

Thanks a lot. Here's an updated patch:

(require 'fstar-mode)

(defconst fstar-subp-block-start-re
  (format "\\`\\|%s\\(?:\\'\\|^\\(?:%s\\)\\)" fstar-subp--blanks-re fstar-syntax-block-start-re))

(defun fstar-subp--likely-block-start-p ()
  "Check whether the current match looks like a block start."
  (and
   ;; Skip block starters in comments
   (not (fstar-in-comment-p (match-beginning 1)))
   ;; Skip false positives introduced by considering ‘^//’ a block starter
   (save-excursion
     (goto-char (match-beginning 0))
     (while (forward-comment 1))
     (skip-chars-backward fstar--spaces)
     (looking-at-p fstar-subp-block-start-re))))

(defun fstar-subp--block-start-p ()
  "Check whether the current match is a valid block start."
  (and (not (= (match-beginning 1) (point-max)))
       (fstar-subp--likely-block-start-p)))

(defun fstar-subp--block-end-p ()
  "Check whether the current match is a valid block end."
  (and (not (= (match-beginning 0) (point-min)))
       (fstar-subp--likely-block-start-p)))

parno commented 6 years ago

I tried this out on the file with the complicated example above, and it now behaves as expected. I've done a small amount of other F verification work after applying the patch, and I haven't seen any other unexpected behavior yet, but it's a fairly small sample. (Sorry for the delayed response -- I've been traveling and working on Vale stuff instead of F.)

cpitclaudel commented 6 years ago

Thanks for confirming!

FStarLang / fstar-mode.el

Individual line comments disrupt interactive mode #88