parinfer / parinfer.js

Let's simplify the way we write Lisp
https://shaunlebron.github.io/parinfer
MIT License
1.75k stars 40 forks source link

Problem with indent mode in process-buffer branch #171

Closed cursive-ide closed 1 year ago

cursive-ide commented 7 years ago

I've been working on a fork of parinfer over here. Currently parinfer works like this:

  1. Client passes entire buffer to parinfer as a string.
  2. Parinfer splits the buffer into strings, one for each line, and then processes those lines independently mutating them as required.
  3. Parinfer then joins the resulting lines together into a single string and returns that as the result of the operation to the client.

This forces the client to serialise the entire document as a string. Any non-trivial editor will not store the whole doc as a string, and forcing it to do that is inefficient and produces a lot of memory garbage. The same is true of splitting the document into lines and then joining them into a single string again. V8 is actually extremely efficient at this, but the JVM is less so. I assume those operations are less optimised because the JVM offers ways to construct strings via mutable buffers, and JavaScript does not. Because of this, the implementation in Cursive works in a different way:

  1. Cursive passes a CharSequence representing the document to parinfer - this is an abstraction allowing iteration over a sequence of characters, but does not force serialisation to a string. IntelliJ will efficiently iterate its document tree behind the scenes.
  2. The parinfer implementation iterates the whole sequence at once rather than splitting it into lines. When edits to the document are required, instead of applying them they are collected in a list.
  3. Parinfer then returns the list of edits to Cursive, and these are applied to the document using the standard document functions insertString, deleteString and replaceString. This is much more efficient, especially when the edits are small compared to the total document size, which is usually the case with parinfer.

My fork is a WIP to bring those changes into parinfer proper. These same efficiency problems also affect Emacs, and I imagine other editors as well, so this might be a good interface for the main implementation of parinfer. Doing this also avoids the need for maintaining the output cursorX since the cursor will be naturally moved around while applying the edits.

All tests now pass except for one indent mode case, which I don't have a good solution for so I'm documenting it here before Slack eats it.

It's an indent mode case: (foo}}}}|). The paren trail is correctly initialised to 4-9, which covers all the closing parens, including mismatched. This differs from the reference implementation, which used to delete unmatched closing parens as it encountered them. But clampParenTrailToCursor is now doing the wrong thing. It clamps the paren trail to 8-9, which is right, but then it scans over 4-8 and removes 4 items from parenTrail.openers, even though those are unmatched. There’s only one item in openers, since there’s only one open paren, so what happens is the output is (foo| instead of (foo|).

I’m not sure how to handle this - if the unmatched parens aren’t deleted immediately, I’m not sure how to track which are matched and which are not. I don’t think the paren trail start can be moved, since matched and unmatched might be mixed: ((foo)}). I tried using openers to figure out which were matched or not:

for (i = result.lineStart + startX; i < result.lineStart + newStartX; i++) {
  var ch = text[i];
  if (isCloseParen(ch) && MATCH_PAREN[peek(openers, removeCount).ch] === ch) {
    removeCount++;
  }
}

But that broke some other clamping cases. I'm documenting this here since this is the one case I don't have a good solution for with the new model - all other tests pass, and it seems to work well in the demo editor.

It's worth noting that this isn't a problem if parinfer just bails on encountering an unbalanced close paren and allows the editor to place an error mark there - in that case, this is a non-issue.

shaunlebron commented 1 year ago

Hi Colin 👋 I’m cleaning up issues here. Please re-open if this is still relevant, and I can take a look again :)