Add `change` option to enable new inference rules

shaunlebron commented 7 years ago

Currently, Parinfer takes as input the full text and cursor position, and outputs new text and cursor position (based on Indent Mode or Paren Mode).

All signs point to the fact that we cannot make anymore improvements without introducing a new option for describing the change that resulted in the current input text, not just the text itself.

// new options API adds `change` object
parinfer.smartMode(fullText, {
  change: {
    position: ...,
    oldText: "...",
    newText: "..."
  }
});

This will enable us to start pondering new methods of inference for:

creating a unified mode (plainly called "infer" mode)
adding inline paren inference (based on insertions and deletions, not just indentation)

cursive-ide commented 7 years ago

I have lots of thoughts on this!

This is a great move, I believe. It's what I've been planning to investigate for Cursive but have not had time to. I suspect it's required for a true hybrid mode, you need more context to figure out what the user actually did. I also believe that this will make parinfer react sensibly to more general changes - I don't think that special casing normal typing, backspace, cut & paste, line delete etc is a good approach, and I'm hopeful that this can react to changes like selecting a block of text and pasting another block of text over it in a sane way.

I also envision this allowing a parinfer mode which works locally, i.e. only over the sexps actually touched by a user change. The auto-formatting on file open is one of the biggest impediments to parinfer adoption by my users. It also means that only the affected sexps are actually required to be correctly formatted - the rest of the file could have totally inconsistent indentation and it wouldn't matter. I envision this never eagerly formatting, but if a change affects an sexp which is inconsistently formatted, parinfer would not process it (assuming it decided that it needed to run indent mode) but would show a warning to the user.

Before you go too far down this road, I'd recommend ensuring that the editors you're interested in actually support listening to document changes like this, and determining which document changes correspond to a particular user action. IntelliJ does (with some coaxing for the mapping of doc events to user actions), but it's considerably more sophisticated than many.

Re: the API, I'd recommend using what IntelliJ does for doc events:

{
  changes: [{
    offset: coordinate,
    oldLength: int,
    newLength: int,
    oldText: "...",
    newText: "..."
  },
  ...]
}

Basically, both before and after will always have the same text offset in the document, but the lengths of the before and after text will be different.

I believe you'll also need to accept an array of changes for a single conceptual user action. Imagine an action where the user wraps a form in parens - you'll get two text insertions then, one for the open paren and one for the close paren. You almost certainly don't want to run parinfer until the whole action is complete, not least because the action may have stored some state internally (i.e. worked out the text range of the form being wrapped) which it will use for the two insertions. If you modify the document while the action is processing serious weirdness will ensue. Similarly, a rename will touch potentially many points in a file (and possibly many files at once, but that's out of scope - parinfer is still per-file).

I suspect you'll find that a lot of this is very editor-dependent, and will drastically increase the complexity of integrating parinfer. Sadly, I also believe it is necessary.

Note: the following only applies if you're interested in applying parinfer partially, i.e. not to the entire document.

The difficulty with multiple changes then becomes identifying the affected sexps. I've been thinking about this, and it's actually a very hard problem. The issue is that each change creates a new "coordinate space", for want of a better term, since the change to the document length means that coordinates from the original document which are after the change location are no longer correct after the change. Each subsequent change operates in the new space created by the previous change. This makes mapping a series of changes back to the corresponding document tricky.

Essentially, the affected range for each change has to be mapped backwards through the previous changes to figure out the affected range in the original doc. Apart from being tricky, this is also inefficient (quadratic in the number of changes, which can be large in e.g. a file reformat). Once you have the ranges in the original doc, they need to be coalesced into non-overlapping ranges and then you identify the sexps which cover those ranges. Then the sexp ranges need to be mapped forward via the list of changes to get the corresponding ranges in the result document, and parinfer would then be run over those ranges.

This seems awfully complex, and I'm hoping there's an easier way to do this that I'm missing - feedback very welcome.

shaunlebron commented 7 years ago

thanks for the thoughts!

format: updated the change object format (I like it!)
editor APIs: I think that change option will only be supported for a new unified "smart" mode
- editors can continue using other modes if their API doesn't support exact change coords
- will look at APIs to see how changes are communicated
batching changes: i'll start with just a single change object for now
- i'll try to work out how to batch multiple changes, but I'm hoping it will just be a matter of applying the first change, updating the subsequent change object coordinates, then applying the next, etc.
partial changes: I'm really tempted to avoid this complexity by pointing to parlinter first, and reevaluating after seeing how many Cursive users find it unsatisfactory.

Pauan commented 7 years ago

@cursive-ide A potential solution is to use relative coordinates, rather than absolute coordinates. Atom uses relative coordinates and B+ trees to dramatically improve the performance of edits.

cursive-ide commented 7 years ago

@shaunlebron I don't think parlinter will help, often people are in a situation where they can't reformat their code (perhaps they're working in a large project with co-workers who are not using parinfer). Just the reformatting creates a fair amount of friction, but coupled with various problems like comments being dragged out of the end of lists makes it a non-starter for many.

cursive-ide commented 7 years ago

@Pauan Thank you! I'll take a look at that.

shaunlebron commented 7 years ago

added a syntax for specifying change diffs in our test cases: https://github.com/shaunlebron/parinfer/tree/master/lib/test/cases#change-diff

shaunlebron commented 7 years ago

@cursive-ide came across a solution to the problem you stated here:

The issue is that each change creates a new "coordinate space", for want of a better term, since the change to the document length means that coordinates from the original document which are after the change location are no longer correct after the change

from Atom's docs on Working with Aggregated Changes:

if you wanted to apply all the changes made in a transaction to a clone of the observed buffer, the easiest approach would be to apply the changes in reverse

shaunlebron commented 7 years ago

changes option added in v3, replacing cursorDx

parinfer / parinfer.js

Add `change` option to enable new inference rules #138