Text Reflowing/Justification

oldaccountdeadname commented 3 years ago

This fork introduces 'justification' functionality, where a line may be reformatted from

very long paragraph with no linebreaks, exceeding the eighty character line limit of most everywhere

to

very long paragraph with no linebreaks, exceeding the eighty character line
limit of most everywhere

The command will justify around comments, for instance

# very long paragraph with no linebreaks, exceeding the eighty character line limit of most everywhere

to

# very long paragraph with no linebreaks, exceeding the eighty character line
# limit of most everywhere

Line length is determined by the line_length_guide property in the preferences, or a hard-coded value of 80 if the guide is not present. Characters that count as a comment (referred to as a prefix in the justify_text function) are determined by a hard-coded regex, as the line_comment_prefix option is limited to only one style (and thus would break on, for instance, rustdoc comments).

This addresses #219.

oldaccountdeadname commented 3 years ago

I'm in agreement with @christoph-heiss's feedback, and I've added some of my own as well. That said, this is a feature I didn't know I wanted until this PR; I'd love to get this merged if you have the bandwidth to revisit these suggestions! grin

Yes - thanks so much for the feedback! I'm looking back at the code I wrote a few months ago, and just started to rewrite it because, as I'm sure you've noticed, it's pretty bad. I've got a justify function that works properly on strings, but I'm running into some issues when I try to get the contents of a selection.

I wrote a sel_to_range function that takes an &mut Application and attempts to get the selection Range, bail!ing if something goes wrong. However, buffer.read() returns None, implying that I'm getting an improper range. It looks like this:

fn sel_to_range(app: &mut Application) -> std::result::Result<Range, Error> {
    let buf = match app.workspace.current_buffer() {
    Some(b) => b,
    None => bail!(BUFFER_MISSING),
    };

    match app.mode {
    Mode::Select(ref sel) => Ok(Range::new(buf.cursor.position, sel.anchor)),
    Mode::SelectLine(ref sel) =>  Ok(sel.to_range(&*buf.cursor)),
    _ => bail!("Can't access a selection outside of select mode.")
    }
}

Then, when I call that with the below, there's a panic due to read(&sel) returning None.

let sel = sel_to_range(app)?;
let buf = app.workspace.current_buffer().unwrap(); // above call guaruntees existence of buffer
let txt = buf.read(&sel).unwrap();

If it's not too much trouble, do you think you'd be able to tell me what's going on here? I'd assume that sel_to_range is returning bad ranges, but I'm not quite sure how it's doing that.

In addition, I'm not quite sure how this should get at prefixes. There are two types of prefixes I'm identifying:

'stable' prefixes: these stay the same on every line, i.e., #, //, and ///.
'variable' prefixes: these are not necessarily the same throughout a paragraph, i.e., /* */ comments, where each line after the first begins with a * instead of a /*.

I believe that supporting just the first would require some modifications to the configuration API, as the line_comment_prefix only allows for one token, and would therefore disallow multiple tokens for a file, such as rustdoc comments.

Supporting the second would be tricky too, as the pattern parameter would have to have state. In a vacuum, I'd go about that by defining a trait Prefix requiring matches(impl AsRef<str>) -> bool and written_prefix() -> String function so that an object can be constructed to match a prefix and say what it should look like at the beginning of a line at any given time, but I'm not at all sure how to reconcile that with Amp's line_comment_prefix configuration API. Thoughts?

jmacdonald commented 3 years ago

If it's not too much trouble, do you think you'd be able to tell me what's going on here? I'd assume that sel_to_range is returning bad ranges, but I'm not quite sure how it's doing that.

That logic mirrors what's used for the commands::selection::delete method, which works fine. The range value depends on the text in the buffer, the select mode you're in, and the range you have selected. To start, I would suggest you debug the returned range so you can see if it matches your expectations. A simple solution would be to print the range to stderr, redirect stderr to a file when running amp, and then tailing that file. Then you can see the range values while playing with the selection in the app; you might have stumbled onto an edge case with that logic.

I believe that supporting just the first would require some modifications to the configuration API, as the line_comment_prefix only allows for one token, and would therefore disallow multiple tokens for a file, such as rustdoc comments.

What if we opted for a simple heuristic to start? Something like "if there is a leading symbol/non-alphanumeric character, use that as the prefix". That won't cover all use cases, but if it gets us 90% without needing to use lexical scoping rules and syntax definitions to determine the comment prefix, I think that could be sufficient. That also wouldn't support variable prefixes, but FWIW, I don't use those often enough to see that as a major shortcoming, personally.

oldaccountdeadname commented 3 years ago

That logic mirrors what's used for the commands::selection::delete method, which works fine.

Not sure what I was doing incorrectly, but I extracted the code from delete and all the tests pass.

What if we opted for a simple heuristic to start? Something like "if there is a leading symbol/non-alphanumeric character, use that as the prefix". That won't cover all use cases, but if it gets us 90% without needing to use lexical scoping rules and syntax definitions to determine the comment prefix, I think that could be sufficient.

Yep, that seems like a good solution. I can't really think of any alphanumeric prefixes that would break this, so I'll start implementing that!

That also wouldn't support variable prefixes, but FWIW, I don't use those often enough to see that as a major shortcoming, personally.

I personally use /* * */ style comments pretty frequently, and would want to explicitly support them. I agree that the closure solution is probably an over-complication, though. What would you think of a hard-coded dictionary of first line prefix: (middle line prefix, last line prefix) to handle these things?

For example, to support some of the more common comment styles, we could have:

{
        "/*":     (" *", "\n */"), // c-style multiline
        r#"""""#: ("", r#"""""#),  // python-style multiline
        "#+BEGIN_COMMENT": ("", "\n#+END_COMMENT"), // org-mode multiline
        ...
}

I'd imagine it could be pretty small (and thus hard-coded), and, if it's ever too small, woud be able to be refactored into the main config file without too much effort. How's this?

The code I've got going for this version is available on gitlab just so that I don't overwrite the previous fork yet.

-- lincoln auster [they/them]

jmacdonald commented 3 years ago

I think I'd prefer to table multi-line comment support for now. I could only find one reflow plug-in for Sublime Text, and it too doesn't support multi-line comments.

The complexity that this functionality would add to Amp wouldn't be worth it, in my opinion. We'd be forced to add another per-type configuration with a complex syntax that users are unlikely to configure (or that we are forced to maintain). Working with multi-line comments is still possible, provided the opening and closing symbols are put on dedicated lines (making the comment content appear without prefixes). In my experience, that's what most best practices encourage, anyway. It does mean that single-line comments in HTML and CSS would need to be manually converted to multi-line beforebeing able to be reflowed, but I think it's okay if the implementation doesn't cover that case.

oldaccountdeadname commented 3 years ago

Okay, sounds good! The code on gitlab is feature complete. I'll move that over to github, close this PR, and open one for that.

jmacdonald / amp

Text Reflowing/Justification #220