eraserhd / parinfer-rust

A Rust port of parinfer.
ISC License
546 stars 42 forks source link

parinfer-rust library crashes with memory allocation failure #75

Open justinbarclay opened 4 years ago

justinbarclay commented 4 years ago

While debugging some Changes parinfer-rust-mode was generating I was able to get parinfer-rust, and consequently Emacs, to crash.

After doing some digging around in Emacs, I was able to narrow down the cause to the changes I was passing into parinfer-rust.

I was curious if this was limited to functions used in the interface to Emacs or potentially in the library itself, so I exported a sample Request to json and ran it against the CLI. The request I used can be found below: bad_request.json

cat bad_request.json | RUST_BACKTRACE=1 cargo run -- --input-format json
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/parinfer-rust --input-format json`
memory allocation of 18446744073709551609 bytes failed

Once I confirmed that it was probably an issue in the library itself I started digging into what could cause the error. After doing some trusty print statement debugging, I was able to narrow it down to the add_indent, particularly at line 1433, where casting from Delta to Column occurs (i64 -> usize). In the request I had generated it had values of orig_indent of 13 and a delta of -15. So when you add these two numbers together it comes out to -2 and when you type cast that to an usize it becomes a really large number(18446744073709551609). Then when we pass this number, as new_indent, into repeat_string, it tries to allocate enough space to repeat the string 18446744073709551609 times. This causes my computer to get really mad at me.

I realize that this shouldn't happen if I correctly detected and reported my changes in smart-mode, but I thought I would warn you of the possibility of a crash anyways. :)

eraserhd commented 4 years ago

Thanks for investigating this! (And sorry about the delay, I've been pretty busy lately.)

My initial thought is that something is wrong with the "changes" list, as this is part of how the delta is computed. I think your solution might fix the symptom, but at that point in the program it is really saying "from here on, the source code should be indented -2 spaces", which is clearly not correct and should not happen.

I'm really curious what kind of operation you were performing on the text that caused so many simultaneous, disparate changes. Some of the changes seem to reference areas well past the end-of-line. The first one, for example, deletes a lone parenthesis from column 76 of line 20, which is now 46 characters long, meaning the scan will never hit this column, and the delta will never get adjusted (although in this case, it would adjust in the wrong direction to cause this problem).

eraserhd commented 4 years ago

One point IIRC is that parinfer wants the changes array to describe "simultaneous" changes rather than "sequential" changes, meaning that all changes use pre-change cursor coordinates.

justinbarclay commented 4 years ago

Thanks for getting back to me! (Please don't apologize about any delay. I appreciate that we're all busy individuals) 😊

Yeah, I definitely agree, something is wrong with the way I generate the change list. Particularly, I think it's how Emacs thinks of changes and how parinfer expects changes. Part of it is how Emacs scripts perform changes in a buffer, which is programmatically and one line at a time. This is opposed to what I expect to see in programming languages in generally: perform operations on the string holistically and then replace it. This combined with Emacs default to have as fine grained change reporting as possible, means that some simple changes to a file can cause a lot of changes to appear in the change list. For example, a change that effects 1 large contiguous region of a buffer or file would be reported as many changes to that file (from my experience roughly 1 per line).

My intention with this bug report/pull request was two fold;

  1. Reduce the surface for the library to crash, while staying within spec.
    • This fix was inspired by parinfer-js and it's original implementation for repeat string
  2. Come up with a better structure for the changes that are passed into parinfer-rust.

    • For this I was thinking to "fix" how Emacs track changes. My thoughts are to implement a system that will cluster changes that are within one line of one another into a single change. For example, say there are changes that happen on lines 2, 3, 4, and 6. This system would then become a change on line 2 spanning until some point in line 4 and the change on line 6 would remain independent of the rest. Thus reducing the total amount of changes reported from 4 to 2.

    As for the specific operation that caused this, I had my cursor at the end of a line and was "barfing" a paren, using paredit, to that point