Predelnik / DSpellCheck

Notepad++ Spell-checking Plug-in
GNU General Public License v2.0

Fix crash when opening a large one-line file (500MB) on 32-bit, and improve loading time #329

Open jofon opened 10 months ago

jofon commented 10 months ago

Problem: 32-bit Notepad++ crashes when opening large one-line files. Example issue: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11427

Steps to reproduce problem:
1. Take the file from https://github.com/notepad-plus-plus/notepad-plus-plus/issues/10407#issuecomment-902697944
2. Duplicate its contents on the same line until you have a 500 MB file.
3. Try to open it in 32-bit N++. It should crash.

Problem in code: Calling get_mapped_wstring_range for the entire 500 MB line places all 500 MB in a single buffer. On 32-bit, this crashes. On 64-bit, it's very slow.

Proposed fix: Merged and modified get_visible_text and underline_misspelled_words.
- It no longer gets the entire line at once; instead, it now works in blocks of 4096 characters.
- It takes into account visible lines, horizontal scroll, and the end of the visible text in a line.
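The block-wise idea can be illustrated with a minimal sketch. This is not the code from this PR: `FetchRange`, `Word`, `collect_words_in_blocks`, and the `max_word_len` cap are hypothetical names introduced for illustration, and `fetch` merely stands in for whatever editor call returns a sub-range of text. The point is only that each request is bounded by the block size, and that a word straddling two blocks is carried over instead of being cut in half.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an editor call that returns the text in [from, to).
using FetchRange = std::function<std::wstring(std::size_t from, std::size_t to)>;

struct Word {
    std::size_t offset;  // position of the first character in the document
    std::wstring text;
};

// Scans [begin, end) in blocks of at most block_size characters, so no single
// buffer ever holds more than one block. A word that crosses a block boundary
// is accumulated in `current`; words longer than max_word_len are dropped so
// the carry-over buffer stays bounded too.
std::vector<Word> collect_words_in_blocks(const FetchRange& fetch,
                                          std::size_t begin, std::size_t end,
                                          std::size_t block_size = 4096,
                                          std::size_t max_word_len = 64) {
    assert(block_size > 0 && max_word_len > 0);
    std::vector<Word> words;
    std::wstring current;
    std::size_t current_start = 0;
    bool too_long = false;

    auto flush = [&] {
        if (!current.empty() && !too_long) words.push_back({current_start, current});
        current.clear();
        too_long = false;
    };

    for (std::size_t pos = begin; pos < end; pos += block_size) {
        const std::size_t block_end = std::min(end, pos + block_size);
        const std::wstring block = fetch(pos, block_end);
        for (std::size_t i = 0; i < block.size(); ++i) {
            const wchar_t ch = block[i];
            if (std::iswalpha(ch)) {
                if (current.empty()) current_start = pos + i;
                if (current.size() < max_word_len) current.push_back(ch);
                else too_long = true;  // over-long run of letters: skip it
            } else {
                flush();  // any non-letter ends the current word
            }
        }
    }
    flush();
    return words;
}
```

In a real plug-in the range passed in would presumably be limited further to the visible part of the line (first visible column plus horizontal scroll up to the right edge of the view), which is what keeps the work small even when the line itself is 500 MB.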

Changed is_word_under_cursor_correct to check the previous token and the next token from the current position instead of using the entire line. Also added protection for the case where the document isn't loaded by N++.

Result of fix: No crash, and a loading time of around 30 seconds in Debug. N++ might not render the file, but at least it's not crashing.

Worked on this a month ago, and ended up leaving it aside when I started looking at other functions that load too much into memory and also crash (or throw a "bad allocation" exception):
- erase_all_misspellings
- get_all_misspellings_as_string
- mark_lines_with_misspelling

Don't think I'll have time to work on them, though.

Predelnik commented 10 months ago

Thank you very much for all the work. Unfortunately, I might not be able to look at it in detail right now, but I will try to get to it in reasonable time.

It bothers me slightly that such issues were not reported to the plugin's issue list; unfortunately, I don't look at Notepad++'s own issues that much.

Predelnik commented 9 months ago

Strangely enough, I see slightly different results:

Anyway, I will try to look into how to fix it, but the fact that you successfully opened a 500 MB file with 32-bit N++ seems surprising to me. Maybe the other thing we can do is disable spell-checking on big or poorly structured files by default and enable it only via a special option.
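One possible shape for such an opt-in check is sketched below. Everything in it is hypothetical: the `BigFileSettings` struct, the `should_spell_check` function, and both threshold values are placeholders chosen for illustration, not existing plug-in settings. "Poorly structured" is approximated here as "contains a very long line".

```cpp
#include <cstddef>

// Hypothetical settings; the default limits are arbitrary placeholders.
struct BigFileSettings {
    std::size_t max_document_chars = 50u * 1024 * 1024;  // total size limit
    std::size_t max_line_chars = 1u * 1024 * 1024;       // longest-line limit
    bool check_big_files = false;                        // the "special option"
};

// Returns true when spell-checking should run for a document of the given
// total size and longest line length.
bool should_spell_check(std::size_t document_chars, std::size_t longest_line_chars,
                        const BigFileSettings& settings = BigFileSettings{}) {
    if (settings.check_big_files) return true;  // user explicitly opted in
    return document_chars <= settings.max_document_chars &&
           longest_line_chars <= settings.max_line_chars;
}
```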