orbitalquark / textadept

Textadept is a fast, minimalist, and remarkably extensible cross-platform text editor for programmers.
https://orbitalquark.github.io/textadept
MIT License

Textadept freezes on large csv file #555

Closed AmerM137 closed 1 month ago

AmerM137 commented 2 months ago

I work a lot with csv outputs from databases and most of the time I like to open them in a text editor because Excel likes to overwrite large integers with scientific notation and other issues. I like how snappy Textadept feels overall except for one peculiar edge case.

I noticed that Textadept comes to a halt when loading such csv files. For comparison, none of sublime/litexl/vscode do that so I am thinking it's trying to do something that those text editors avoid. The file loads just fine, but trying to navigate the file once it's loaded makes Textadept freeze and CPU usage shoots up.

For reference, the typical size of the csv files is about 100 MB. An example file would be about 100k-200k rows x 75-100 columns. These are definitely not the biggest files I've worked with; I've loaded 300 MB+ files in other editors with no issues.

To replicate: Load a large csv; the data should appear in the editor. Try to navigate to the middle/end of the file using Home/End or Page Up/Page Down. CPU usage shoots up, and Textadept freezes and becomes unresponsive.

Is there any config parameter that I can use to tell Textadept to not do whatever it's trying to do on the large csv file?

Edit -> OS Win11 v10.0.22631 Build 22631

Thank you!

orbitalquark commented 2 months ago

Do you mind trying to load the file in another Scintilla-based editor like SciTE (https://www.scintilla.org/SciTE.html) or Geany? Scintilla has been known to be a bit laggy with large files (particularly those with long lines). If SciTE or Geany exhibit the same behavior, then I don't think there's anything that can be done. If they are still responsive, then Textadept is to blame and we'll have to investigate further.

AmerM137 commented 2 months ago

Here are the results:

SciTE - Version 5.5.1 (Scintilla 5.5.1, Lexilla 5.3.3): No issues whatsoever, it's so fast actually. It's probably the fastest to load the csv file, and I can spam navigation from start to end of file with no issues.

Geany - Version 2.0 "Pryce": No issues. It does feel a little choppy and slower than SciTE and other editors, but it doesn't freeze at all. Usable for sure.

orbitalquark commented 2 months ago

Thanks for taking the time to investigate. I'm going to venture a guess the lexer is the cause of the problem. Let's try this. Please make a backup of lexers/text.lua and then replace its contents with this:

local lexer = require('lexer')
local lex = lexer.new('text')
lex:add_rule('text', lexer.token(lexer.DEFAULT, lexer.any^1))
return lex

Now try and load your giant CSV file and see if Textadept still stutters while moving through the file.

AmerM137 commented 2 months ago

Same issue after replacing the contents of lexers/text.lua

orbitalquark commented 2 months ago

Hmm, okay. In addition to the lexers/text.lua change, try also adding the following to your ~/.textadept/init.lua:

view.idle_styling = view.IDLESTYLING_NONE

ivoshm commented 2 months ago

> Now try and load your giant CSV file and see if Textadept still stutters while moving through the file.

I have a different experience than @AmerM137, I have tested the above patch and moving within the editor with my "giant" text file (30000 lines, 3.5M characters) is suddenly smooth.

Textadept 12.4 GTK3 flavor on Manjaro Linux (XFCE)

AmerM137 commented 2 months ago

That did not fix the issue, unfortunately. Still hanging up on the csv file. I made the change to text.lua and added the mentioned line to my init.lua.

orbitalquark commented 2 months ago

Thanks for the feedback.

Hmm, I'm going to have to try and reproduce this locally then when I have some time. I don't know what else to try :(

That said, I will commit the fix to the text.lua lexer because that's how it should have been to begin with. I'm not sure what I was thinking when I changed it to be what it is now.

orbitalquark commented 2 months ago

I haven't had a chance to look into this yet, but I did commit a fix to the text lexer: https://github.com/orbitalquark/scintillua/commit/c929e9d6072b5963bb595870c33d82ef8a2eb188. It will be in the next nightly build.

It turns out matching whitespace separately is a feature to allow highlighting whitespace separately from text, so I left it in there. There should still be a large performance improvement.
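
For illustration, a text lexer that matches whitespace separately from other text might look roughly like the sketch below. This is an assumption built from the `lexer.token` style used earlier in this thread, not the contents of the linked commit; the rule names and the exact patterns are illustrative.

```lua
-- Sketch only: NOT the committed lexer. Assumes the same 'lexer' module API
-- used in the lexers/text.lua snippet earlier in this thread.
local lexer = require('lexer')

local lex = lexer.new('text')

-- Runs of whitespace become their own tokens so they can be styled separately.
lex:add_rule('whitespace', lexer.token(lexer.WHITESPACE, lexer.space^1))

-- Everything up to the next whitespace character becomes a default-styled token.
lex:add_rule('text', lexer.token(lexer.DEFAULT, (lexer.any - lexer.space)^1))

return lex
```

Compared with the single `lexer.any^1` rule, this produces more tokens per line, which is the trade-off for being able to highlight whitespace on its own.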

AmerM137 commented 2 months ago

Thanks for the update. I know it may not be fixed yet but I'll give the nightly build a shot.

orbitalquark commented 1 month ago

Okay, I had a little bit of time to look into this. I generated a CSV file with 200k rows consisting of 100 identical fields containing "csv-field" (no quotes). The CSV file was about 195MB in size.
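
A comparable test file can be produced with a short standalone Lua script. This is a minimal sketch, not the script actually used for the test above; the helper names `make_row` and `write_csv` and the output path are hypothetical. With 100 fields of "csv-field" joined by commas plus a newline, each row is 1,000 bytes, so 200,000 rows come to roughly 200 MB, in the same range as the size quoted above.

```lua
-- Hypothetical helpers for generating a large, uniform CSV test file.
-- Plain Lua; no Textadept APIs are involved.

-- Build one CSV row made of `columns` copies of `field`, without a newline.
local function make_row(field, columns)
  local fields = {}
  for i = 1, columns do fields[i] = field end
  return table.concat(fields, ',')
end

-- Write `rows` identical rows to `path` and return the number of bytes written.
local function write_csv(path, rows, columns, field)
  local row = make_row(field, columns) .. '\n'
  -- Binary mode keeps '\n' as a single byte on every platform.
  local f = assert(io.open(path, 'wb'))
  for _ = 1, rows do f:write(row) end
  f:close()
  return #row * rows
end

-- Example (writes about 200 MB; the file name is an assumption):
-- write_csv('big.csv', 200000, 100, 'csv-field')
```

Uncommenting the last line and running the script with a stock Lua interpreter writes the file to the current directory.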

Textadept opened it in about 9 seconds. I immediately used the Search > Go to Line... dialog to jump to 100,000. I started rapidly typing. Textadept hiccuped for about 3-4 seconds, inserting characters slowly. After those brief seconds, it went back to normal speed. I was able to type rapidly anywhere I wanted to: at the ends of lines, at the end of the buffer, at the beginning, and in the middle of lines. I could not get Textadept to exhibit any additional slowness. Navigation was smooth.

I closed Textadept and tried again. Same startup time, and same behavior, except this time I got some lag typing at the end of the file. After 10-15 seconds it went away and I was able to rapidly type anywhere I wanted in the buffer without any lag. Navigation was still very smooth.

This was with the default settings (and updated text lexer). I also tried turning off idle styling, but that didn't seem to affect things.

Then I went back to the old version of the lexer. It was slower than molasses. I could not get to line 100,000 without noticeable lag. Typing was impossible. I quit after a few seconds. It was a night and day difference.

I'm not sure what to make of this (other than the lexer update fixes the major issue). I think if the lexer was still to blame, you'd see the slowness during typing, as that triggers re-lexing. However, I didn't see this after things initially cleared up.

That said, Textadept uses Lua lexers for syntax highlighting, and not Lexilla's C++ lexers (SciTE/Geany), so there is some overhead with string copying, style setting, and what-not over the C++ <-> Lua bridge. I guess this might be one of the trade-offs for flexibility.

Long story short, things are much better with the updated lexer, but it might not fix everything. If it still doesn't work for you, then sorry, I'm not sure what else to do :(

AmerM137 commented 1 month ago

thanks @orbitalquark, I'll do some testing soon and let you know what I see on my end.

AmerM137 commented 1 month ago

I just grabbed the nightly release and tried loading the same ~110 MB file that made me notice the lag in the first place. Things are MUCH better! It took a few seconds to go to the end of the file, but after that I was able to type with barely any lag.

While navigating quickly up and down the file with pgup/pgdn, it's not perfectly smooth but it's much improved over v12.4 where the whole editor would freeze for a good 5-10 secs.

Thanks for looking into this @orbitalquark !

orbitalquark commented 1 month ago

Great, thanks for the update! I'll close this now.