randy3k / RemoteSubl

Use rmate with Sublime Text, an improved fork of rsub.

How to improve the speed of opening big files #8

mihan007 opened 5 years ago

mihan007 commented 5 years ago

I've installed the Sublime and server parts. It works like a charm for small files (tried it on an nginx config), but when I try rsubl big_log_file.log on a big file, nothing happens. Tried it on a log of about 12 MB.

Is there a limit on file size?

mihan007 commented 5 years ago

Hm, it did open the big log file, but it took more than 5 minutes (I had already posted the issue and forgotten about it by the time Sublime showed the file). With scp it took 2 seconds to download the same log file. So let me clarify the issue: how can the speed of opening big files be improved?

randy3k commented 5 years ago

I guess the reason is that we are parsing the stream line by line. It may be more efficient to parse multiple lines at a time, especially for the file data part.
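For illustration only, here is a minimal sketch of that idea, assuming a socket-like object and a payload length already known from the protocol header; the names sock, handle_chunk, and the chunk size are assumptions, not the plugin's actual code:

CHUNK_SIZE = 64 * 1024  # read the payload in 64 KB pieces instead of line by line

def read_payload(sock, length, handle_chunk):
    # Receive exactly `length` bytes, passing each chunk to a handler.
    remaining = length
    while remaining > 0:
        chunk = sock.recv(min(CHUNK_SIZE, remaining))
        if not chunk:
            break  # connection closed early
        handle_chunk(chunk)
        remaining -= len(chunk)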

MatthiasPortzel commented 1 month ago

Python string/bytes concatenation requires copying the entire string each time, which is O(n) in the length of the string. Since this plugin goes line by line and concatenates each line onto a buffer, the number of concatenations is also O(n), so receiving a file ends up being O(n²) overall. This is very slow when opening files with 100,000 lines (binary disassembly, for example).

Compare these examples:

Building up the data by repeated concatenation takes 308 seconds:

FILLER_TEXT = "some line of text\n"  # example value; the original filler is not shown
data = b""
for i in range(100000):
    data += FILLER_TEXT.encode("utf-8")  # copies the whole buffer on every iteration

Building up a list and joining once at the end takes 0.1 seconds:

data = []
for i in range(100000):
    data.append(FILLER_TEXT.encode("utf-8"))  # O(1) amortized append
data = b"".join(data)  # one O(n) copy at the end

This plugin currently uses the first method. I have a basic patch on my fork which uses the second method. I'm not interested in opening a PR right now; I just got very distracted trying to open a file :) But my code can be considered public domain if someone else wants to incorporate this change.

https://github.com/MatthiasPortzel/RemoteSubl/commit/98786ca6b1830afd1000ba5552b4bcea0874ce22

You can't get faster than O(n), of course, but if you wanted to do it right, you would write each chunk into the file as it came in; this would keep memory use O(1) instead of O(n).
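To illustrate that last point (a sketch only, with assumed names; this is not how the plugin is currently written), each received chunk could be appended straight to the temporary file that Sublime Text opens, so memory use stays constant regardless of file size:

import tempfile

def save_payload(read_chunks):
    # `read_chunks` stands in for whatever yields data chunks off the socket.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for chunk in read_chunks():
            f.write(chunk)
    return f.name  # path of the file to open in Sublime Text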

randy3k commented 3 weeks ago

Thanks @MatthiasPortzel, I have merged your changes into master.

MatthiasPortzel commented 3 weeks ago

Thanks for your work maintaining this package @randy3k!