rkd77 / elinks

performance slow on pages with large number of links when libcss enabled #317

Closed by smemsh 1 month ago

smemsh commented 2 months ago

Not sure if this is a real problem or not, but wanted to submit it here in case.

ELinks takes a very long time to render pages with a large number of links:

 $ time curl -so this.html https://pypi.org/simple/
time: 4s

 $ grep -c href= this.html
562107

 $ time elinks -dump this.html >/dev/null
time: 151s

 $ time elinks -no-references -no-numbering -dump this.html >/dev/null
time: 149s

Interactively, the page hangs the browser: no links are rendered and nothing can be done, including bringing up the menu, until it has finished loading. I tried with both select() and libevent, with the same behavior. I'm a little surprised, because elinks usually performs super fast on everything and I can usually scroll and move around even if the page isn't finished loading.

When using Firefox, the page loads fully in about 20 seconds but I can scroll and get hover icon changes on links quite a bit before that (although it's jerky) and it renders some of the links while the rest loads.

Again not sure if this is an issue, just wanted to bring it to attention in case this is seen as pathological. Half a million links is a lot, but not insane.

rkd77 commented 2 months ago

 $ time ~/bin/elinks --dump this.html > this.txt 2> /dev/null

real 0m18,574s
user 0m11,419s
sys 0m7,110s

Yes, it is slow, but getting a faster machine would be the cheaper fix. Scripting is now called in dumps too, so if scripting is compiled in, you can save a few seconds by disabling it at compile time. To see where the time is "spent", run

 $ valgrind --tool=callgrind elinks --dump this.html > this.txt 2> /dev/null

(this takes a lot of time) and then open the result in kcachegrind.
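
For anyone repeating the profiling step, here is a minimal sketch of the same callgrind run wrapped in a shell function. The function name profile_dump and the output file name callgrind.elinks.out are illustrative choices, not from this thread. It uses callgrind_annotate, the text-mode summarizer that ships with Valgrind, in place of kcachegrind, on the assumption that a terminal-only summary is enough for a first look.

```shell
#!/bin/sh
# Sketch: profile an ELinks dump with callgrind and print a text summary.
# profile_dump and callgrind.elinks.out are hypothetical names.
# Usage: profile_dump this.html [output-file]

profile_dump() {
    page=$1
    out=${2:-callgrind.elinks.out}

    # Bail out early if any required tool is not installed.
    for tool in valgrind callgrind_annotate elinks; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "missing tool: $tool" >&2
            return 127
        }
    done

    # Run the dump under callgrind; expect this to be far slower
    # than an unprofiled run.
    valgrind --tool=callgrind --callgrind-out-file="$out" \
        elinks -dump "$page" >/dev/null 2>&1 || return 1

    # Show the most expensive functions first.
    callgrind_annotate "$out" | head -n 40
}
```

The same output file can still be opened in kcachegrind afterwards if a graphical call tree is preferred.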

smemsh commented 1 month ago

My CPU was a 3GHz i7 with 32G RAM, so I was surprised your run was so much faster. I was also wondering why the page was not rendering partially as it went, but I suppose it only does that kind of thing when IO is still ongoing? Whereas this was a case where IO was all finished, so it was pure CPU on a single thread? Someday I will look into that more.

In any case, the profiler revealed the issue as libcss. With it disabled, the time dropped to 6 seconds, even with native css support enabled. With libcss, it took 156 seconds!

That's quite a difference. The pages look better with libcss, but I'm not sure it's 26x better :-) I think I will disable it! Then again, it's not often that I visit pages with half a million links... but good to know!

rkd77 commented 1 month ago

elinks is single-threaded. For networking it reads data in small chunks, so it can switch to other subtasks, but for rendering the whole document is rendered at once. I have not looked at the code, but I guess the original css code was not called for dumps. I doubt it is faster than libcss.
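
The difference between chunked work and all-at-once work in a single thread can be illustrated with a toy model. This is not ELinks code: the function run_task, the unit counts, and the chunk sizes are all made up for illustration. It only counts how often a task returns control, which is when other subtasks such as input handling could run.

```shell
#!/bin/sh
# Toy model of a single-threaded loop; not ELinks code.
# run_task UNITS CHUNK: do UNITS units of work and return control to
# the caller's "event loop" (modelled as a counter) after every CHUNK units.
run_task() {
    units=$1
    chunk=$2
    done_units=0
    yields=0
    while [ "$done_units" -lt "$units" ]; do
        step=$chunk
        remaining=$((units - done_units))
        [ "$step" -gt "$remaining" ] && step=$remaining
        done_units=$((done_units + step))
        yields=$((yields + 1))   # point where other subtasks could run
    done
    echo "$yields"
}

network_yields=$(run_task 1000 10)    # small chunks: many chances to switch
render_yields=$(run_task 1000 1000)   # whole document at once: one chance
echo "network: $network_yields, render: $render_yields"
```

With these made-up numbers the chunked task yields 100 times and the all-at-once task yields once, which matches the behavior described above: the UI can respond between network reads but not in the middle of a render.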

smemsh commented 1 month ago

It's the same when viewed interactively in the UI. With document.css.enable = 1 and document.css.libcss = 0, I load the file and press q<enter> in the UI while it's loading, and it gets back to the shell within 7 seconds. With libcss enabled, it takes 157 seconds.

I had thought that elinks could do some partial rendering while it's still waiting for IO, because on some pages I see a partial render and can scroll while it's still loading. Example page: https://www.ti8m.com/blog/Why-Podman-is-worth-a-look-.html, though the time window to observe it is small.

rkd77 commented 1 month ago

@smemsh could you retest now with libcss enabled? It is not as fast as the original, but not 157 seconds.

smemsh commented 1 month ago

seems as fast as original to me, within a few percent:

 $ for exe in `which elinks` build/src/elinks; do for libcss in 0 1; do
       time -p $exe -no-connect \
           -eval "set document.css.enable = 1" \
           -eval "set document.css.libcss = $libcss" \
           -dump ~/this.html >/dev/null; \
       echo exe: $exe, libcss: $libcss; echo; \
   done; done

real 6.92
user 6.13
sys 0.78
exe: /usr/local/bin/elinks, libcss: 0

real 155.57
user 154.14
sys 1.38
exe: /usr/local/bin/elinks, libcss: 1

real 6.54
user 5.86
sys 0.68
exe: build/src/elinks, libcss: 0

real 6.98
user 6.22
sys 0.74
exe: build/src/elinks, libcss: 1
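
To put numbers on "within a few percent", the ratios can be worked out from the "real" times above. This is a small sketch: the variable names and the ratio helper are illustrative, and only the four timings come from the output.

```shell
#!/bin/sh
# Wall-clock ("real") seconds copied from the timing output above.
old_native=6.92    # /usr/local/bin/elinks, libcss: 0
old_libcss=155.57  # /usr/local/bin/elinks, libcss: 1
new_native=6.54    # build/src/elinks, libcss: 0
new_libcss=6.98    # build/src/elinks, libcss: 1

# ratio A B: print A/B to two decimal places.
# LC_ALL=C keeps "." as the decimal separator regardless of locale.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

old_slowdown=$(ratio "$old_libcss" "$old_native")
new_slowdown=$(ratio "$new_libcss" "$new_native")
speedup=$(ratio "$old_libcss" "$new_libcss")

echo "old build, libcss vs native css: ${old_slowdown}x"
echo "new build, libcss vs native css: ${new_slowdown}x"
echo "libcss, old build vs new build: ${speedup}x"
```

On these figures libcss was about 22.48x slower than the native css code in the old build and about 1.07x in the new one, so the libcss path itself became roughly 22.29x faster.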

 $ /usr/local/bin/elinks --version |& grep -i git
ELinks 0.18.GIT 90dbe1871d1ec6a6e8591b22caf5096439702b4e

 $ build/src/elinks --version |& grep -i git
ELinks 0.18.GIT 45486090c689ac8971b7e5471c04896cb81d470b

impressive! thanks...