Open phreer opened 1 month ago
Is the main idea to skip rendering pages that aren't visible? There's actually a simpler way to retrieve the currently visible pages directly from pdf.js.
I noticed that when I scroll, annotations on some pages aren't rendered.
Inter-page highlights aren't updated properly re-rendering after modification as well.
Was the rendering slow enough that it prompted you to implement this fix? Please tell more about your system and what PDFs caused issues i.e. OCRed.
Here is my system and document info:
Yes, the primary goal of my pull request is to avoid rendering operations that do not impact the user experience. By reducing unnecessary _render() calls, I aim to enhance the overall performance of the PDF reader in Zotero.
I've experienced that Zotero 7 can sometimes feel laggy, especially when running on a 4K display and zooming into PDFs extensively. In contrast, the native pdf.js implementation in Firefox performs exceptionally well. This discrepancy suggests there is significant room for optimizing Zotero's PDF rendering capabilities. Using Firefox's performance tools, I gathered benchmark data indicating that the render() function consumes a considerable amount of time. Upon reviewing Zotero's codebase, I observed that _render(), being a CPU-intensive function, is invoked frequently—many times unnecessarily. This excessive calling contributes to the laggy experience.
Even with the changes introduced in this PR, performance can still be suboptimal under extreme conditions, such as deep zooming and rapid scrolling. Here is the performance result I obtained:
For reasons that are not entirely clear to me, the reader seems to be heavily utilizing the CPU for data copying operations, which affects overall responsiveness. I would greatly appreciate any guidance or suggestions you might have to further optimize this.
Firefox 128 – Linux – 10_12_2024, 5_17_58 AM UTC – Firefox Profiler.zip
After some reflection and reviewing the codebase, I believe it would be beneficial to separate the rendering content into three distinct levels:
With this structure:
This separation aims to minimize unnecessary rendering operations, thereby improving overall performance. By rendering each layer based on its specific update frequency, we can reduce the computational load and enhance the responsiveness of the PDF reader in Zotero.
I would like to develop a proof of concept to validate this approach. What are your thoughts on this idea?
It's great that you've started investigating Zotero Reader performance, and your suggestion for rendering separation could make sense. However, before implementing anything, we should first compare the performance with pdf.js on Firefox 115 ESR, since that's what Zotero is currently using. And we need to test if there's a performance difference when viewing a PDF file without annotations. Based on those results, we can think about the next steps.
The PDF file you provided has no performance issues for me. How many annotations are you testing it with?
Hi @mrtcode, I am back with some performance data as per your sugguestions.
These are the results I got from Firefox Performance tool when I scrolled up and down, zoomed in and zoomed out with the demo.pdf:
The first one is from reader without modification whilist the second one is from reader not rendering annotations. I added the following change to reader to avoid rendering annotations:
diff --git a/pdfjs/pdf.js b/pdfjs/pdf.js
--- a/pdfjs/pdf.js
+++ b/pdfjs/pdf.js
@@ -1 +1 @@
-Subproject commit 80dd98211fbcdc94ec3a4e7f045d19d069041b1c
+Subproject commit 80dd98211fbcdc94ec3a4e7f045d19d069041b1c-dirty
diff --git a/src/pdf/pdf-view.js b/src/pdf/pdf-view.js
index acbedf6..58113f3 100644
--- a/src/pdf/pdf-view.js
+++ b/src/pdf/pdf-view.js
@@ -612,6 +612,7 @@ class PDFView {
_render(pageIndexes) {
+ return;
for (let page of this._pages) {
if (!pageIndexes || pageIndexes.includes(page.pageIndex)) {
page.render();
What we can see from the results is that when rendering annotaions, it takes a lot of time in CanvasRenderingContext2D.drawImage
, and I can feel the difference of smoothness between these cases.
In addition, I collected some data with perf
for these cases either, which could be a proof of the ineffiency in the current implementation.
Rendering annotions:
Samples: 142K of event 'cpu_core/cycles/P', Event count (approx.): 105326414619
Overhead Command Shared Object Symbol
10.47% CanvasRenderer libc.so.6 [.] __memmove_avx_unaligned_erms
8.64% Isolated Web Co libxul.so [.] 0x0000000003a7709a
5.28% Isolated Web Co libxul.so [.] 0x0000000003a7709e
1.98% swapper [kernel.kallsyms] [k] intel_idle
1.79% Isolated Web Co libxul.so [.] 0x0000000003a770d4
1.70% Isolated Web Co libc.so.6 [.] __memmove_avx_unaligned_erms
1.68% Isolated Web Co libxul.so [.] 0x0000000003a770b2
1.67% Isolated Web Co libxul.so [.] 0x0000000003a770dd
1.63% Isolated Web Co libxul.so [.] 0x0000000003a770c3
1.52% Isolated Web Co libxul.so [.] 0x0000000003a770a2
1.00% Renderer libc.so.6 [.] __memmove_avx_unaligned_erms
0.84% Isolated Web Co libxul.so [.] 0x0000000003a770c7
0.81% CanvasRenderer [kernel.kallsyms] [k] sync_regs
0.80% Isolated Web Co libxul.so [.] 0x0000000002071e21
0.72% CanvasRenderer [kernel.kallsyms] [k] native_irq_return_iret
0.66% CanvasRenderer [kernel.kallsyms] [k] shmem_add_to_page_cache
0.65% CanvasRenderer [kernel.kallsyms] [k] __count_memcg_events
0.51% Isolated Web Co [JIT] tid 19681 [.] 0x000017e20837853d
0.51% Isolated Web Co [JIT] tid 19681 [.] 0x000017e2083785cb
0.49% node libnode.so.127 [.] void v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarki
0.48% Isolated Web Co [kernel.kallsyms] [k] shmem_add_to_page_cache
0.47% CanvasRenderer [kernel.kallsyms] [k] clear_page_erms
0.47% Isolated Web Co [JIT] tid 19681 [.] 0x000017e2083784d7
0.47% Isolated Web Co [JIT] tid 19681 [.] 0x000017e208378531
0.46% Isolated Web Co [JIT] tid 19681 [.] 0x000017e208378515
0.46% node libnode.so.127 [.] v8::internal::ConcurrentMarking::RunMajor(v8::JobDelegate*, v8::bas
0.45% Isolated Web Co [JIT] tid 19681 [.] 0x000017e20837849f
Not rendering annotations:
Samples: 138K of event 'cpu_core/cycles/P', Event count (approx.): 32891004912
Overhead Command Shared Object Symbol
3.20% swapper [kernel.kallsyms] [k] intel_idle
0.94% gnome-shell libc.so.6 [.] __memmove_avx_unaligned_erms
0.60% gnome-shell libc.so.6 [.] _int_malloc
0.43% Isolated Web Co libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
0.37% Isolated Web Co firefox [.] free
0.34% node node [.] Builtins_LdaNamedPropertyHandler
0.32% TaskCon~ller #0 libxul.so [.] 0x00000000034a4683
0.31% Isolated Web Co libc.so.6 [.] __GI___pthread_mutex_unlock_usercnt
0.30% swapper [kernel.kallsyms] [k] menu_select
0.28% node node [.] v8::internal::Scanner::Next()
0.26% DOM Worker libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
0.26% swapper [kernel.kallsyms] [k] cpuidle_enter_state
0.25% gnome-shell libc.so.6 [.] _int_free
0.25% firefox libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
0.24% WRScene~ilder#1 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
0.24% swapper [kernel.kallsyms] [k] update_load_avg
0.23% gnome-shell libglib-2.0.so.0.8200.2 [.] g_hash_table_lookup
0.22% DOM Worker libc.so.6 [.] __GI___pthread_mutex_unlock_usercnt
0.22% glean.dispatche libxul.so [.] 0x00000000040c8e80
0.22% gnome-shell libc.so.6 [.] __libc_calloc
0.20% gnome-shell libglib-2.0.so.0.8200.2 [.] g_pointer_bit_lock_and_get
0.20% swapper [kernel.kallsyms] [k] native_sched_clock
0.20% swapper [kernel.kallsyms] [k] sched_balance_update_blocked_averages
0.19% gnome-shell [vdso] [.] __vdso_clock_gettime
0.18% swapper [kernel.kallsyms] [k] _raw_spin_lock
0.18% gnome-shell libgobject-2.0.so.0.8200.2 [.] g_type_check_instance_is_a
0.18% gnome-shell libc.so.6 [.] malloc_consolidate
Overally, I think we can do better in the performance of Zotero reader. As a user heavily relying on Zotero, I'd like to have performance as good as possible and I am glad to participate in the improvement.
Thanks for investigating! I actually can’t reproduce any performance difference with the rendering function commented out or not. However, I agree that the function is relatively demanding and, if called too often, may cause issues on machines with limited graphics acceleration. In that case, you would likely notice lag when drawing ink annotations or moving/resizing other annotations as well.
I’ll look into reducing the number of calls to the function.
Function
_render()
in PDFView is a time-consuming operation and it's a bit overused .This PR tries to eliminate unnecessary invocatins of the function, which would improve the performance significantly. With this PR, Zotero is much more responsive from my test.