PDF: Performance improvement

phreer commented 1 month ago

Function _render() in PDFView is a time-consuming operation and it's a bit overused .This PR tries to eliminate unnecessary invocatins of the function, which would improve the performance significantly. With this PR, Zotero is much more responsive from my test.

mrtcode commented 1 month ago

Is the main idea to skip rendering pages that aren't visible? There's actually a simpler way to retrieve the currently visible pages directly from pdf.js.

I noticed that when I scroll, annotations on some pages aren't rendered.

Inter-page highlights aren't updated properly re-rendering after modification as well.

Was the rendering slow enough that it prompted you to implement this fix? Please tell more about your system and what PDFs caused issues i.e. OCRed.

phreer commented 1 month ago

Here is my system and document info:

Operating System: Debian Sid
Processor: Intel Core i7 11900K
RAM: 32GB
Graphics: Intel ARC A-750
Test Documents: Well-formatted PDFs such as the Vulkan Specification (https://registry.khronos.org/vulkan/specs/1.3/pdf/vkspec.pdf)

Yes, the primary goal of my pull request is to avoid rendering operations that do not impact the user experience. By reducing unnecessary _render() calls, I aim to enhance the overall performance of the PDF reader in Zotero.

I've experienced that Zotero 7 can sometimes feel laggy, especially when running on a 4K display and zooming into PDFs extensively. In contrast, the native pdf.js implementation in Firefox performs exceptionally well. This discrepancy suggests there is significant room for optimizing Zotero's PDF rendering capabilities. Using Firefox's performance tools, I gathered benchmark data indicating that the render() function consumes a considerable amount of time. Upon reviewing Zotero's codebase, I observed that _render(), being a CPU-intensive function, is invoked frequently—many times unnecessarily. This excessive calling contributes to the laggy experience.

Even with the changes introduced in this PR, performance can still be suboptimal under extreme conditions, such as deep zooming and rapid scrolling. Here is the performance result I obtained:

Screenshot from 2024-10-12 13-27-12

For reasons that are not entirely clear to me, the reader seems to be heavily utilizing the CPU for data copying operations, which affects overall responsiveness. I would greatly appreciate any guidance or suggestions you might have to further optimize this.

Firefox 128 – Linux – 10_12_2024, 5_17_58 AM UTC – Firefox Profiler.zip

phreer commented 1 month ago

After some reflection and reviewing the codebase, I believe it would be beneficial to separate the rendering content into three distinct levels:

Document Layer: Handles static document content that does not change frequently.
Annotation Layer: Manages annotations that change infrequently.
Volatile Layer: Manages dynamic content such as selection highlights and overlay popups.

With this structure:

Document Layer: Rendered only once since its content remains static.
Annotation Layer: Rendered only when there are changes to annotations.
Volatile Layer: Designed to be lightweight and rendered quickly to handle dynamic interactions efficiently.

This separation aims to minimize unnecessary rendering operations, thereby improving overall performance. By rendering each layer based on its specific update frequency, we can reduce the computational load and enhance the responsiveness of the PDF reader in Zotero.

I would like to develop a proof of concept to validate this approach. What are your thoughts on this idea?

mrtcode commented 1 month ago

It's great that you've started investigating Zotero Reader performance, and your suggestion for rendering separation could make sense. However, before implementing anything, we should first compare the performance with pdf.js on Firefox 115 ESR, since that's what Zotero is currently using. And we need to test if there's a performance difference when viewing a PDF file without annotations. Based on those results, we can think about the next steps.

The PDF file you provided has no performance issues for me. How many annotations are you testing it with?

phreer commented 2 weeks ago

Hi @mrtcode, I am back with some performance data as per your sugguestions.

These are the results I got from Firefox Performance tool when I scrolled up and down, zoomed in and zoomed out with the demo.pdf:

Screenshot From 2024-11-08 22-04-49

The first one is from reader without modification whilist the second one is from reader not rendering annotations. I added the following change to reader to avoid rendering annotations:

diff --git a/pdfjs/pdf.js b/pdfjs/pdf.js
--- a/pdfjs/pdf.js
+++ b/pdfjs/pdf.js
@@ -1 +1 @@
-Subproject commit 80dd98211fbcdc94ec3a4e7f045d19d069041b1c
+Subproject commit 80dd98211fbcdc94ec3a4e7f045d19d069041b1c-dirty
diff --git a/src/pdf/pdf-view.js b/src/pdf/pdf-view.js
index acbedf6..58113f3 100644
--- a/src/pdf/pdf-view.js
+++ b/src/pdf/pdf-view.js
@@ -612,6 +612,7 @@ class PDFView {

        _render(pageIndexes) {
+               return;
                for (let page of this._pages) {
                        if (!pageIndexes || pageIndexes.includes(page.pageIndex)) {
                                page.render();

What we can see from the results is that when rendering annotaions, it takes a lot of time in CanvasRenderingContext2D.drawImage, and I can feel the difference of smoothness between these cases.

In addition, I collected some data with perf for these cases either, which could be a proof of the ineffiency in the current implementation.

Rendering annotions:

Samples: 142K of event 'cpu_core/cycles/P', Event count (approx.): 105326414619                                                           
Overhead  Command          Shared Object                           Symbol                                                                 
  10.47%  CanvasRenderer   libc.so.6                               [.] __memmove_avx_unaligned_erms                                       
   8.64%  Isolated Web Co  libxul.so                               [.] 0x0000000003a7709a                                                 
   5.28%  Isolated Web Co  libxul.so                               [.] 0x0000000003a7709e                                                 
   1.98%  swapper          [kernel.kallsyms]                       [k] intel_idle                                                         
   1.79%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770d4                                                 
   1.70%  Isolated Web Co  libc.so.6                               [.] __memmove_avx_unaligned_erms                                       
   1.68%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770b2                                                 
   1.67%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770dd                                                 
   1.63%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770c3                                                 
   1.52%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770a2                                                 
   1.00%  Renderer         libc.so.6                               [.] __memmove_avx_unaligned_erms                                       
   0.84%  Isolated Web Co  libxul.so                               [.] 0x0000000003a770c7                                                 
   0.81%  CanvasRenderer   [kernel.kallsyms]                       [k] sync_regs                                                          
   0.80%  Isolated Web Co  libxul.so                               [.] 0x0000000002071e21                                                 
   0.72%  CanvasRenderer   [kernel.kallsyms]                       [k] native_irq_return_iret                                             
   0.66%  CanvasRenderer   [kernel.kallsyms]                       [k] shmem_add_to_page_cache                                            
   0.65%  CanvasRenderer   [kernel.kallsyms]                       [k] __count_memcg_events                                               
   0.51%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e20837853d                                                 
   0.51%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e2083785cb                                                 
   0.49%  node             libnode.so.127                          [.] void v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarki
   0.48%  Isolated Web Co  [kernel.kallsyms]                       [k] shmem_add_to_page_cache                                            
   0.47%  CanvasRenderer   [kernel.kallsyms]                       [k] clear_page_erms                                                    
   0.47%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e2083784d7                                                 
   0.47%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e208378531                                                 
   0.46%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e208378515                                                 
   0.46%  node             libnode.so.127                          [.] v8::internal::ConcurrentMarking::RunMajor(v8::JobDelegate*, v8::bas
   0.45%  Isolated Web Co  [JIT] tid 19681                         [.] 0x000017e20837849f

Not rendering annotations:

Samples: 138K of event 'cpu_core/cycles/P', Event count (approx.): 32891004912                                                            
Overhead  Command          Shared Object                            Symbol                                                                
   3.20%  swapper          [kernel.kallsyms]                        [k] intel_idle                                                        
   0.94%  gnome-shell      libc.so.6                                [.] __memmove_avx_unaligned_erms                                      
   0.60%  gnome-shell      libc.so.6                                [.] _int_malloc                                                       
   0.43%  Isolated Web Co  libc.so.6                                [.] pthread_mutex_lock@@GLIBC_2.2.5                                   
   0.37%  Isolated Web Co  firefox                                  [.] free                                                              
   0.34%  node             node                                     [.] Builtins_LdaNamedPropertyHandler                                  
   0.32%  TaskCon~ller #0  libxul.so                                [.] 0x00000000034a4683                                                
   0.31%  Isolated Web Co  libc.so.6                                [.] __GI___pthread_mutex_unlock_usercnt                               
   0.30%  swapper          [kernel.kallsyms]                        [k] menu_select                                                       
   0.28%  node             node                                     [.] v8::internal::Scanner::Next()                                     
   0.26%  DOM Worker       libc.so.6                                [.] pthread_mutex_lock@@GLIBC_2.2.5                                   
   0.26%  swapper          [kernel.kallsyms]                        [k] cpuidle_enter_state                                               
   0.25%  gnome-shell      libc.so.6                                [.] _int_free                                                         
   0.25%  firefox          libc.so.6                                [.] pthread_mutex_lock@@GLIBC_2.2.5                                   
   0.24%  WRScene~ilder#1  libc.so.6                                [.] pthread_mutex_lock@@GLIBC_2.2.5                                   
   0.24%  swapper          [kernel.kallsyms]                        [k] update_load_avg                                                   
   0.23%  gnome-shell      libglib-2.0.so.0.8200.2                  [.] g_hash_table_lookup                                               
   0.22%  DOM Worker       libc.so.6                                [.] __GI___pthread_mutex_unlock_usercnt                               
   0.22%  glean.dispatche  libxul.so                                [.] 0x00000000040c8e80                                                
   0.22%  gnome-shell      libc.so.6                                [.] __libc_calloc                                                     
   0.20%  gnome-shell      libglib-2.0.so.0.8200.2                  [.] g_pointer_bit_lock_and_get                                        
   0.20%  swapper          [kernel.kallsyms]                        [k] native_sched_clock                                                
   0.20%  swapper          [kernel.kallsyms]                        [k] sched_balance_update_blocked_averages                             
   0.19%  gnome-shell      [vdso]                                   [.] __vdso_clock_gettime                                              
   0.18%  swapper          [kernel.kallsyms]                        [k] _raw_spin_lock                                                    
   0.18%  gnome-shell      libgobject-2.0.so.0.8200.2               [.] g_type_check_instance_is_a                                        
   0.18%  gnome-shell      libc.so.6                                [.] malloc_consolidate

Overally, I think we can do better in the performance of Zotero reader. As a user heavily relying on Zotero, I'd like to have performance as good as possible and I am glad to participate in the improvement.

mrtcode commented 1 week ago

Thanks for investigating! I actually can’t reproduce any performance difference with the rendering function commented out or not. However, I agree that the function is relatively demanding and, if called too often, may cause issues on machines with limited graphics acceleration. In that case, you would likely notice lag when drawing ink annotations or moving/resizing other annotations as well.

I’ll look into reducing the number of calls to the function.

zotero / reader

PDF: Performance improvement #140