element-hq / element-desktop

A glossy Matrix collaboration client for desktop.
https://element.io
GNU Affero General Public License v3.0
1.14k stars 261 forks source link

Seshat/Tantivy reliably pegs CPU at 100-200% for 30s at a time #1444

Open ara4n opened 8 months ago

ara4n commented 8 months ago

Steps to reproduce

  1. Have a largeish Seshat db. mine is currently:

Screenshot 2024-01-18 at 11 14 24

(i've deliberately increased the time between spidering here to try to diagnose this bug fwiw)

  1. Notice that Element Desktop often takes up at least 1 or 2 full cores of CPU for 30s+ at a time.
  2. Hit it with the Time Profiler from Instruments
  3. Discover that the CPU usage across a typical minute looks like this:

Screenshot 2024-01-18 at 12 06 39

  1. Select the 100% CPU region and see the stacktrace looks like this:
3.40 s  100.0%  0 s     Element Nightly (2951)
2.19 s   64.1%  0 s      std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb  0x1b1e8
2.19 s   64.1%  0 s       thread_start
2.19 s   64.1%  0 s        _pthread_start
2.19 s   64.1%  0 s         std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb
2.19 s   64.1%  0 s          core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h741ea4d19a1102e9
2.19 s   64.1%  0 s           std::sys_common::backtrace::__rust_begin_short_backtrace::hecaa5c76d653b8cf
2.19 s   64.1%  0 s            tantivy::reader::IndexReaderBuilder::try_into::_$u7b$$u7b$closure$u7d$$u7d$::h497c12f293fdd8b8
2.19 s   64.1%  0 s             tantivy::reader::InnerIndexReader::reload::hf6658e53b17c3056
2.19 s   64.1%  0 s              core::iter::adapters::try_process::hdbdf36f41c503559
2.19 s   64.1%  0 s               _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$alloc..vec..spec_from_iter..SpecFromIter$LT$T$C$I$GT$$GT$::from_iter::h30d94bc0f53d2dad
2.19 s   64.1%  0 s                tantivy::core::segment_reader::SegmentReader::open::h0158d958192b51d2
2.19 s   64.1%  0 s                 tantivy::core::segment::Segment::open_read::hb4f0da0f2f029037
2.19 s   64.1%  0 s                  _$LT$tantivy..directory..managed_directory..ManagedDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h5db96201b9f49fdb
2.19 s   64.1%  0 s                   _$LT$seshat..index..encrypted_dir..EncryptedMmapDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h42ec2dbb447c8193
1.74 s   51.1%  0 s                    std::io::default_read_to_end::hab9667ee5d776f02
1.68 s   49.4%  15.00 ms                        _$LT$ctr..Ctr$LT$B$C$F$GT$$u20$as$u20$cipher..stream..StreamCipher$GT$::try_apply_keystream::h91942f4fd8495eda
1.67 s   49.0%  395.00 ms                        aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
654.00 ms   19.2%   654.00 ms                         aes::soft::fixslice::sub_bytes::h3ebf5d1c400c28f6
273.00 ms    8.0%   273.00 ms                         aes::soft::fixslice::mix_columns_1::hce570626dbdc1d73
169.00 ms    4.9%   169.00 ms                         aes::soft::fixslice::mix_columns_3::he81ff5b2305632c2
142.00 ms    4.1%   142.00 ms                         aes::soft::fixslice::inv_bitslice::h20d5de3d52723233
36.00 ms    1.0%    36.00 ms                          aes::soft::fixslice::bitslice::h91afa83a5e227b6f
27.00 ms    0.7%    0 s                     alloc::raw_vec::RawVec$LT$T$C$A$GT$::reserve::do_reserve_and_handle::he12456f79aea9ef9
17.00 ms    0.4%    17.00 ms                        __bzero
11.00 ms    0.3%    11.00 ms                        _platform_memmove
3.00 ms    0.0% 3.00 ms                     aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
443.00 ms   13.0%   0 s                    seshat::index::encrypted_stream::AesReader$LT$D$C$R$GT$::new::h598e903d2be65da2
1.00 ms    0.0% 0 s                    alloc::sync::Arc$LT$T$C$A$GT$::drop_slow::h6508adcc92bef317
1.21 s   35.6%  0 s      std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb  0x1b14a
1.21 s   35.6%  0 s       thread_start
1.21 s   35.6%  0 s        _pthread_start
1.21 s   35.6%  0 s         std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb
1.21 s   35.6%  0 s          core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h741ea4d19a1102e9
1.21 s   35.6%  0 s           std::sys_common::backtrace::__rust_begin_short_backtrace::hecaa5c76d653b8cf
1.21 s   35.6%  0 s            tantivy::reader::IndexReaderBuilder::try_into::_$u7b$$u7b$closure$u7d$$u7d$::h497c12f293fdd8b8
1.21 s   35.6%  0 s             tantivy::reader::InnerIndexReader::reload::hf6658e53b17c3056
1.21 s   35.5%  0 s              core::iter::adapters::try_process::hdbdf36f41c503559
1.21 s   35.5%  0 s               _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$alloc..vec..spec_from_iter..SpecFromIter$LT$T$C$I$GT$$GT$::from_iter::h30d94bc0f53d2dad
1.21 s   35.5%  0 s                tantivy::core::segment_reader::SegmentReader::open::h0158d958192b51d2
1.21 s   35.5%  0 s                 tantivy::core::segment::Segment::open_read::hb4f0da0f2f029037
1.21 s   35.5%  0 s                  _$LT$tantivy..directory..managed_directory..ManagedDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h5db96201b9f49fdb
1.21 s   35.5%  0 s                   _$LT$seshat..index..encrypted_dir..EncryptedMmapDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h42ec2dbb447c8193
951.00 ms   27.9%   0 s                    std::io::default_read_to_end::hab9667ee5d776f02
922.00 ms   27.0%   3.00 ms                     _$LT$ctr..Ctr$LT$B$C$F$GT$$u20$as$u20$cipher..stream..StreamCipher$GT$::try_apply_keystream::h91942f4fd8495eda
919.00 ms   26.9%   196.00 ms                        aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
390.00 ms   11.4%   390.00 ms                         aes::soft::fixslice::sub_bytes::h3ebf5d1c400c28f6
143.00 ms    4.1%   143.00 ms                         aes::soft::fixslice::mix_columns_1::hce570626dbdc1d73
93.00 ms    2.7%    93.00 ms                          aes::soft::fixslice::mix_columns_3::he81ff5b2305632c2
72.00 ms    2.1%    72.00 ms                          aes::soft::fixslice::inv_bitslice::h20d5de3d52723233
25.00 ms    0.7%    25.00 ms                          aes::soft::fixslice::bitslice::h91afa83a5e227b6f
15.00 ms    0.4%    0 s                     alloc::raw_vec::RawVec$LT$T$C$A$GT$::reserve::do_reserve_and_handle::he12456f79aea9ef9
7.00 ms    0.2% 7.00 ms                     __bzero
5.00 ms    0.1% 5.00 ms                     _platform_memmove
2.00 ms    0.0% 2.00 ms                     aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
255.00 ms    7.4%   0 s                    seshat::index::encrypted_stream::AesReader$LT$D$C$R$GT$::new::h598e903d2be65da2
3.00 ms    0.0% 0 s                    alloc::sync::Arc$LT$T$C$A$GT$::drop_slow::h6508adcc92bef317
5.00 ms    0.1% 0 s              tantivy::reader::pool::Pool$LT$T$GT$::publish_new_generation::h19afee9627aac713
3.00 ms    0.0% 0 s      Main Thread  0x68a4
2.00 ms    0.0% 0 s      std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb  0x6a46

(yay for having good Rust symbols!) - so, TL;DR:

Then, looking at the 200% CPU regime of the trace:

4.99 s  100.0%  0 s     Element Nightly (2951)
2.50 s   50.0%  0 s      std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb  0x1b14b
2.50 s   50.0%  0 s       thread_start
2.50 s   50.0%  0 s        _pthread_start
2.50 s   50.0%  0 s         std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb
2.50 s   50.0%  0 s          core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h741ea4d19a1102e9
2.50 s   50.0%  0 s           std::sys_common::backtrace::__rust_begin_short_backtrace::hecaa5c76d653b8cf
2.50 s   50.0%  0 s            tantivy::reader::IndexReaderBuilder::try_into::_$u7b$$u7b$closure$u7d$$u7d$::h497c12f293fdd8b8
2.50 s   50.0%  0 s             tantivy::reader::InnerIndexReader::reload::hf6658e53b17c3056
2.50 s   50.0%  0 s              core::iter::adapters::try_process::hdbdf36f41c503559
2.50 s   50.0%  0 s               _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$alloc..vec..spec_from_iter..SpecFromIter$LT$T$C$I$GT$$GT$::from_iter::h30d94bc0f53d2dad
2.50 s   50.0%  0 s                tantivy::core::segment_reader::SegmentReader::open::h0158d958192b51d2
2.50 s   50.0%  0 s                 tantivy::core::segment::Segment::open_read::hb4f0da0f2f029037
2.50 s   50.0%  0 s                  _$LT$tantivy..directory..managed_directory..ManagedDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h5db96201b9f49fdb
2.50 s   50.0%  0 s                   _$LT$seshat..index..encrypted_dir..EncryptedMmapDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h42ec2dbb447c8193
1.92 s   38.5%  0 s                    std::io::default_read_to_end::hab9667ee5d776f02
1.85 s   37.1%  18.00 ms                        _$LT$ctr..Ctr$LT$B$C$F$GT$$u20$as$u20$cipher..stream..StreamCipher$GT$::try_apply_keystream::h91942f4fd8495eda
1.83 s   36.7%  387.00 ms                        aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
774.00 ms   15.5%   774.00 ms                         aes::soft::fixslice::sub_bytes::h3ebf5d1c400c28f6
305.00 ms    6.1%   305.00 ms                         aes::soft::fixslice::mix_columns_1::hce570626dbdc1d73
205.00 ms    4.1%   205.00 ms                         aes::soft::fixslice::mix_columns_3::he81ff5b2305632c2
122.00 ms    2.4%   122.00 ms                         aes::soft::fixslice::inv_bitslice::h20d5de3d52723233
40.00 ms    0.8%    40.00 ms                          aes::soft::fixslice::bitslice::h91afa83a5e227b6f
35.00 ms    0.7%    0 s                     alloc::raw_vec::RawVec$LT$T$C$A$GT$::reserve::do_reserve_and_handle::he12456f79aea9ef9
22.00 ms    0.4%    22.00 ms                        __bzero
11.00 ms    0.2%    11.00 ms                        _platform_memmove
2.00 ms    0.0% 2.00 ms                     aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
574.00 ms   11.5%   0 s                    seshat::index::encrypted_stream::AesReader$LT$D$C$R$GT$::new::h598e903d2be65da2
1.00 ms    0.0% 0 s                    alloc::sync::Arc$LT$T$C$A$GT$::drop_slow::h6508adcc92bef317
2.49 s   49.9%  0 s      std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb  0x6a46
2.49 s   49.9%  0 s       thread_start
2.49 s   49.9%  0 s        _pthread_start
2.49 s   49.9%  0 s         std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb
2.49 s   49.9%  0 s          core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hf94e6711939e6a58
2.49 s   49.9%  0 s           std::sys_common::backtrace::__rust_begin_short_backtrace::h198dea04d92a1a00
2.49 s   49.9%  0 s            futures_executor::thread_pool::PoolState::work::h005abcf5c588804e
2.49 s   49.9%  0 s             tantivy::indexer::segment_updater::SegmentUpdater::schedule_future::_$u7b$$u7b$closure$u7d$$u7d$::h55e6ae8e23b05acb
2.49 s   49.9%  0 s              tantivy::indexer::segment_updater::SegmentUpdater::purge_deletes::h5bb46f2332f19a49
2.49 s   49.9%  0 s               tantivy::indexer::index_writer::advance_deletes::hf08b911b4cfe7a17
2.49 s   49.8%  0 s                tantivy::core::segment_reader::SegmentReader::open::h0158d958192b51d2
2.49 s   49.8%  0 s                 tantivy::core::segment::Segment::open_read::hb4f0da0f2f029037
2.49 s   49.8%  0 s                  _$LT$tantivy..directory..managed_directory..ManagedDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h5db96201b9f49fdb
2.49 s   49.8%  0 s                   _$LT$seshat..index..encrypted_dir..EncryptedMmapDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h42ec2dbb447c8193
1.95 s   39.1%  0 s                    std::io::default_read_to_end::hab9667ee5d776f02
1.88 s   37.6%  13.00 ms                        _$LT$ctr..Ctr$LT$B$C$F$GT$$u20$as$u20$cipher..stream..StreamCipher$GT$::try_apply_keystream::h91942f4fd8495eda
1.87 s   37.4%  424.00 ms                        aes::soft::fixslice::aes256_encrypt::h15a9ee76a8b09532
753.00 ms   15.0%   753.00 ms                         aes::soft::fixslice::sub_bytes::h3ebf5d1c400c28f6
296.00 ms    5.9%   296.00 ms                         aes::soft::fixslice::mix_columns_1::hce570626dbdc1d73
186.00 ms    3.7%   186.00 ms                         aes::soft::fixslice::mix_columns_3::he81ff5b2305632c2
160.00 ms    3.2%   160.00 ms                         aes::soft::fixslice::inv_bitslice::h20d5de3d52723233
48.00 ms    0.9%    48.00 ms                          aes::soft::fixslice::bitslice::h91afa83a5e227b6f
32.00 ms    0.6%    0 s                     alloc::raw_vec::RawVec$LT$T$C$A$GT$::reserve::do_reserve_and_handle::he12456f79aea9ef9
27.00 ms    0.5%    27.00 ms                        __bzero
13.00 ms    0.2%    13.00 ms                        _platform_memmove
531.00 ms   10.6%   0 s                    seshat::index::encrypted_stream::AesReader$LT$D$C$R$GT$::new::h598e903d2be65da2
3.00 ms    0.0% 0 s                    _$LT$tantivy..directory..mmap_directory..MmapDirectory$u20$as$u20$tantivy..directory..directory..Directory$GT$::open_read::h5969c15b0ca03867
1.00 ms    0.0% 0 s                    alloc::sync::Arc$LT$T$C$A$GT$::drop_slow::h6508adcc92bef317
3.00 ms    0.0% 0 s                core::ptr::drop_in_place$LT$tantivy..core..segment_reader..SegmentReader$GT$::hba27e152b36916b9
1.00 ms    0.0% 0 s      Main Thread  0x68a4

So, TL;DR: at this point, two different threads are chewing 100% of a core each, again inside tantivy::reader::IndexReaderBuilder. Full Instruments .trace file available on request.

So: how come my EDR is spending 30s reading the encrypted tantivy index when I'm not doing searches, and when i've told the indexer to only run once every 100s?

And please can I get my CPU (and battery) back? :)

I think this has been doing this since forever, but has got increasingly worse as the index has grown.

Outcome

What did you expect?

EDR shouldn't use 100-200% CPU when idle.

What happened instead?

Seshat is going rogue and chewing huge amounts of CPU.

Operating system

macOS 14.2.1

Application version

Element Nightly version: 2024011801 Crypto version: Rust SDK 0.7.0 (746c5db), Vodozemac 0.5.0

How did you install the app?

No response

Homeserver

matrix.org

Will you send logs?

Yes

ara4n commented 8 months ago

(a related question: how come tantivy hasn't caught up with indexing the missing 2 rooms? it should be caught up at any given time, especially as previously the indexer was running every 1000ms before i upped it to every 100,000ms. Or is every single message added to the store taking 30s of AES?!)

dosubot[bot] commented 8 months ago

🤖

Hey @ara4n, good to see you again. Hope you're doing well despite this CPU usage you're experiencing. Let's get this sorted out.

Based on the information you've provided, it seems like the high CPU usage is related to the indexing process of Seshat/Tantivy. The indexing process can be CPU-intensive, especially when dealing with large amounts of data.

One potential solution to reduce CPU usage is to disable hardware acceleration. This can be done by setting the "disableHardwareAcceleration" setting to true in the global store. Here's how you can do it:

Settings["Electron.enableHardwareAcceleration"].write(false);

This code is located in the src/settings.ts file.

However, please note that this might not completely solve the high CPU usage issue as it could be due to other factors not covered in the provided context.

Regarding the utilization of Seshat/Tantivy for indexing and searching, the Element Desktop app uses Seshat, a library for indexing and searching Matrix events, to provide efficient search functionality. The app first checks if Seshat is supported and installed. If it is, the app imports the necessary classes and sets a flag indicating that event indexing is supported. When the initEventIndex command is received, the app creates a new Seshat index at a specified path. The addEventToIndex command adds a new event to the index. The searchEventIndex command searches the index for a given query. If an error occurs while initializing the index, the app attempts to recover by reindexing the database and reopening it. You can find more details in the src/seshat.ts file.

If the problem persists, it would be advisable to provide more detailed information about your system and the specific conditions under which the high CPU usage occurs. This will help in further diagnosing the issue.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.