matrix-org / matrix-rust-sdk

Matrix Client-Server SDK for Rust
Apache License 2.0

Session wasn't created nor shared #3440

Open douglaz opened 4 months ago

douglaz commented 4 months ago

When testing against both local and remote Synapse instances under high load (many clients running in different tasks, with each client shared between two different tasks), I get this panic:

thread 'tokio-runtime-worker' panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-crypto-0.7.1/src/session_manager/group_sessions.rs:209:54:
Session wasn't created nor shared
stack backtrace:
   0: rust_begin_unwind
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:72:14
   2: core::panicking::panic_display
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:196:5
   3: core::panicking::panic_str
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:171:5
   4: core::option::expect_failed
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/option.rs:1988:5
   5: core::option::Option<T>::expect
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/option.rs:894:21
   6: {async_fn#0}
             at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-crypto-0.7.1/src/session_manager/group_sessions.rs:209:13
   7: {async_fn#0}
             at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-crypto-0.7.1/src/machine.rs:897:80
   8: {async_block#0}
             at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-0.7.1/src/room/futures.rs:187:26
   9: poll<matrix_sdk::room::futures::{impl#3}::into_future::{async_block_env#0}>
             at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
  10: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<ruma_client_api::message::send_message_event::v3::Response, matrix_sdk::error::Error>> + core::marker::Send), alloc::alloc::Global>>
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
  11: {async_block#0}
             at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-0.7.1/src/room/futures.rs:95:78
  12: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<ruma_client_api::message::send_message_event::v3::Response, matrix_sdk::error::Error>> + core::marker::Send), alloc::alloc::Global>>
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9

I don't have a self-contained project right now. I may try to create one if this turns out to be difficult to reproduce/debug.
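For readers following the backtrace: frames 4 to 6 show `Option::expect` failing inside `group_sessions.rs`, which means a value that was expected to be present was `None`. The following is a minimal sketch of that mechanism only, not the SDK's code. `ToySession`, `session_for`, and `panic_message` are hypothetical names, and the map lookup is an assumption made for illustration.

```rust
use std::collections::HashMap;
use std::panic;

// Hypothetical stand-in for an outbound group session; not the SDK's type.
struct ToySession {
    shared: bool,
}

// Looks up the session for a room. If there is none, `expect` panics with
// the same message as in the report above.
fn session_for<'a>(sessions: &'a HashMap<String, ToySession>, room_id: &str) -> &'a ToySession {
    sessions
        .get(room_id)
        .expect("Session wasn't created nor shared")
}

// Runs the lookup and returns the panic message, or `None` if it succeeded.
fn panic_message(sessions: &HashMap<String, ToySession>, room_id: &str) -> Option<String> {
    let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
        session_for(sessions, room_id).shared
    }));
    match result {
        Ok(_) => None,
        // The payload may be a `String` or a `&str` depending on how the
        // panic was raised, so both cases are handled.
        Err(payload) => payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned()),
    }
}

fn main() {
    let mut sessions = HashMap::new();
    sessions.insert("!known:example.org".to_owned(), ToySession { shared: true });

    // A room with a session does not panic.
    assert_eq!(panic_message(&sessions, "!known:example.org"), None);

    // A room without a session hits the `expect` and panics.
    let msg = panic_message(&sessions, "!missing:example.org");
    println!("{msg:?}");
}
```

The sketch says nothing about why the value is missing in the real SDK; it only shows that the message comes from an `expect` on an empty `Option`.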

Hywan commented 4 months ago

I suspect this is a problem with the crypto store. We need to investigate. How many clients do you have? What's your use case? How often does it happen?

douglaz commented 4 months ago

How many clients do you have?

10+ clients in the same Rust process

What's your use case?

We're developing a load test tool for Matrix servers so we can set them up properly for a given expected number of simultaneous users.

How often does it happen?

It always happens in our tests, but it may take a few minutes depending on the server size.
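The load pattern described in this thread (several clients in one process, each shared between two tasks) can be pictured roughly as follows. This is a sketch under loud assumptions: it uses plain `std::thread` workers instead of tokio tasks, and `FakeClient` and `run_load` are hypothetical names that only count sends. It does not talk to a server or use the SDK.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a logged-in SDK client; it only counts sends.
struct FakeClient {
    sent: AtomicUsize,
}

impl FakeClient {
    fn send(&self) {
        self.sent.fetch_add(1, Ordering::Relaxed);
    }
}

// Creates `clients` fake clients, shares each one between two workers, and
// returns the total number of sends once every worker has finished.
fn run_load(clients: usize, sends_per_worker: usize) -> usize {
    let mut handles = Vec::new();
    let mut all = Vec::new();

    for _ in 0..clients {
        let client = Arc::new(FakeClient { sent: AtomicUsize::new(0) });
        all.push(Arc::clone(&client));

        // Two workers per client, both using the same shared instance.
        for _ in 0..2 {
            let shared = Arc::clone(&client);
            handles.push(thread::spawn(move || {
                for _ in 0..sends_per_worker {
                    shared.send();
                }
            }));
        }
    }

    for handle in handles {
        handle.join().expect("worker panicked");
    }

    all.iter().map(|c| c.sent.load(Ordering::Relaxed)).sum()
}

fn main() {
    let total = run_load(10, 100);
    println!("total sends: {total}");
}
```

The point of the shape is that two workers call into the same client concurrently, which is the condition under which the panic was reported.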

Hywan commented 4 months ago

Question: How hard would it be for you to write an integration test inside the Matrix Rust SDK, so that it helps us to reproduce the problem?

poljar commented 4 months ago

I don't think that this is a bug. That panic happens if you attempt to encrypt and send a message to a room without having exchanged room keys first. Please take a look at the, sadly unfinished, tutorial for the crypto crate: https://github.com/matrix-org/matrix-rust-sdk/pull/2352/files#diff-c90ec6ed5dae76e5da6ae0abcf6f48a10effbb532b3b093e04ea69fcd03e61c8R852-R906

If you want to develop a load test tool, I would suggest using the matrix-sdk crate instead of the matrix-sdk-crypto crate.

The matrix-sdk crate is a higher-level crate that lets you send encrypted messages without having to care about room keys or how to send them to other devices.
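The ordering requirement mentioned at the start of this comment (room keys must be shared before a message is encrypted) can be modelled in a few lines. This is a toy model, not the matrix-sdk-crypto API: `ToyMachine`, `share_room_key`, `encrypt`, and `ToyError` are hypothetical, and the sketch returns an error where the real code panics, purely to make the precondition visible.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum ToyError {
    RoomKeyNotShared,
}

// Hypothetical model that only tracks which rooms had their key shared.
#[derive(Default)]
struct ToyMachine {
    rooms_with_shared_key: HashSet<String>,
}

impl ToyMachine {
    // Marks the room key as shared with the other devices in the room.
    fn share_room_key(&mut self, room_id: &str) {
        self.rooms_with_shared_key.insert(room_id.to_owned());
    }

    // Refuses to encrypt unless the room key was shared beforehand.
    fn encrypt(&self, room_id: &str, plaintext: &str) -> Result<String, ToyError> {
        if !self.rooms_with_shared_key.contains(room_id) {
            return Err(ToyError::RoomKeyNotShared);
        }
        // Not real encryption; reversing the text only marks it as processed.
        Ok(plaintext.chars().rev().collect())
    }
}

fn main() {
    let mut machine = ToyMachine::default();
    let room = "!room:example.org";

    // Encrypting before sharing the key violates the precondition.
    assert_eq!(machine.encrypt(room, "hello"), Err(ToyError::RoomKeyNotShared));

    // After sharing, encryption goes through.
    machine.share_room_key(room);
    println!("{:?}", machine.encrypt(room, "hello"));
}
```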

douglaz commented 4 months ago

Question: How hard would it be for you to write an integration test inside the Matrix Rust SDK, so that it helps us to reproduce the problem?

I'll try that as soon as I get some spare time.

douglaz commented 4 months ago

If you want to develop a load test tool, I would suggest using the matrix-sdk crate instead of the matrix-sdk-crypto crate.

The matrix-sdk crate is a higher level crate which will allow you to send encrypted messages without having to care about room keys and how to send them to other devices.

I am using matrix-sdk. Look at the beginning of the stack trace (/home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/matrix-sdk-0.7.1/src/room/futures.rs:95:78).

poljar commented 4 months ago

Oh indeed, sorry about the confusion.

Care to share what your Client setup looks like, what type of storage you are using, and what your EncryptionSettings look like?