naia-lib / naia

a cross-platform (including Wasm!) networking library built in Rust. Intended to make multiplayer game development dead-simple & lightning-fast
Apache License 2.0
857 stars 61 forks source link

Bug: MessageChannels need hard capacity limits #165

Open connorcarpenter opened 1 year ago

connorcarpenter commented 1 year ago

We're seeing crashes when the # of items in a MessageChannel exceeds some limit. Likely having to do with the wrapping message index value overflowing before it can be recycled.

Case #1: https://gist.github.com/pbalcer/02b85eaa7b26e86478fcffef36fc7636

thread 'main' panicked at 'can't encode a negative number with an Unsigned Integer!', /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-serde-0.18.0/src/integer.rs:26:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
   2: naia_serde::integer::SerdeInteger<_,_,_>::new
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-serde-0.18.0/src/integer.rs:26:13
   3: naia_shared::messages::channels::senders::indexed_message_writer::IndexedMessageWriter::write_message_index
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-shared-0.21.0/src/messages/channels/senders/indexed_message_writer.rs:85:35
   4: naia_shared::messages::channels::senders::indexed_message_writer::IndexedMessageWriter::write_message
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-shared-0.21.0/src/messages/channels/senders/indexed_message_writer.rs:101:9
   5: naia_shared::messages::channels::senders::indexed_message_writer::IndexedMessageWriter::write_messages
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-shared-0.21.0/src/messages/channels/senders/indexed_message_writer.rs:34:13
   6: <naia_shared::messages::channels::senders::reliable_sender::ReliableSender<naia_shared::messages::message_container::MessageContainer> as naia_shared::messages::channels::senders::channel_sender::MessageChannelSender>::write_messages
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-shared-0.21.0/src/messages/channels/senders/reliable_sender.rs:136:9
   7: naia_shared::messages::message_manager::MessageManager::write_messages
             at /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-shared-0.21.0/src/messages/message_manager.rs:249:17
             ...

This seems to be an issue with too many messages being put in an OrderedMessageChannel at a time. The MessageIndex u16 indicates an upper limit of a single channel would be 2^15 (32768) or less.

Case 2:

thread 'main' panicked at 'Write overflow!', /home/piotrek/.cargo/registry/src/github.com-1ecc6299db9ec823/naia-serde-0.18.0/src/bit_writer.rs:92:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
   2: <naia_serde::bit_writer::BitWriter as naia_serde::bit_writer::BitWrite>::write_bit
   3: naia_client::connection::channel_tick_buffer_sender::ChannelTickBufferSender::write_message
   4: naia_client::connection::channel_tick_buffer_sender::ChannelTickBufferSender::write_messages
   5: naia_client::connection::tick_buffer_sender::TickBufferSender::write_messages
   6: naia_client::connection::connection::Connection<E>::send_outgoing_packets
   7: naia_client::client::Client<E>::receive
   8: bevy_ecs::world::World::resource_scope

This one seems to have been caused by Client->Server ping of ~7 seconds. In which case, the redundant message sending every tick probably flooded the channel past the ~400 byte limit. However, much of that code already has assert checks the whole way while writing the packet to make sure it doesn't overflow the limit (or throws a helpful error when it does) so how this got through is curious..

dycoon commented 11 months ago

When I tried to implement a game that spawns a large number of NPCs, I got the following error.

thread 'main' panicked at 'Write overflow!', **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-serde-0.18.0\src\bit_writer.rs:92:13
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library\std\src\panicking.rs:575
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library\core\src\panicking.rs:64
   2: naia_serde::bit_writer::impl$1::write_bit
             at  **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-serde-0.18.0\src\bit_writer.rs:92
   3: naia_shared::world::host::host_world_writer::HostWorldWriter::write_updates
             at  **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-shared-0.21.0\src\world\host\host_world_writer.rs:452
   4: naia_shared::world::host::host_world_writer::HostWorldWriter::write_into_packet<bevy_ecs::entity::Entity,naia_bevy_shared::world_proxy::WorldRef>
             at **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-shared-0.21.0\src\world\host\host_world_writer.rs:48
   5: naia_server::connection::connection::Connection<bevy_ecs::entity::Entity>::send_outgoing_packets<bevy_ecs::entity::Entity,naia_bevy_shared::world_proxy::WorldRef>
             at **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-server-0.21.0\src\connection\connection.rs:179
   6: naia_server::server::Server<bevy_ecs::entity::Entity>::send_all_updates<bevy_ecs::entity::Entity,naia_bevy_shared::world_proxy::WorldRef>
             at **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-server-0.21.0\src\server.rs:343
   7: naia_bevy_server::systems::before_receive_events::closure$0
             at **home**\.cargo\registry\src\github.com-1ecc6299db9ec823\naia-bevy-server-0.21.0\src\systems.rs:178
   8: bevy_ecs::world::World::resource_scope<naia_server::server::Server<bevy_ecs::entity::Entity>,tuple$<>,naia_bevy_server::systems::before_receive_events::closure_env$0>
...

I think you should write the finish bit after the releasing bit. otherwise, we will run out of buffers. I avoid this error as below.

https://github.com/naia-lib/naia/commit/a2edc813599d23acf061022bed00f23426850821

However, when I attempted to apply the same change to all place of writer.release_bits(1), I encountered another 'Write overflow!' error. I'm not sure what the correct fix is, but I hope this information helps in resolving the issue.