UCL-VR / ubiq


Max user capacity #9

Closed · Glokta0 closed this 3 years ago

Glokta0 commented 3 years ago

Hi there! After reading through (almost) the entire docs, I have to say, I really like the idea and the documentation for this! It's a great project!

I'm looking for a platform dynamic enough to host a number of users and a number of interactable shared objects. Are there any known limitations, or estimates of how many VR users we could hook together?

Thanks for creating and sharing this! Best, Florian

sebjf commented 3 years ago

Hi @Glokta0, thank you for your feedback! And I apologise for the very late reply (I'll need to check our notification settings)

Despite the late reply here, Ubiq is under active development, and taking initial steps towards scalability is our next milestone. Milestone one was getting something working into the hands of users, so there is some technical debt to be resolved and some features to be added around this. I've put some thoughts below on how different implementation details may affect scalability:

Audio

The voice chat is P2P, so each user has one connection to every other room member. Each connection must have its own encoder and transmission resources, and clients must additionally mix the remote streams. A comparable implementation is Mozilla Hubs, which uses the same technology (WebRTC) and supports about 15-20 concurrent users on desktops, after which the audio is the first system to give out. I would expect Ubiq to be similar.
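To see why this limit arrives so quickly, here is the connection arithmetic for a full mesh (the function name is just for illustration):

```typescript
// Full-mesh P2P audio: each of the n peers holds n - 1 connections (each
// with its own encoder), and the room as a whole holds n(n-1)/2. Both the
// per-client load and the total grow quickly with room size.
function meshConnections(n: number): { perClient: number; total: number } {
  return { perClient: n - 1, total: (n * (n - 1)) / 2 };
}

for (const n of [5, 10, 20, 50]) {
  const { perClient, total } = meshConnections(n);
  console.log(`${n} peers: ${perClient} per client, ${total} total`);
}
```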

Since this limit is due to PC resources, there isn't much that can be done to increase the number of local voice channels (perhaps sharing encoded audio packets, but not much more). Rather, systems would need to use a different architecture for audio, such as an MCU or SFU. This isn't a purely client-side fix, but it is possible because the audio architecture is decoupled from regular messaging. Alternatively, clients could implement a focus/nimbus-like system so that only N peers have audio connections.
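As a minimal sketch of that focus/nimbus idea (assuming peer positions are available; the types and names here are hypothetical, not Ubiq API), a client could keep audio connections only with its nearest N peers:

```typescript
interface Peer { id: string; x: number; y: number; z: number; }

// Hypothetical focus/nimbus selection: keep live audio connections only
// with the maxAudioPeers nearest peers; more distant peers simply have no
// audio connection at all.
function audioFocusSet(self: Peer, others: Peer[], maxAudioPeers: number): string[] {
  const distSq = (p: Peer) =>
    (p.x - self.x) ** 2 + (p.y - self.y) ** 2 + (p.z - self.z) ** 2;
  return [...others]
    .sort((a, b) => distSq(a) - distSq(b))
    .slice(0, maxAudioPeers)
    .map(p => p.id);
}
```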

Avatars

Right now, avatars are the most expensive components in terms of bandwidth, but are still relatively cheap (50-100 KB/s/peer). I would expect the first issues to actually be with draw calls, as each avatar is a separate object. A medium-scale system (50-100 peers with complex scenery) would probably need some sort of avatar super-system that draws batched avatar models. How much each draw call costs is hard to say, as it depends on the environment, but personally I try to keep the count in the very low hundreds on a Quest.
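A back-of-envelope budget shows why batching matters (the per-avatar part count and scenery cost below are assumptions for illustration, not measured figures):

```typescript
// Rough draw-call budget for unbatched avatars. If each avatar renders as
// a few separate meshes, the avatars alone can exhaust a low-hundreds
// Quest budget well before the scenery is counted.
const avatars = 50;
const partsPerAvatar = 3;     // assumed: head + two hands
const sceneryDrawCalls = 100; // assumed environment cost
console.log(`~${avatars * partsPerAvatar + sceneryDrawCalls} draw calls`); // ~250
```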

At some point the bandwidth will become non-trivial, but there is more scope to minimise it than for audio. At the moment the avatars just send as fast as they can, as they are usually the only objects in the scene. The actual packet sizes are small (~100 bytes), since avatars send only the head/hands poses and perform IK locally. A large number (100) could be supported at 60 Hz without broaching 1 MB/s (100 peers × 100 bytes × 60 Hz ≈ 600 KB/s). The rate could also be reduced considerably using simple predictors, as sketched below.
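For example, a dead-reckoning style predictor (a sketch under assumed names, not Ubiq's current behaviour) would transmit a pose only when it drifts from what the receiver would extrapolate anyway:

```typescript
interface Pose { x: number; y: number; z: number; }

// Hypothetical dead-reckoning sender. Both ends extrapolate the last
// transmitted pose with its velocity; the sender only transmits again when
// the true pose drifts more than `threshold` metres from that prediction.
class PredictiveSender {
  private lastSent: Pose = { x: 0, y: 0, z: 0 };
  private velocity: Pose = { x: 0, y: 0, z: 0 };
  private lastTime = 0;

  constructor(
    private threshold: number,
    private send: (pose: Pose) => void,
  ) {}

  update(pose: Pose, time: number) {
    const dt = time - this.lastTime;
    const predicted = {
      x: this.lastSent.x + this.velocity.x * dt,
      y: this.lastSent.y + this.velocity.y * dt,
      z: this.lastSent.z + this.velocity.z * dt,
    };
    const error = Math.hypot(
      pose.x - predicted.x, pose.y - predicted.y, pose.z - predicted.z);
    if (error > this.threshold) {
      if (dt > 0) {
        // Estimate velocity from the movement since the last transmission.
        this.velocity = {
          x: (pose.x - this.lastSent.x) / dt,
          y: (pose.y - this.lastSent.y) / dt,
          z: (pose.z - this.lastSent.z) / dt,
        };
      }
      this.lastSent = { ...pose };
      this.lastTime = time;
      this.send(pose); // receivers run the same extrapolation between packets
    }
  }
}
```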

Server

Ubiq is inherently P2P, but practically most applications will use the server for NAT traversal reasons. The server acts as a relay, so it could be a bottleneck for large groups. Running tests with large numbers of bots is a high priority for our current milestone, but I don't have numbers to give you yet. We expect the potential user count to be quite high (hundreds, at least), as Node.js underlies a lot of big services.

One nice thing about the server arrangement to bear in mind is that in Ubiq the network does the fanout, which for the typical arrangement means the server does the fanout. This has implications for the number of supported peers, because it means the outgoing bandwidth of many components (excepting, e.g., audio) doesn't scale with the number of peers at all: each client uploads one copy of each message regardless of room size.
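In sketch form (illustrative types and names, not Ubiq's actual server API), the relay's job looks like this:

```typescript
// Illustrative relay fanout: a client sends one copy of each message
// upstream, and the server forwards it to every other room member. Client
// upload stays flat as the room grows, while the server's outgoing
// bandwidth scales with the peer count.
type Send = (message: Buffer) => void;

class Room {
  private members = new Map<string, Send>();

  join(id: string, send: Send) { this.members.set(id, send); }
  leave(id: string) { this.members.delete(id); }

  // One inbound message becomes (n - 1) outbound messages at the server.
  relay(fromId: string, message: Buffer) {
    for (const [id, send] of this.members) {
      if (id !== fromId) send(message);
    }
  }
}
```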

Routing

The biggest potential bottleneck I'd predict right now is message ID matching, both at the server (where it's done in JS rather than native code) and at the client (which has a more complex routing function based on a dictionary and for loops). This is the biggest piece of technical debt, and again it mainly needs to be stress tested before I can propose any numbers.
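To show why this sits on the hot path, here is an illustrative dictionary-based id-matching step (the real Ubiq routing is more involved; the header layout below is assumed):

```typescript
// Each message carries an object id in its header; the router looks the id
// up in a dictionary to find the registered processors. Every message pays
// this lookup cost, which is why it needs stress testing.
type Processor = (payload: Buffer) => void;

class Router {
  private processors = new Map<number, Processor[]>();

  register(objectId: number, processor: Processor) {
    const list = this.processors.get(objectId) ?? [];
    list.push(processor);
    this.processors.set(objectId, list);
  }

  route(message: Buffer) {
    const objectId = message.readUInt32LE(0); // hypothetical header layout
    for (const processor of this.processors.get(objectId) ?? []) {
      processor(message.subarray(4));
    }
  }
}
```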

Other Thoughts

Objects

The bandwidth of individual objects is up to those objects' developers. Something to bear in mind is that JSON encoding helpers are provided, but you can use your own serialisation (or blitting, if possible).
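A quick comparison of the two approaches for a small object update (the fields are hypothetical):

```typescript
// JSON is convenient and debuggable, but every field name and digit costs
// bytes; packing the same state into a fixed binary layout cuts the size
// to a fraction.
const state = { x: 1.5, y: 0.25, z: -3.0, colour: 0xff8800 };

// JSON: human-readable text.
const asJson = Buffer.from(JSON.stringify(state));

// Binary "blit": three 32-bit floats plus a 32-bit colour = 16 bytes.
const asBinary = Buffer.alloc(16);
asBinary.writeFloatLE(state.x, 0);
asBinary.writeFloatLE(state.y, 4);
asBinary.writeFloatLE(state.z, 8);
asBinary.writeUInt32LE(state.colour, 12);

console.log(`JSON: ${asJson.length} bytes, binary: ${asBinary.length} bytes`);
```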

Rooms

The rooms system is our biggest focus right now, but its implementation shouldn't really impact scalability, as its bandwidth and processing overhead is all to do with its properties dictionary, which is very bursty. Rather, the improved rooms system will be about partitioning, so that larger spaces can be shared without a single client having to deal with too many peers, even if there are many more in the wider space.
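One common way to do this kind of partitioning (a sketch of the general technique, not Ubiq's planned design) is to bucket peers into grid cells by position and only exchange state with the current and neighbouring cells:

```typescript
// Peers in the same or adjacent cells sync with each other; everyone else
// is ignored, bounding per-client peer counts independently of the total.
function cellKey(x: number, z: number, cellSize: number): string {
  return `${Math.floor(x / cellSize)},${Math.floor(z / cellSize)}`;
}

function cellsOfInterest(x: number, z: number, cellSize: number): string[] {
  const cx = Math.floor(x / cellSize);
  const cz = Math.floor(z / cellSize);
  const keys: string[] = [];
  for (let dx = -1; dx <= 1; dx++) {
    for (let dz = -1; dz <= 1; dz++) {
      keys.push(`${cx + dx},${cz + dz}`);
    }
  }
  return keys; // a peer only syncs with members of these nine cells
}
```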

TLDR

To answer your question exactly: I'd guess a maximum of 20 peers in a room, limited by the P2P audio, with perhaps twice that number again of shared objects (if you use a simple model such as the example Firework), limited by the routing and the actual rendering performance for those objects.

I'll post back with the results of the stress tests as we run them, and any further thoughts as they occur!

Kind regards,

Sebastian