ValveSoftware / GameNetworkingSockets

Reliable & unreliable messages over UDP. Robust message fragmentation & reassembly. P2P networking / NAT traversal. Encryption.
BSD 3-Clause "New" or "Revised" License
8.3k stars 621 forks source link

Q: Performance warning #307

Open angelskieglazki opened 10 months ago

angelskieglazki commented 10 months ago

Hello! Are these messages normal? It seems to me that I am doing something wrong.

...
  0.000000 Waited 2.2ms for SteamNetworkingSockets lock
  0.000000 SteamNetworkingSockets lock held for 6.4ms.  (Performance warning.)  ServiceThread,CSteamNetworkConnectionBase::Think(x2),SNP_SendPacket(x2),SendUDPacket(x2)
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 6.8ms.  (Performance warning.)  ServiceThread,RecvUDPPacket,CSteamNetworkConnectionBase::Think,SNP_SendPacket,SendUDPacket
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 Waited 3.7ms for SteamNetworkingSockets lock
  0.000000 Waited 2.4ms for SteamNetworkingSockets lock
  ...
  0.000000 SteamNetworkingSockets lock held for 5.7ms.  (Performance warning.)  ServiceThread,CSteamNetworkConnectionBase::Think(x9),SNP_SendPacket(x18),SendUDPacket(x18)
This is usually a symptom of a general performance problem such as thread starvation
  0.000000 SteamNetworkingSockets lock held for 8.8ms.  (Performance warning.)  ServiceThread
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 9.1ms.  (Performance warning.)  SendMessages
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 32.5ms.  (Performance warning.)  ServiceThread,CSteamNetworkConnectionBase::Think(x8),SNP_SendPacket(x8),SendUDPacket(x8)
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 32.9ms.  (Performance warning.)  SendMessages
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 20.7ms.  (Performance warning.)  ServiceThread
This is usually a symptom of a general performance problem such as thread starvation.
  0.000000 SteamNetworkingSockets lock held for 20.8ms.  (Performance warning.)  SendMessages
This is usually a symptom of a general performance problem such as thread starvation
...

I’ll describe a little how my game server works, there are two threads from which the GNS api is called.

1) Server network thread.

void server_network_thread() {
  while (Is_running_) {
    auto till = clock->nowMilliseconds() + THREAD_SLEEP_DELAY;
    processIncomingMessages();
    pollConnectionStateChanges();
    sendMessages(0);
    std::this_thread::sleep_until(till);
  }
}
 processIncomingMessages(); - contains reading messages one by one using the function ReceiveMessagesOnPollGroup(...);
 pollConnectionStateChanges(); - calling SteamNetworkingSockets()->RunCallbacks();
 sendMessages(0); - generates messages from the send queue and sends them through the function SteamNetworkingSockets()->SendMessages(...);
 THREAD_SLEEP_DELAY = 1ms or 5ms, does not affect the final result

so this thread is running and then we will get a message about creating a game room:

2) Room thread - a little more complicated, typical game loop:

class RoomTicker {
public:
  RoomTicker() = default;
  ~RoomTicker() = default;
  void inc() { ++currentTick_; }
  int64_t currentTick() const { return currentTick_;}
  int64_t currentTickTime() const { return (int64_t)((double)(currentTick_ * 1000) / TICKS_PER_SECOND); }
  static constexpr float deltaTime() { return 1000. / TICKS_PER_SECOND; }
  static constexpr double TICKS_PER_SECOND = 40.;
private:
  int64_t currentTick_ = 0;
};

void gameroom_thread() {
  room_start_time = clock->nowMilliseconds();

  while (Is_running_) {
    int64_t room_time = clock->nowMilliseconds() - room_start_time;
    readAllMessages();
    processAllMessages();

    int passed_ticks = 0;
    for (; ticker.currentTickTime() <= room_time; ticker.inc()) {
      ++passed_ticks;
      {
        // SOME GAME LOGIC
        // SOME GAME LOGIC
        // SOME GAME LOGIC
      }
      sendMessages(ticker_.currentTick());
    }

    if (passed_ticks <= 1) {
      const auto till = std::chrono::time_point<std::chrono::system_clock>(
                std::chrono::milliseconds(room_start_time + ticker.currentTickTime()));
      std::this_thread::sleep_until(till);
    }
  }
}
readAllMessages(); - here I read as many messages from the socket as I can. I set the constant to 200 messages. I read the messages, put them in a recvQueue, and so on until SteamNetworkingSockets()->ReceiveMessagesOnPollGroup(roomPollGroup, messages_array, 200); will not return 0.
processAllMessages - process all commands and messages that were received
// SOME GAME LOGIC - game logic. We create messages and fill the sending queue.
sendMessages(ticker_.currentTick()); -  send all messages from sendQueue via  SteamNetworkingSockets()->SendMessages(...);

Сlients send reliable messages with positions about 50 times per second. Starting from 30+ players, lags begin and warnings begin to spam. I started thinking and this is what I came up with: if 30 players are connected and they send 50 messages per second, then 1500 times per second I call GNS-api. Since I am not calling it from a network thread, GNS-mutexes will be locked there, which is what I see (0.000000 Waited 2.4ms for SteamNetworkingSockets lock ... and so on). Perhaps I should send and receive messages in only one network thread, then the thread-contention will be minimal and the wait on the GNS mutexes will also be minimal. And the room threads will communicate with the network thread through queues, which I will somehow exchange under my mutex.

In general, I want to consult with you :)

regards, max

Arr-n-D commented 3 months ago

Hey, just my opinion but your second approach would be better suited imo.

I'm doing something similar in Go with the bindings

I create one network thread whose only job is to send and receive data and keep track of connections, and the game loop thread, and I communicate between the two of them through channels @angelskieglazki. I think it ends up being more performant that way

angelskieglazki commented 2 months ago

Hey, just my opinion but your second approach would be better suited imo.

I'm doing something similar in Go with the bindings

I create one network thread whose only job is to send and receive data and keep track of connections, and the game loop thread, and I communicate between the two of them through channels @angelskieglazki. I think it ends up being more performant that way

We fix some errors in go-wrapper for GNS.. go-wrapper work fine. https://github.com/angelskieglazki/gns

My research in C++ continues. :)

Arr-n-D commented 2 months ago

We fix some errors in go-wrapper for GNS.. go-wrapper work fine. https://github.com/angelskieglazki/gns

I've forked your version and using my own gns bindings now, as I have more control over every aspect

zpostfacto commented 2 months ago

If you are accessing the same connection from many different threads at a time, you will probably get a lot of contention. In general, sending and receiving messages does not take the global lock, but it does take the lock of whatever connection or poll group you are accessing.

You might also try sending many messages to the API at once, using SendMessages.

It looks to me that you are not necessarily getting contention, it's just that the service thread is taking a while to service all of the connections for some reason. If you have any tools for profiling, I would be interested to know what the cause is. If you only have a few dozen connections, it should be able to service them efficiently.

You might double check that you are not compiling the library in debug, or doing something that might cause SNP_PARANOIA to be active. This adds quite a bit of CPU time.

angelskieglazki commented 2 months ago

I've forked your version and using my own gns bindings now, as I have more control over every aspect

can I take a look at your wrapper?