EpicGamesExt / PixelStreamingInfrastructure

The official Pixel Streaming servers and frontend.
MIT License

[BUG] - Log directory first time creation can crash server instance. #322

Open ariffammobox opened 1 week ago

ariffammobox commented 1 week ago

UE Version: UE 5.3

Frontend Version: UE5.3

Problem component: Signalling Server

Description: We have two signalling servers started as systemd services on an Ubuntu machine, with their start delays staggered by 4 seconds. One of the servers crashes with an error from logging.js when it attempts to create the logs directory. Restarting the failed server results in normal behavior.

node:internal/fs/utils:350
    throw err;
    ^

Error: EEXIST: file already exists, mkdir './logs/'
    at Object.mkdirSync (node:fs:1398:3)
    at Object.RegisterFileLogger (/home/ubuntu/Project/Linux/ProjectSamples/PixelStreaming/WebServers/SignallingWebServer/modules/logging.js:67:6)
    at Object.<anonymous> (/home/ubuntu/Project/Linux/Project/Samples/PixelStreaming/WebServers/SignallingWebServer/cirrus.js:46:10)
    at Module._compile (node:internal/modules/cjs/loader:1256:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
    at Module.load (node:internal/modules/cjs/loader:1119:32)
    at Module._load (node:internal/modules/cjs/loader:960:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
    at node:internal/main/run_main_module:23:47 {
  errno: -17,
  syscall: 'mkdir',
  code: 'EEXIST',
  path: './logs/'
}

Steps to Reproduce:

  1. Start two signalling servers, with or without delays, on machine startup. On the first boot (when the logs folder has not yet been created), one signalling server always crashes.

Expected behavior

  1. Multiple signalling servers should be able to run on a single machine the first time they launch.


mcottontensor commented 5 days ago

I've been trying to repro this locally but haven't been able to, and I'm a little curious about how this is happening. You say the first instance of the SS crashes? Or do you mean that when there is no logs directory, starting the pair of SSs results in a crash (does one crash, or do both)? What happens if you start only one instance through systemd? Does it crash if you run it manually? The code should try to create the directory if it doesn't exist, and since it's complaining that the directory already exists, I'm a little confused. I could just catch the error, but I wonder if it's actually indicative of a larger issue.
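For reference, here is a minimal sketch of what a creation step that tolerates an already-existing directory could look like. It assumes the current code checks for the directory and then calls `mkdirSync`, which I have not confirmed against logging.js; if so, two processes starting close together could both pass the check before either creates the directory. `ensureLogDir` and the demo path are hypothetical names for illustration only.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper, not the actual logging.js implementation.
// Creates logDir if it is missing and tolerates it already existing,
// including when another process creates it at the same moment.
function ensureLogDir(logDir) {
    try {
        // With recursive: true, mkdirSync does not throw when the
        // directory already exists.
        fs.mkdirSync(logDir, { recursive: true });
    } catch (err) {
        // Only swallow EEXIST when the existing path really is a directory;
        // anything else (permissions, a regular file in the way) is rethrown.
        if (err.code !== 'EEXIST' || !fs.statSync(logDir).isDirectory()) {
            throw err;
        }
    }
    return logDir;
}

// Demo: the second call stands in for a second server starting after the
// first one has already created the directory.
const demoDir = path.join(os.tmpdir(), `ss-logs-demo-${process.pid}`);
ensureLogDir(demoDir);
ensureLogDir(demoDir); // does not throw
fs.rmdirSync(demoDir); // clean up the empty demo directory
```

This only makes the directory creation idempotent. It would not explain why the two instances reach that point at the same time, so it may hide rather than resolve a larger issue.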

ariffammobox commented 4 days ago

When starting one instance through systemd manually, it doesn't crash.

I believe this happens because the services are started on boot for the first time on a fresh EC2 instance.

We have two services, Server 1 and Server 2. Server 1 starts after a 4-second delay, while Server 2 starts after 8 seconds. After some time, it seems that only Server 1 crashes.

[Unit]
Description=Server 1 Service
After=network.target

[Service]
Type=simple
ExecStartPre=/bin/sleep 4
ExecStart=/platform_scripts/bash/Start_WithTURN_SignallingServer.sh --configFile server-1.json
User=ubuntu
Group=ubuntu
Restart=on-failure
StandardOutput=append:/service-logs/server-1.log
StandardError=append:/service-logs/server-1-error.log

[Install]
WantedBy=multi-user.target

Simply restarting the service via systemctl restart server-1 starts it normally, as the logs folder has already been created. We're confused as well, since this only happens the very first time a fresh EC2 instance is launched.