pterodactyl / panel

Pterodactyl® is a free, open-source game server management panel built with PHP, React, and Go. Designed with security in mind, Pterodactyl runs all game servers in isolated Docker containers while exposing a beautiful and intuitive UI to end users.
https://pterodactyl.io

Servers don't start after wings update #3836

Closed OneHitX closed 2 years ago

OneHitX commented 2 years ago

A note from the maintainer:

READ THIS BEFORE COMMENTING ⚠️

I, @DaneEveritt, need far more information from people adding on to this issue. Please always provide your OS and Browser information, especially when it involves a browser component, there are far too many possible combinations out there for me to waste my time digging in with the wrong setup.

Additionally, please provide the output of wings diagnostics. What looks "normal" to you may not actually be normal, and it allows me to look for patterns between everyone's reports. Without any of this information I can't do anything about this bug, and it will just continue to exist.


Current Behavior

After upgrading to Wings 1.5.3 (the latest release), nodes with a large number of servers (50+) no longer start containers. There are no configuration errors; everything has been checked, and CPU, RAM, disk, and network usage are all normal.

image

Wings shows: [Pterodactyl Daemon]: Finished pulling Docker container image

Then nothing happens. The server stays in this state until Wings is restarted and a new start request is sent.

Below is the node's "glances" output, which looks fine: image

Expected Behavior

The server is installed with the paperspigot egg and Java 17. Wings should start the container and bring the server online.

Steps to Reproduce

Console --> start --> wait for the server to start

Panel Version

1.6.6

Wings Version

1.5.3

Error Logs

Wings: https://ptero.co/duvonixary
Panel: http://bin.ptdl.co/urd7g

Software-Noob commented 2 years ago

The number of servers doesn't play a part in this. I've had this happen with only one server in a dev env. Do you see the container in docker ps, and did you resolve this by restarting wings or docker?

The issue is with wings websocket not sending any data from my previous testing.

OneHitX commented 2 years ago

The number of servers doesn't play a part in this. I've had this happen with only one server in a dev env. Do you see the container in docker ps, and did you resolve this by restarting wings or docker?

The issue is with wings websocket not sending any data from my previous testing.

The container does not start, and the issue is resolved by restarting Wings. However, it happens at random across servers.

mrxbox98 commented 2 years ago

This is only happening on my arm64 nodes. Others have the same issue on amd nodes.

mrxbox98 commented 2 years ago

I found the fix. I was not using the arm64 yolks for the docker image. The arm64 images are at https://github.com/Software-Noob/pterodactyl-images/pkgs/container/arm64 Edit: This is unrelated to the bug on this thread.

OneHitX commented 2 years ago

This is only happening on my arm64 nodes.

I don't use ARM processors; the problem happens on nodes with AMD EPYC and Xeon CPUs.

Software-Noob commented 2 years ago

I found the fix. I was not using the arm64 yolks for the docker image. The arm64 images are at Software-Noob/pterodactyl-images/pkgs/container/arm64

The Yolks java, node, go and python images are multi-arch. My repo was created before that, and is not really of use anymore. This issue can indeed occur when you mix up the arch images, but this is a different issue.


It's difficult to find the cause of it without reproducible steps. I've had it happen at random a couple of times, with random images and servers. Please do let us know if you manage to find more information to narrow it down.

hannesfant commented 2 years ago

I encountered this same issue a few days ago and started digging a bit deeper. It happens on an Intel based node as well, so it's definitely not isolated to arm archs.

I've only seen it happen with Parker's d.js and d.py images (although I haven't tested any of his others). For some reason it doesn't happen with the stock paper egg.

It looks like the container is created, runs the script as usual, and then exits (as expected); however, the panel doesn't seem to register that the container was ever created.

I created a test JS file with the following code to verify that it is in fact being executed:

// Print a timestamp now and again after 5 seconds, then let the process exit.
console.log(Date.now());
setTimeout(() => {
    console.log(Date.now());
}, 5000);

If we then start the container through the panel, and continuously check the running containers we can see that the container in question actually does start, and then exits a few seconds later, as expected.
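The "continuously check the running containers" step can be scripted. Below is a minimal sketch, assuming the container list is captured with `docker ps -a --format '{{.ID}}|{{.Status}}|{{.Names}}'`. The helper names (`parseDockerPs`, `containerState`) and the sample rows are hypothetical; only the container ID comes from this report.

```javascript
// Parse the output of:
//   docker ps -a --format '{{.ID}}|{{.Status}}|{{.Names}}'
// into one object per container.
function parseDockerPs(output) {
  return output
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const [id, status, name] = line.split('|');
      return { id, status, name };
    });
}

// Classify a container by ID prefix as running, exited, missing, or other.
function containerState(output, idPrefix) {
  const match = parseDockerPs(output).find((c) => c.id.startsWith(idPrefix));
  if (!match) return 'missing';
  // Docker reports "Up ..." for running containers and "Exited (...) ..." for stopped ones.
  if (match.status.startsWith('Up')) return 'running';
  if (match.status.startsWith('Exited')) return 'exited';
  return 'other';
}

// Sample rows made up for illustration; the names and second ID are placeholders.
const sample = [
  'dd3f7dd12c15|Exited (0) 3 seconds ago|example-server',
  'aaaaaaaaaaaa|Up 2 minutes|another-server',
].join('\n');

console.log(containerState(sample, 'dd3f7dd12c15')); // exited
```

Running the real command in a loop and feeding its output to `containerState` would show the transition from running to exited while the panel still reports "starting".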

Furthermore, inspecting the logs, we can see that the script in question actually does run:

root@mars ~ # docker logs dd3f7dd12c15
v16.13.1
:/home/container$ if [[ -d .git ]] && [[ ${AUTO_UPDATE} == "1" ]]; then git pull; fi; if [[ ! -z ${NODE_PACKAGES} ]]; then /usr/local/bin/npm install ${NODE_PACKAGES}; fi; if [[ ! -z ${UNNODE_PACKAGES} ]]; then /usr/local/bin/npm uninstall ${UNNODE_PACKAGES}; fi; if [ -f /home/container/package.json ]; then /usr/local/bin/npm install; fi; /usr/local/bin/node /home/container/${BOT_JS_FILE}
1641591830820
1641591835832

Nevertheless, as far as the panel is concerned, the server is still stuck in the starting stage. It hasn't even registered that the container no longer exists (which should mark the server as crashed or update its state to offline), and nothing has been reported in the console.

However, let's try changing the aforementioned script to the following, to prevent it from exiting on its own:

// Print a timestamp now and then every 5 seconds, so the process never exits on its own.
console.log(Date.now());
setInterval(() => {
    console.log(Date.now());
}, 5000);

If we now start the server again and check the running containers, we'll see that the container is indeed running as expected, even minutes later. However, the panel console still doesn't display any of our timestamp logs. Reloading the page helps only temporarily: it loads all the log entries so far, but no new entries appear until we reload the page again.

In addition to all of this, the server doesn't get marked as "online" even if we print one of the egg keywords that should tell the daemon that startup is complete (e.g. "Started") to the console.
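One way to picture why the keyword has no effect: if the "done" check only runs on console lines that actually reach the daemon's listener, a stream that delivers nothing can never flip the state. The sketch below is a hypothetical illustration, not Wings' actual code; the `ServerState` class and its state names are assumptions.

```javascript
// Hypothetical state tracker, NOT Wings' implementation.
class ServerState {
  constructor(doneMarkers) {
    this.doneMarkers = doneMarkers; // e.g. ['Started']
    this.state = 'starting';
  }

  // Feed one console line; switch to "running" on the first line containing a marker.
  onConsoleLine(line) {
    if (this.state === 'starting' && this.doneMarkers.some((m) => line.includes(m))) {
      this.state = 'running';
    }
    return this.state;
  }
}

const server = new ServerState(['Started']);
server.onConsoleLine('1641591830820'); // no marker, still "starting"
server.onConsoleLine('Started');       // marker seen, now "running"
console.log(server.state); // running
```

If `onConsoleLine` is never called because no output is delivered, `state` stays at "starting" indefinitely, which matches the behavior described here.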

Also, restarting Wings does seem to fix it temporarily. I'm not sure whether this triggers some kind of panel refresh.

I'm not sure if this helped at all, or if it was already well known, but my best guess is that the panel might be causing the issues, especially considering that simply reloading the page fetches the newest logs. I don't know much about the overall architecture of Pterodactyl though, so I could be completely wrong :)

mrxbox98 commented 2 years ago

Replying to https://github.com/pterodactyl/panel/issues/3836#issuecomment-1007772242

Does your panel or wings report any errors in error logs?

hannesfant commented 2 years ago

Replying to https://github.com/pterodactyl/panel/issues/3836#issuecomment-1007820692

Neither of them report anything out of the ordinary

hannesfant commented 2 years ago

Also, to add to this: it had been working fine for weeks on another node, but I just tried starting a server there and it didn't start as expected. After it restarted due to the "crash", it filled the console with newlines, and now that node is suffering from the same issue described earlier in this thread. No idea if that helps, but I figured I'd mention it in case it helps narrow things down.

JackCrispy commented 2 years ago

Same issues, running panel v: 1.6.5, wings v: 1.5.3. Docker image is ghcr.io/pterodactyl/yolks:java_8 for me, just trying to run spigot.

It also worked fine for me up until the latest update. A lot of other users are facing the same issue in the support server, and a couple are creating issues on GitHub. image

mrxbox98 commented 2 years ago

Replying to https://github.com/pterodactyl/panel/issues/3836#issuecomment-1007923111

I also had the same error, but I was able to capture some error messages while recording it: image

This happened on a non-official docker image for java 17 openj9. I have not tested it with the official java 17 image.

Software-Noob commented 2 years ago

This happened on a non-official docker image for java 17 openj9. I have not tested it with the official java 17 image.

This is unrelated to this issue. Your exec error is caused by the image not supporting ARM64 (will support it once they fix it upstream).

DaneEveritt commented 2 years ago

I need far more information from people adding on to this issue. Please always provide your OS and Browser information, especially when it involves a browser component, there are far too many possible combinations out there for me to waste my time digging in with the wrong setup.

Additionally, please provide the output of wings diagnostics. What looks "normal" to you may not actually be normal, and it allows me to look for patterns between everyone's reports. Without any of this information I can't do anything about this bug, and it will just continue to exist.

CPU4 commented 2 years ago

Having the same issue with my panel right now. Panel/Wings is running on Ubuntu Server 20.04 and the browser I am using to start the server is Google Chrome. Here is my wings diagnostics: https://ptero.co/zowocuvoly

Jerlag01 commented 2 years ago

Replying to https://github.com/pterodactyl/panel/issues/3836#issuecomment-1007772242

I'm using Intel CPUs myself and have never had this issue, not even now.

Jerlag01 commented 2 years ago

Other updates like what? If they came from Spigot itself, then it's third party. Did you compare the Docker image being pulled now against the previous one?

CPU4 commented 2 years ago

I created a completely new node to see if that would resolve the issue, since some people in this thread seem to have it only on certain nodes, but the issue persists for me. Panel/Wings is running on Ubuntu Server 20.04 and the browser I am using to start the server is Google Chrome. As you can see, it's the same problem other people in this thread are having. New wings diagnostics: https://ptero.co/amuzawosac image

DaneEveritt commented 2 years ago

Going to consider this related to the other bugs around console processing issues we were seeing. For now it is easier for me to close this out, mostly because I cannot for the life of me manage to reproduce this even with the specific examples provided above.

Once 1.5.7 is released let's come back around and open a new issue if this persists.

edit: I was actually able to reproduce this with an arbitrary delay in the code between starting and attaching, which we've reversed in the new code so that specific pathway is impossible now. I think this is therefore resolved.
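The ordering problem described in that edit can be illustrated with a toy model. This is a minimal sketch, assuming a container whose output and exit fire immediately on start; `FakeContainer` and `run` are hypothetical names and have nothing to do with the actual Wings code, which is written in Go.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a container that prints one line and exits as soon
// as it is started. Events are emitted synchronously, so any listener
// registered after start() has already missed them.
class FakeContainer extends EventEmitter {
  start() {
    this.emit('output', 'hello from the container');
    this.emit('exit', 0);
  }
}

// Record what a listener observes for the given ordering.
function run(attachFirst) {
  const container = new FakeContainer();
  const seen = { lines: [], exited: false };
  const attach = () => {
    container.on('output', (line) => seen.lines.push(line));
    container.on('exit', () => { seen.exited = true; });
  };

  if (attachFirst) {
    attach();
    container.start();
  } else {
    container.start();
    attach(); // too late: output and exit have already fired
  }
  return seen;
}

const startThenAttach = run(false);
const attachThenStart = run(true);
console.log(startThenAttach.lines.length, startThenAttach.exited); // 0 false
console.log(attachThenStart.lines.length, attachThenStart.exited); // 1 true
```

In the start-then-attach ordering the listener sees neither the output nor the exit, which resembles a server stuck in "starting" with an empty console. Attaching before starting removes that window.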