Closed rehashedsalt closed 4 months ago
Subservers blowing up if the pipe doesn't work should be ok, it's vital to them actually working lol
Why does it output Cannot assign requested address
when the address exists/gets assigned? I'm not a Linux user, so I have no information about what makes this happen.
Subservers blowing up if the pipe doesn't work should be ok, it's vital to them actually working lol
The master is blowing up here, to be clear -- that's not an exception thrown by the subserver. Or at least I don't believe it is as it shows up too fast.
Why does it output
Cannot assign requested address
when the address exists/gets assigned? I'm not a Linux user, so I have no information about what makes this happen.
Socket creation is just a filesystem operation, but socket connection is a separate OS syscall, so their presence isn't a representation of the actual network state.
Worth nothing that these are what's called Unix Domain Sockets -- TL;DR: they're on the filesystem instead of operating over IP, and as such have slightly different behavior. This is relevant because "Cannot assign requested address" is the human-readable string for EADDRNOTAVAIL
, but that error is only for IP sockets.
It might imply that the .NET pipe is misconfigured somehow.
If it helps any, I was able to get a full stacktrace for that error:
[14:21:58.269] [.NET ThreadPool Worker/WARN] [tML]: Silently Caught Exception:
System.Net.Internals.SocketExceptionFactory+ExtendedSocketException: Cannot assign requested address /tmp/CoreFxPipe_StarsAbove_Observatory_IN
at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
at System.IO.Pipes.NamedPipeClientStream.TryConnect(Int32 timeout, CancellationToken cancellationToken)
at System.IO.Pipes.NamedPipeClientStream.ConnectInternal(Int32 timeout, CancellationToken cancellationToken, Int32 startTime)
at System.IO.Pipes.NamedPipeClientStream.Connect(Int32 timeout)
at SubworldLibrary.SubserverLink.ConnectAndProcessQueue()
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
at System.Threading.Thread.StartCallback()
The fact that this matches #34's stacktrace and that .
is very definitely a legal character for any filename on Linux, including a socket, I'd wager that this is the same issue.
If you'd like a test environment to play with, just let me know -- I can replicate my setup on an isolated Hetzner box for you.
cc @aleforge
After some back-and-forth with AleForge, they're thinking it's a wide Linux-affecting bug:
The main problem with this issue is that I think there are some differences in the mod's code that only work with the Windows version of DotNet that the mod author hasn't addressed.
Me: Have any of your users had success using this mod? Them: Only when they do not visit a subworld.
I was also able to replicate this issue using just the client and the bundled server. Start client -> Multiplayer -> Host -> Warp to subworld via TSA -> Same crash, logs in ~/.local/share/Steam/steamapps/common/tModLoader/tModLoader-Logs/Old
.
The master is blowing up here, to be clear -- that's not an exception thrown by the subserver. Or at least I don't believe it is as it shows up too fast.
Oh right, missed the Process.Start
. Still sort of applies to the main server though; should've added this a while ago but I'll make it forcefully return the clients to the main world if the pipe goes down.
"Cannot assign requested address" is the human-readable string for EADDRNOTAVAIL, but that error is only for IP sockets.
It's frustrating how little documentation I can find on how this stuff works! Nothing seems to be out of the ordinary with the pipe configuration.
It's frustrating how little documentation I can find on how this stuff works!
Absolutely! I tried poring over the .NET documentation for that pipe for a while and came up with nothing at all. No nuances into how the pipe functions across operating systems, no source for the reference implementation... Hell, it'd be great if the stacktrace threw anything more than "it broke".
Nothing seems to be out of the ordinary with the pipe configuration.
Yeah, you're not passing fancy args into the constructor or doing anything silly with it, so I'm not sure what the deal is.
Once the holidays have cooled down, I might see about getting a dev environment up to do some testing, but it's been years now since I've touched tML.
yeah got the same issue
Tried having a look at this issue as well. I know fairly little about C# so take my guesstimates with a grain of salt.
The way NamedPipeClientStream
and NamedPipeServerStream
are utilized for the "[...]_IN" pipe seems a bit odd to me.
From my limited understanding/ guesstimations NamedPipeServerStream
will offer a pipe for a NamedPipeClientStream
to attach to. So logically, you would want your NamedPipeServerStream
to be awaiting a connection before a NamedPipeClientStream
tries to attach. In the case of the "[...]_IN" pipe the NamedPipeClientStream
seems to try to connect before a NamedPipeServerStream
is spun up, resulting in the error message documented up above. My guess is the Windows implementation somehow doesn't care about the exact order of operations whereas the Linux implementation blows up.
I tried correcting this by simply swapping the NamedPipeClientStream
and NamedPipeServerStream
in the [...]_IN case, which seemingly fixes the error messages. I still didn't get the subworld to work (this is with Stars Above), though that might just be another mod incomatibility/ unrelated issue with the modpack I am running.
Another huge oversight that prevents looking into this issue further is the fact that when a subworld is spun up its process will take over the server.log archiving the one the parent process was writing to in the process which prevents looking into log messages produced by the parent server.
Should you need a linux server to run tests on I would be open to help out, otherwise there is always the option of running a linux container using Docker for Windows.
Another huge oversight that prevents looking into this issue further is the fact that when a subworld is spun up its process will take over the server.log archiving the one the parent process was writing to in the process which prevents looking into log messages produced by the parent server.
The bug should absolutely be fixed, but just for insight's sake, you can access the old server logs by looking at file descriptors. /proc/$PID/fd/{0,1,2}
will still hold the stale FD of the old log file on the master server. You can less
them or whatever to view the overwritten log.
@rehashedsalt thanks for the tip. I can now confirm that my guess is correct, since there are no more error messages produced in either logs. In conclusion the order of operations should be:
NamedPipeServerStream
instances in the parent process, start threads that call WaitForConnection
. (Or possibly use BeginWaitForConnection
? Probably the better choice if it works the way I think it works.)NamedPipeClientStream
instances on subworld process, call Connect
on them.This should ensure the pipes the subworld process tries to connect to are available at the appropriate time.
My suggested fix doesn't actually work, my test was flawed since #40 caused mods to not actually load on the subserver.
God damn it, there were no resources mentioning that this blows up specifically on Linux. Thank you so much for figuring this out! Will close this issue with the release of 2.2.2.2.
Issue Summary
Possibly related to #32, #33, #34. Using Subworld Lib v2.2.2. My execution environment is built from this repo and is a Docker container with no special privileges. Port 7777 is forwarded out.
Upon attempting to use the Spatial Disk to warp to the Observatory Hyperborea in TSA, I get this trace from
server.log
, after which my client is stuck on "Saving Map Data: 100%" and does not proceed to the subworld:Console mentions this:
The pipe mentioned does exist:
From what it seems to me based on the timeline, the pipe is created, but the host server immediately assumes the subworld server will be able to handle connections the moment the process forks, gets a conn refused, and blows up. The time between
Process.Start()
forking and the worker thread exploding is only a few milliseconds.Enabled Mods