Improved Pelita server mode

Debilski commented 11 months ago

Will still need refinements (that I’m still working on) but already runs and closes #776 and #769.

The short news is you start a Pelita server like this:

pelita-server remote-server  --address 0.0.0.0 \
        --team ../BayesAvengers.py avengers \
        --team ../trilobots.py trilobots \
        --advertise 10.10.1.2

(--advertise is only needed when we want to use zeroconf)

This will start a server on port 41736 (which is the default for the new URL scheme) and we can run a match with both teams on the server as follows:

pelita --ascii pelita://10.10.1.2/avengers pelita://10.10.1.2/trilobots

(The specific IP should not be needed and could also be a hostname.)

When the network allows for using zeroconf then it should also be possible to use SCAN and detect all players on the server automatically.

It is not wildly incompatible to the old version but I did not put in any special effort to make it work with the current PyPI version (Pelita 2.3.1), so all clients will need an update in order to play against the new server.

codecov-commenter commented 11 months ago

Codecov Report

Attention: Patch coverage is 95.83333% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 85.64%. Comparing base (bb147a5) to head (e2e4f9c). Report is 5 commits behind head on main.

Files	Patch %	Lines
pelita/team.py	93.33%	1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #777 +/- ## ========================================== - Coverage 85.74% 85.64% -0.10% ========================================== Files 21 21 Lines 2371 2383 +12 ========================================== + Hits 2033 2041 +8 - Misses 338 342 +4 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Debilski commented 11 months ago

It is not wildly incompatible to the old version but I did not put in any special effort to make it work with the current PyPI version (Pelita 2.3.1), so all clients will need an update in order to play against the new server.

Actually, the old clients do still work (also SCAN works) but they cannot select a player with the path element and will always play against the default player on the server.

otizonaizit commented 8 months ago

Hey @Debilski : I am running remote games for my students using this branch during the holidays... please don't break it for the time being :)))

otizonaizit commented 8 months ago

(including not force-pushing anymore :))) )

otizonaizit commented 8 months ago

(by the way: the interface is quite cool, thanks!!!)

Debilski commented 8 months ago

This is why they invented personal repositories and branches, you know. :)

But noted.

otizonaizit commented 8 months ago

yes, yes, but I want to profit from any improvement you make make on top of this branch without messing up with cherry-picking stuff :))

On Thu 21 Dec, 06:21 -0800, Rike-Benjamin Schuppner @.***> wrote:

This is why they invented personal repositories and branches, you know. :)

But noted.

— Reply to this email directly, view it on GitHub¹, or unsubscribe². You are receiving this because you commented.☘Message ID: @.***>

––––

¹ https://github.com/ASPP/pelita/pull/777#issuecomment-1866350997 ² https://github.com/notifications/unsubscribe-auth/AACUYC65OJM4MLE4VDANI43YKRAWNAVCNFSM6AAAAAA5ZBUP36VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNRWGM2TAOJZG4

Debilski commented 8 months ago

Ah, ok, so you want me to still push but only the good commits. Got it. :D

otizonaizit commented 8 months ago

Ah, ok, so you want me to still push but only the good commits. Got it. :D

yes, that was the plan :)

Debilski commented 6 months ago

@otizonaizit nothing urgent but could you share some of the error messages from your server that were supposedly caused by bots?

otizonaizit commented 6 months ago

Yes, but my guess is that they come from port-scanners/bots, not from legitimate bots:

type of error a):

Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)') when parsing incoming message. Ignoring. pelita_server.py:172

type of error b):

Error UnicodeDecodeError('utf-32-be', b'\x00\x00\x00\x00\x00Cookie: mstshash=Administr\r\n\x01\x00\x08\x00\x03\x00\x00\x00', 4, 8, 'code point not in range(0x110000)') when parsing incoming message. Ignoring. pelita_server.py:172

type of error c):

Traceback (most recent call last):
File "/home/tiziano/software/virtualenvs/pelita/bin/pelita-server", line 8, in <module>
sys.exit(main())
         ^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
     ^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tiziano/git/pelita/pelita/scripts/pelita_server.py", line 425, in remote_server
server.start()
File "/home/tiziano/git/pelita/pelita/scripts/pelita_server.py", line 316, in start
dealer_id, message = self.router_sock.recv_multipart()
^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)
Use --session-key XXXX to for the admin API.

Debilski commented 5 months ago

Unnecessary for the fix, but I’ve found out why this happens and I will share it here so that future chat bots can help me better next time (looking at you ChatGPT 3.5).

In the version 1 format (there seem to be newer format specs that are a bit more complicated and also I’ve only managed to figure it out for the router type socket but anyway), a zmq message consists of a bunch of frames that are specified as octets of:

LENGTH, FLAGS, FRAME_DATA*

The length is the number of octets of data+1. I’ll get to the flags later. This is a frame:

echo -e '\x06\x00Hello'

A message that is accepted by a router socket consists of (minimum) two frames. One containing an optional identifier (the empty frame \x01\x00 will have zmq generate an id for us) from the sender and the data.

Adding a print statement to the pelita server:

    dealer_id, message = self.router_sock.recv_multipart()
    print(dealer_id, message)

and sending the message

echo -e '\x06\x00Hello\x07\x00Pelita'  | nc localhost 41736

prints

 b'Hello' b'Pelita'
[11:23:32] Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)')   pelita_server.py:174
           when parsing incoming message. Ignoring.                                                 
Error JSONDecodeError('Expecting value: line 1 column 1 (char 0)') when parsing incoming message. 
Ignoring.

Obviously, b'Pelita' is not a proper json string so this rightly gets rejected and we found the cause of error type a).

Error type b) is simply sending something that cannot get parsed to unicode correctly.

For error type c) to occur we need to look at the FLAGS octet from the data frame. The logic is: If the final bit is 0 then this is the final frame. If it is set to 1 then we can just add another frame afterwards.

echo -e '\x06\x00Hello\x07\x01Pelita\x07\x00server'  | nc localhost 41736

This will the code to break as now self.router_sock.recv_multipart() receives three messages (which is kind of the only proper bug here as the other error types keep working).

Finally, why do we even see this so often? Malformed messages usually get quietly ignored by zmq, however the rdp format (run rdesktop against a pelita server and it will log something) can somehow fit the format and I guess is quite often tried by spam bots like so:

echo -e '\x03\x00\x00\x13\x0e\xd0\x00\x00\x00\x00\x00Cookie: mstshash=Admin'

The quick solution would be to ignore (and only trace log) everything that is so obviously malformed. (The only initial message that we accept is a json dict with a REQUEST key so there is a lot of messages that we can rightly ignore, but this still means that we have our json parser converting every random bot’s garbage so I’m still thinking about changing the format here slightly.)

Debilski commented 4 months ago

The value errors with incoming messages will now only trigger a debug message (which won’t be shown usually).

Debilski commented 4 months ago

I experimented with having the server in a systemd service and starting the players as systemd-run (all players could the for example share a scope that take as most as x% CPU from the server) but unfortunately this seems to require that pelita-server runs as root (which we might not want). With the server run as root it works fine.

In https://github.com/jupyterhub/systemdspawner/pull/100 they lay out a solution for a similar situation using template unit files + polkit to start them. One option would be to instantiate a template per player (but then for every player there could only be one at a time – unless the players spawn sub-players themselves inside their unit :upside_down_face: ). The other option would have us write startup configuration files on the fly and initiate the templates with the file name. (We need to have a way to pass a communication socket from server to player.)

I sense the need for a pelita for kubernetes. :grimacing:

otizonaizit commented 4 months ago

But why not just start with a static list of players read directly from the configuration file? Then we just need a normal systemd service. Pelita as root is a no-go in my opinion. For dynamically adding players we can take some time and add the functionality later, no?

25 Apr 2024 19:17:53 Rike-Benjamin Schuppner @.***>:

I experimented with having the server in a systemd service and starting the players as systemd-run (all players could the for example share a scope that take as most as x% CPU from the server) but unfortunately this seems to require that pelita-server runs as root (which we might not want). With the server run as root it works fine.

In jupyterhub/systemdspawner#100[https://github.com/jupyterhub/systemdspawner/pull/100] they lay out a solution for a similar situation using template unit files + polkit to start them. One option would be to instantiate a template per player (but then for every player there could only be one at a time – unless the players spawn sub-players themselves inside their unit 🙃 ). The other option would have us write startup configuration files on the fly and initiate the templates with the file name. (We need to have a way to pass a communication socket from server to player.)

I sense the need for a pelita for kubernetes. 😬

— Reply to this email directly, view it on GitHub[https://github.com/ASPP/pelita/pull/777#issuecomment-2077783117], or unsubscribe[https://github.com/notifications/unsubscribe-auth/AACUYC3JYWKI7D3HY46ZVM3Y7E3D5AVCNFSM6AAAAAA5ZBUP36VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANZXG44DGMJRG4]. You are receiving this because you were mentioned. [Tracking image][https://github.com/notifications/beacon/AACUYCY66YWBD3VFECZXX3TY7E3D5A5CNFSM6AAAAAA5ZBUP36WGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTT33B2E2.gif]

Debilski commented 4 months ago

Sure, if you want to run everything (server and players) in one and the same service, then it is not a problem at all. (My post was mostly meant as a future reference point.) I guess I should have made that more clear.

If you don’t care about a rogue player interfering with everything inside the service, then it is easily made secure for everyone else outside. You can run it as a DynamicUser and harden it almost as much as you like:

[Unit]
Description=Pelita server
Documentation=https://github.com/ASPP/pelita
After=network-online.target
Wants=network-online.target

[Service]
WorkingDirectory=/opt/pelita_server
ExecStart=/opt/pelita_server/venv/bin/pelita-server remote-server --address 0.0.0.0 --config /opt/pelita_server/config.yaml

DynamicUser=yes

ProtectHome=yes
ProtectSystem=strict

DevicePolicy=closed
KeyringMode=private
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
PrivateDevices=yes
PrivateTmp=yes
PrivateUsers=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectProc=invisible
RemoveIPC=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes

SystemCallArchitectures=native
SystemCallFilter=

CapabilityBoundingSet=~CAP_CHOWN CAP_FSETID CAP_SETFCAP
CapabilityBoundingSet=~CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_IPC_OWNER
CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE CAP_IPC_LOCK CAP_SYS_CHROOT CAP_BLOCK_SUSPEND CAP_LEASE 
CapabilityBoundingSet=~CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_NET_ADMIN 
CapabilityBoundingSet=~CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_PACCT CAP_SYS_PTRACE CAP_SYS_RAWIO CAP_SYS_TIME CAP_SYS_TTY_CONFIG 
CapabilityBoundingSet=~CAP_WAKE_ALARM  CAP_MAC_ADMIN CAP_MAC_OVERRIDE 

[Install]
WantedBy=multi-user.target

> systemd-analyze security pelita-server.service
→ Overall exposure level for pelita-server.service: 3.0 OK 🙂

Debilski commented 4 months ago

(I copy-pasted the options together. Not sure what really is needed and what else we could activate. systemd-analyze security needs a bisect mode. :) )

otizonaizit commented 4 months ago

[Unit] Description=Pelita server Documentation=https://github.com/ASPP/pelita After=network-online.target Wants=network-online.target

[Service] WorkingDirectory=/opt/pelita_server ExecStart=/opt/pelita_server/venv/bin/pelita-server remote-server --address 0.0.0.0 --config /opt/pelita_server/config.yaml

DynamicUser=yes

ProtectHome=yes ProtectSystem=strict

DevicePolicy=closed KeyringMode=private LockPersonality=yes MemoryDenyWriteExecute=yes NoNewPrivileges=yes PrivateDevices=yes PrivateTmp=yes PrivateUsers=yes ProtectClock=yes ProtectControlGroups=yes ProtectHostname=yes ProtectKernelLogs=yes ProtectKernelModules=yes ProtectKernelTunables=yes ProtectProc=invisible RemoveIPC=yes RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK RestrictNamespaces=yes RestrictRealtime=yes RestrictSUIDSGID=yes

SystemCallArchitectures=native SystemCallFilter=

CapabilityBoundingSet=~CAP_CHOWN CAP_FSETID CAP_SETFCAP CapabilityBoundingSet=~CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_IPC_OWNER CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE CAP_IPC_LOCK CAP_SYS_CHROOT CAP_BLOCK_SUSPEND CAP_LEASE CapabilityBoundingSet=~CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_NET_ADMIN CapabilityBoundingSet=~CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_PACCT CAP_SYS_PTRACE CAP_SYS_RAWIO CAP_SYS_TIME CAP_SYS_TTY_CONFIG CapabilityBoundingSet=~CAP_WAKE_ALARM CAP_MAC_ADMIN CAP_MAC_OVERRIDE

[Install] WantedBy=multi-user.target

systemd-analyze security pelita-server.service → Overall exposure level for pelita-server.service: 3.0 OK 🙂

wow, that's heavy stuff! I didn't even know you could set all these parameters for a systemd service ;) But well, if everything works fine with these restrictions, why we don't just use them? We can get rid of redundant ones later. Also, we can ask a systemd developer about it ;-)

Debilski commented 4 months ago

I managed to get it to run with socket activation. Now we can even turn off the network inside the service. (Score is now 2.6)

otizonaizit commented 4 months ago

Hey @Debilski , is this branch ready to merge? It would be cool to have a release latest next week. For me it is important that the communication protocol is fixed. The server code can be refined later, as long as the client running a release can still talk to the server...

Debilski commented 4 months ago

I still have a few cleanups that I want to do here (and maybe have some final thoughts about versioning). And there needs to be more documentation. Other than that, I plan to merge latest on Friday and I suppose we can make a release then. Pelita 2.4 will be compatible to Pelita server 2.4; Pelita 2.5 might not be.

Debilski commented 3 months ago

I think there will still be changes in the protocol in a future version. If we can live with this, I think we should merge the server now and refine it in the future.

otizonaizit commented 3 months ago

yes, merge!

ASPP / pelita

Improved Pelita server mode #777

Codecov Report