ASPP / pelita

Actor-based Toolkit for Interactive Language Education in Python
https://github.com/ASPP/pelita_template

Make bot implementations that are network-controlled illegal during the tournament #524

Open otizonaizit opened 6 years ago

otizonaizit commented 6 years ago

A group of students came up with a bot implementation that talks over the network to a separate machine and receives commands from there. This is relatively easy to implement, but quite hard to implement right: there are all sorts of problems with synchronization between the keyboard on the remote machine and the server listening on the bot side. The group using this technique did not succeed in implementing a bot that works well, but we should nonetheless make this illegal. For one, it makes it impossible to re-run the same game (human input is lost, IP addresses change), and second, it is arguably "cheating".

How to do this properly is not clear, though, and requires some thought.

Debilski commented 6 years ago

I guess the go-to solution would be to run the clients in a container (Docker/systemd) without network access. We could whitelist the ports that are needed for zmq, but if everything is on the same machine, we could also use a file socket. It might already be possible to run it that way (with remote:tcp:// as a spec), but I am open to more ideas.

otizonaizit commented 6 years ago

I am not happy about adding a docker/systemd dependency to the setup for a tournament. This means that at most one person in the faculty knows how to run the tournament ;)

What about coming up with a simple shell script using ss or netstat to detect outgoing network connections during the test-runs before the tournament?

This does not need to be 100% bullet-proof. And I guess they will ask us if they can do it before attempting it, so it is really just to be sure that no one is trying really hard to cheat.

Debilski commented 6 years ago

I’d then suggest turning off the network as a solution that everybody can understand. ;)

Do we really need to script it then just to find out whether they are trying to cheat? We could also run all players in CI and if a team is much better during tournament than in our CI runs, the code will have to stand some deeper investigation. :)

OTOH, sometimes people want to do good things and just use the network to print things from twitter. We would kill them too if we forbid network usage completely.

fmaussion commented 6 years ago

You guys are bringing this quite far ;-)

I would forbid human interaction during the game via the rules, and that's it. If people really want to cheat, then let them cheat - it's their problem!

otizonaizit commented 6 years ago

I’d then suggest turning off the network as a solution that everybody can understand. ;)

But doing that means that you would have problems getting last-minute commits if needed, or who knows what. Or you forget to push the results of the tournament at the end. I think it is unrealistic to assume we can have a machine without network.

Do we really need to script it then just to find out whether they are trying to cheat? We could also run all players in CI and if a team is much better during tournament than in our CI runs, the code will have to stand some deeper investigation. :)

Come on, this is not realistic either. Would you really check if some clients perform much better during the tournament than in the CI? And how?

OTOH, sometimes people want to do good things and just use the network to print things from twitter. We would kill them too if we forbid network usage completely.

Exactly. They can dump twitter feeds if they want. My idea of the script is just a warning for us. If the script does detect network activity, we investigate.

This is an example dump from my console while running a pelita game on another terminal:

ss -tpla | grep pelita
LISTEN  0        100              127.0.0.1:40531               0.0.0.0:*        users:(("pelita",pid=13629,fd=25))
LISTEN  0        100              127.0.0.1:33011               0.0.0.0:*        users:(("pelita",pid=13629,fd=9))
LISTEN  0        100              127.0.0.1:46399               0.0.0.0:*        users:(("pelita",pid=13629,fd=16))
LISTEN  0        100              127.0.0.1:37127               0.0.0.0:*        users:(("pelita",pid=13629,fd=23))
ESTAB   0        0                127.0.0.1:33011             127.0.0.1:58076    users:(("pelita",pid=13629,fd=28))
ESTAB   0        0                127.0.0.1:46399             127.0.0.1:56886    users:(("pelita",pid=13629,fd=29))
ESTAB   0        0                127.0.0.1:40531             127.0.0.1:54388    users:(("pelita",pid=13629,fd=26))
ESTAB   0        0                127.0.0.1:37127             127.0.0.1:38576    users:(("pelita",pid=13629,fd=27))

otizonaizit commented 6 years ago

@fmaussion : yes, that would be my plan too. The script would be there just to help in the detection of problems.

Debilski commented 6 years ago

Exactly. They can dump twitter feeds if they want. My idea of the script is just a warning for us. If the script does detect network activity, we investigate.

if 'attack' in twitter_stream.filter('#aspp-code-strategy').head():
    state[bot.index]['mode'] = attack

;)

otizonaizit commented 6 years ago

That would be detected by the script!

Debilski commented 6 years ago

But your idea of a script would just run in a separate terminal while doing some tests and in a while True loop always grepping "pelita" for forbidden connections? Or do you imagine something more elaborate?

otizonaizit commented 6 years ago

Yes, something like

sudo watch ss -tpla

(no need for a while True loop, I think). And just grep for addresses which are not 127.0.0.1 ...
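
A minimal sketch of such a check in Python, under stated assumptions: it parses the text printed by `ss -tnpa` (the `-n` flag is added here so that addresses and ports stay numeric), it assumes the process column contains the string "pelita" as in the dump above, and the function names `is_loopback`, `suspicious_sockets` and `check_once` are made up for illustration. Listening sockets are judged by the address they are bound to, all other sockets by their peer address.

```python
import ipaddress
import subprocess


def is_loopback(address):
    """Return True if the host part of an ss 'host:port' field is a loopback address."""
    host = address.rsplit(":", 1)[0].strip("[]").split("%")[0]
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Wildcards such as "*" or unparsable hosts count as non-loopback.
        return False


def suspicious_sockets(ss_output, process_name="pelita"):
    """Return the ss lines of the given process that involve a non-loopback address."""
    flagged = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: state, recv-q, send-q, local, peer, process info.
        if len(fields) < 6 or process_name not in " ".join(fields[5:]):
            continue
        state, local, peer = fields[0], fields[3], fields[4]
        # Listening sockets have no peer yet, so look at where they are bound.
        address = local if state == "LISTEN" else peer
        if not is_loopback(address):
            flagged.append(line)
    return flagged


def check_once():
    """Run ss once and return the flagged lines (may need sudo to see all processes)."""
    result = subprocess.run(["ss", "-tnpa"], capture_output=True, text=True, check=True)
    return suspicious_sockets(result.stdout)
```

Calling `check_once()` every few seconds during the test runs and printing whatever it returns would give the kind of warning discussed here. It is only a hint to investigate, not proof of remote control.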

jni commented 6 years ago

Come on guys, if a team is controlling their bot via Twitter hashtags, don't they deserve to win? =P

otizonaizit commented 6 years ago

Come on guys, if a team is controlling their bot via Twitter hashtags, don't they deserve to win? =P

No, honestly no. We have seen this for the first time in Camerino. They had this idea of controlling the bot remotely, with two control modes: changing strategy and manual override of single moves. They spent considerable time understanding the intricacies of network-programming, sockets, contexts, re-use of ports and addresses, synchronization issues, and a zillion other things. They did not succeed, in the sense that the code was more or less working, but then they had to realize that even with the remote control they didn't beat most of the others. They tried the remote control by twitter, by the way: it failed because it wasn't fast enough, and because twitter limits the amount of API requests in a given amount of time, so they pretty soon, while testing, were blocked (together with the rest of the groups, because all laptops shared the same public IP in a NAT).

The problem with this approach is that it is clever, but, first, it makes games and the tournament not replicable, and second, more importantly, it drains a lot of energy and time from the group project, which would be better spent trying to apply the SD techniques learned during the school. The new page https://python.g-node.org/wiki/pelita about the programming project explains this quite well, in my opinion.

So, to avoid further waste of time from other groups, I'd say that we should advertise a new rule that remote control is forbidden, and have a simple mechanism in place that helps us detect it and warn the students before they go down that route. It doesn't need to be perfect, it should just help us detect soon enough to avoid damage.

Debilski commented 6 years ago

I think the new API is just too easy to use if they end up having time for this. ;)

jni commented 6 years ago

@otizonaizit I was joking, but thanks for the rationale. One objection is that I don't see why futzing with twitter APIs is fundamentally different than futzing with NumPy APIs. It is still software development. But, I agree with the overall sentiment anyway.

How did you solve the replicability issue for the tournament videos?

Debilski commented 6 years ago

The tournament videos are recreated from the JSON output, not by re-running the games.

Debilski commented 6 years ago

One objection is that I don't see why futzing with twitter APIs is fundamentally different than futzing with NumPy APIs.

One could argue that by looking at the main screen during the tournament and therefore being able to make decisions based on the unnoised enemy positions, the teams using human-created live input (no matter if Twitter or another homemade network solution) do have a qualitative advantage over those teams that rely just on what their move function receives as an input.

otizonaizit commented 6 years ago

Well, that group was the only one there under time pressure, frantically stackoverflowing their way into zmq connection issues. Other groups did not spend so much time on SO. I think this means that scientists are more used to NumPy APIs than to network programming. And, as I said, the Twitter API is rate-limited, so it is impossible to test under time pressure.

The tournament videos are generated from live screenshots (Rike's idea) taken during the tournament, so no problem.

fmaussion commented 6 years ago

Expression of the year for me: "stackoverflowing your way into something" 🥇 :)

Debilski commented 6 years ago

taken during the tournament

You wish. The screenshots are done days after the school with code from this branch ( https://github.com/ASPP/pelita/compare/master...Debilski:snapshots , which has to be rebased and fine-tuned so that it does not run too fast, of course ) and I have to sit there and rewatch it all for an hour, making sure that no window pops up that obscures the Tk frame. I’d happily implement a better solution, but unless we change to a qt-based viewer (at least for generating the replay – but I don’t like that it would look different then) I don’t really see one.

For reference, the conversion command to get a video from the pngs is ffmpeg -framerate 10 -i snapshot-%04d.png -start_number 1 -s:v 1280x720 -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p video.mp4.
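
A small sketch of how that conversion could be scripted for several snapshot folders, assuming ffmpeg is on the PATH. The helper names `ffmpeg_command` and `convert_all`, and the layout of one folder of `snapshot-%04d.png` files per game with the video written next to them, are assumptions for illustration. The ffmpeg arguments are copied unchanged from the command above.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(snapshot_dir, output="video.mp4"):
    """Build the argument list for the conversion command quoted above."""
    pattern = str(Path(snapshot_dir) / "snapshot-%04d.png")
    return [
        "ffmpeg", "-framerate", "10", "-i", pattern, "-start_number", "1",
        "-s:v", "1280x720", "-c:v", "libx264", "-profile:v", "high",
        "-crf", "20", "-pix_fmt", "yuv420p", str(output),
    ]


def convert_all(snapshot_dirs):
    """Run the conversion for every folder (hypothetical layout: one folder per game)."""
    for folder in snapshot_dirs:
        # check=True stops at the first failing conversion instead of continuing silently.
        subprocess.run(ffmpeg_command(folder, Path(folder) / "video.mp4"), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the `%04d` pattern and with folder names that contain spaces.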

otizonaizit commented 6 years ago

But why not save a png for every frame from tk and then use ffmpeg to create the video? We did that years ago and your complaint back then was that it was not elegant ...

Debilski commented 6 years ago

Wasn’t the thing that Tk could only do postscript output that turned out to be really ugly? All the other solutions I found do screenshotting as well, and I predict that this is not going to work well if it is done during a tournament. (Problems with the correct timing, repeated frames when starting/stopping, what to do when a frame is resized …)

jni commented 6 years ago

The question of how this year's tournament was re-created remains unanswered... =)

Debilski commented 6 years ago

Ok, here is the recipe:

  1. Run the tournament (or run pelita --dump for a single game dump)
  2. Once the tournament has been run, you will find a folder store-$LOCATION-$YEAR-$COUNT, e.g. store-Camerino-2018-00. The $COUNT is only there to avoid ambiguities in case a new folder needs to be created. The folder contains the output of the tournament script (e.g. the match tables) and for each match a json dump file. Ideally, you would push that complete folder to github to re-use it later.
  3. Each of the dump files contains exactly the same json that has been sent to the Tk viewer, separated by a "\x04" character (because that one is invalid in json and I don’t trust the newline character, which would also be invalid in proper json but would be used by json beautifiers and such).
  4. Clone the recent version of pelita
  5. Checkout (and rebase to master) the snapshots branch from https://github.com/ASPP/pelita/compare/master...Debilski:snapshots . This branch does the following:
     a. Usually, one would be able to just run a replay file with pelita --replay $DUMPFILE; unfortunately this is slightly broken right now in master, so the branch needs to fix this (the part up to replay_publisher.run()).
     b. It adds a command line argument for the location to store the pngs and then some code in tk_canvas.py that executes the OS X screenshot command

    /usr/sbin/screencapture -R$topX,$topY,$width,$height $FILENAME

     (and adds some sleeps for good measure so that the Tk window has properly refreshed)

  6. For all dump files, I then run pelita --replay $DUMPFILE --snapshot-dir $DUMPFOLDER
  7. Once all pngs have been created, they can be converted with the previously given command:

    ffmpeg -framerate 10 -i snapshot-%04d.png -start_number 1 -s:v 1280x720 -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p video.mp4

  8. Repeat for all dumpfiles.
  9. The rest is then manually adding wiki syntax to the output file to include the videos.
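
The dump format described in the recipe (JSON messages separated by a "\x04" character) can be read back with a few lines of Python. This is only a sketch: the function names `parse_dump` and `read_dump` are made up, the file is assumed to be UTF-8 text, and empty chunks (for example after a trailing separator) are skipped as a precaution.

```python
import json


def parse_dump(text, separator="\x04"):
    """Split dump text on the separator and decode every non-empty chunk as JSON."""
    return [json.loads(chunk) for chunk in text.split(separator) if chunk.strip()]


def read_dump(path):
    """Return the list of JSON messages stored in one dump file."""
    with open(path, encoding="utf-8") as dump_file:
        return parse_dump(dump_file.read())
```

Because "\x04" cannot appear unescaped inside valid JSON, splitting on it is safe even if a message has been pretty-printed across several lines.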

otizonaizit commented 6 years ago
  1. Checkout (and rebase to master) the snapshots branch from master...Debilski:snapshots . This branch does the following: 4*. Usually, one would be able to just run a replay file with pelita --replay $DUMPFILE, unfortunately this is slightly broken right now in master so the branch needs to fix this (the part up to replay_publisher.run()).

I understand why you keep the Mac specific stuff on a separate branch and I hereby promise ( #531 ) I will explore the TK-dump-to-file features to get rid of this horror. But, please, what does it mean that --replay is slightly broken? Is there an issue open for this? Can this be fixed in master please?

Debilski commented 6 years ago

Yeah, the fix is badly needed but it should be done properly and not with thrown-in sleeps until it works. https://github.com/ASPP/pelita/issues/453