ASPP / pelita

Actor-based Toolkit for Interactive Language Education in Python
https://github.com/ASPP/pelita_template
Other
63 stars 68 forks source link

Get rid of non-fatal errors #832

Open otizonaizit opened 2 weeks ago

otizonaizit commented 2 weeks ago

At the moment a bot can generate different kind of "errors":

  1. it returns an illegal position
  2. it gets a timeout
  3. it raises an Exception

The code (and the rules) are complicated by the fact that we treat these errors differently. Errors of type 1 and 2 result in a random move and are counted. After 5 errors of this type the game is over. Errors of type 3 result in an immediate game over.

After many years seeing bots implemented by students, I think that it would be more useful to handle errors of type 1 as fatal like type 3: immediate game over. This helps in debugging, because you get immediate feedback when you return an illegal position, instead of having to fish out the warning printed by pelita among the other thousand lines printed by your bot in the development phase. Errors of type 1 are easy to avoid (check next_pos in bot.legal_positions). This I think is an uncontroversial change that would simplify pelita code and the rules that we need to explain quite a bit.

Regarding errors of type 2 I think that they also should be made fatal. It almost never happens that bots generate timeouts, except the network bots – but this is a different issue.

If all errors would be made fatal we would have easier rules to explain, less clutter in the UI (because we don't have to show the error counts) and help students to catch errors earlier so that they write more robust bots.

Let's start a discussion!

Debilski commented 2 weeks ago

Open to making wrong moves fatal, but I think one reason we have it is because of the timeout handling. The idea being that after a timeout the bot will still attempt to send the move and then there is a small danger that the reply gets mixed up with the next, regular reply from the bot. We just play it safe here. This should not happen but I’m not sure if we have tests for it.

I’m more strongly against making timeouts fatal. If you have a bot that does a lot of analyses and caching in the first round, then the time that it takes can be very much dependent on the layout. You can hit 2.5 seconds regularly but then in the tournament you get the one maze that takes 3.2 seconds.

Pretty sure we would have seen spurious timeouts in the last tournament, had we run it on one of the students laptops and not on mine, which happens to be two to three times faster.

otizonaizit commented 1 week ago

And what if we make the timeouts never fatal, i.e. we stop counting them?

I am aiming at a simplification of the rules and of the code. Also, in case of timeout, instead of a random move, we could just let the bot keep its position. The fact that we execute a random move is one of the last few things that make games (and the tournament, see #833 ) strictly non-reproducible. A timeout can be due to the host being busy with other things.

So, my proposal: we make illegal positions fatal and allow timeouts. In case a bot gets a timeout, it simply stays where it is. No fatal error after N timeouts.

It is a super rare situation anyway: I don't remember ever having seen a bot being disqualified because of too many timeouts. And, in a real game, if a bot is timing out all the time it will very soon lose the game anyway, so I don't see the need to count the timeouts and make a 5 of them fatal.

What do you think? Wouldn't this simplify the code too?

Debilski commented 1 week ago

If we have a timeout and the bot stops then that is just as reproducible as having it execute a random move that has been seeded. 🤷

As said in #833 I also don’t consider the fact that the tournament may be unpredictable to be an issue at all. I plan to repair #833 as this half-solution is also not a good state, but when I do, I will explicitly add an option to use random.SystemRandom. I don’t want to be tried for match fixing. :)

Allowing timeouts may lead to teams abusing the situation either on purpose or because there is no real harm done in writing slow code. I’d say, the reason that we have never seen a team losing on timeouts is because it is punished.

As for the code: The whole error handling needs to be rewritten and documented, no matter what we decide here, and it might be a good idea to do this before making any changes to game.py.

otizonaizit commented 1 week ago

Well, getting a timeout is not necessarily reproducible. So in one game started with the very same seed it could be that one time you get the timeout (and currently draw a random number from the pool). If the next time with the same seed you don't get the timeout, that random number is not drawn and stays in the pool. That's why I think that staying still makes it more reproducible: that pool of random numbers is not touched

Debilski commented 1 week ago

Ah, now I see what you are worried about. But remember that bots have their own rng that is untouched by this. So the pool of random numbers is only used for the penalty random move, for relocating the food and for the noiser. If you think it is necessary, we could also have separate rngs here for these things. However, I also think that, yes, because timeouts are unpredictable, also relocating the food and noised values will be practically unpredictable after a timeout event, as all this is going to change under the influence of that one bot who had the timeout and could not do what it wanted (and of that other team that is reacting to it). In the end it does not matter if we had a random move or not, the final result is going to look different. And even without food lifetimes, without random moves, without noise the end result is going to look different all things considered, simply because the bots will behave ever so slightly different towards each other.

If you think about it, the whole time measuring business is a complete mess. To do it properly, any bot with a significant amount of potentially time-consuming logic will implement some time measuring to decide whether to break some loop early or not. The time they measure might be different in each run, no matter the state of the rng. They will never have a timeout in the game but their random number generator will be run a different amount of times each time (unless they don’t use the rng, of course).

It’s great, of course, that we can make games against the simple demo bots predictable reproducible and have this reproducibility also available for the student’s first own bots, but in the event of timeouts, with certain network bots or with anything that uses external cues (like time measuring), we won’t be able to guarantee this. And I do not think this is a problem.

What we should have eventually, I think, is the history viewer (or have a better logging/replay mechanism that stores the whole game – or the last N games), and have an option to print the current state of any location in history and give teams the possibility to run their move function against this state. Maybe it is not necessary to play the whole game again but just being able to watch it again (and replay is only needed for that one move in round 200).