microsoft / jericho

A learning environment for man-made Interactive Fiction games.
GNU General Public License v2.0

[WIP] FIX: Incomplete objects tree for some games #54

Open MarcCote opened 2 years ago

MarcCote commented 2 years ago

This PR improves support for the games in Jericho.

Here are the main modifications/improvements of this PR.

Bonus additions:

Here's the current progress of re-validating the games:

python tools/test_games.py ./roms/* --check

Game Status Noop check
905.z5 OK Done
acorncourt.z5 OK Done
advent.z5 OK Done
adventureland.z5 OK Done
afflicted.z8 OK Done
anchor.z8 OK Done but score 99/100 Done
awaken.z5 OK Done
balances.z5 FAIL Done but score 50/51 Done
ballyhoo.z3 OK Done
curses.z5 OK Done
cutthroat.z3 OK Done
deephome.z5 OK Done
detective.z5 OK Done
dragon.z5 OK Done
enchanter.z3 OK Done
enter.z5 OK Done
gold.z5 OK Done
hhgg.z3 OK Done
hollywood.z3 OK Done
huntdark.z5 OK Done
infidel.z3 OK Done
inhumane.z5 OK Done
jewel.z5 OK Done
karn.z5 OK Done
library.z5 OK Done
loose.z5 OK Done
lostpig.z8 OK Done
ludicorp.z5 OK Done
lurking.z3 OK Done
moonlit.z5 OK Done
murdac.z5 OK Done
night.z5 OK Done
omniquest.z5 OK Done
partyfoul.z8 OK Done
pentari.z5 OK
planetfall.z3 OK
plundered.z3 OK
reverb.z5 OK
seastalker.z3 OK
sherlock.z5 OK
snacktime.z8 OK
sorcerer.z3 OK
spellbrkr.z3 OK
spirit.z5 OK
temple.z5 OK
theatre.z5 OK
trinity.z4 OK
tryst205.z5 OK
weapon.z5 OK
wishbringer.z3 OK
yomomma.z8 OK
zenon.z5 OK
zork1.z5 OK
zork2.z5 OK
zork3.z5 OK
ztuu.z5 OK

MarcCote commented 2 years ago

@mhauskn this PR is really messy at the moment. I'm going to clean it up, but I wanted you to have an update on the fix. I now believe that after this PR, we might want to do a major release (Jericho 4.0.0).

Things to look at: the changes in frotz_interface.c (assume all commented-out code will be deleted) and all the games' C files.

Also, I would like you to try out the script that you used to validate world_changed when playing the walkthroughs.

mhauskn commented 2 years ago

I'm getting the following error trying to install:

src/interface/frotz_interface.c:27:10: fatal error: md5.h: No such file or directory
 #include "md5.h"
          ^~~~~~~
compilation terminated.

I think we used to use the md5.c file to provide standalone md5 hashing, without the need for other dependencies. Does the PR intend to include an md5.h file?

MarcCote commented 2 years ago

My bad. I pushed the missing file. That will enable computing the md5 hash for a game state directly in C instead of having to get the state from Python and then call the hashing function from hashlib (i.e., less data moving around).
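For reference, the Python-side route mentioned above (fetch the state, then hash it with hashlib) can be sketched roughly as follows. This is only an illustration: `hash_state` and the byte snapshots are hypothetical names and data, not Jericho's actual code.

```python
import hashlib


def hash_state(state_bytes: bytes) -> str:
    """Return the hex MD5 digest of a serialized game state.

    `state_bytes` is a hypothetical stand-in for whatever bytes the
    interpreter would hand back to Python.
    """
    return hashlib.md5(state_bytes).hexdigest()


# Identical snapshots hash to the same value; a single changed byte does not.
snapshot_a = bytes([0, 5, 17, 42])
snapshot_b = bytes([0, 4, 17, 42])
print(hash_state(snapshot_a) == hash_state(snapshot_a))  # True
print(hash_state(snapshot_a) == hash_state(snapshot_b))  # False
```

Doing the same digest in C avoids copying the state across the C/Python boundary just to hash it.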

mhauskn commented 2 years ago

Hey Marc - I've done some small fixes around the get_cleaned_world_diff() - see them here: https://github.com/microsoft/jericho/commit/d59a6c33f26e52392c7df88d792959d18d3ee266.

The motivation for these changes is to have a consistent world_diff returned by jericho._get_world_diff() and jericho._filter_candidate_actions(). This change makes both return a 256-length tuple that is comparable.

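However, I'm detecting some issues with Zork1's walkthrough:

  • Step 162. NoWorldChange gold_act: Press yellow button, obs: Click.
  • Step 165. NoWorldChange gold_act: Turn bolt with wrench, obs: The sluice gates open and water pours through the dam.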

Both of these seem to be issues with world_state not changing when we expect it should.

Here is the script I'm running to detect these issues: https://gist.github.com/mhauskn/861e4983f54a435013f66e9ab44ea308#file-test_walkthrough-py

MarcCote commented 2 years ago

Oh, I should have mentioned I have yet to test the valid_action code. I was focusing on making sure the commands from the walkthroughs appropriately trigger world_changed.

Also, I was thinking of removing the whole world_diff, since now I keep a copy of the previous objects tree that is used to compare against the current objects tree. The valid_action code still needs to be changed to use that.
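To make the idea concrete, comparing a saved copy of the previous objects tree against the current one could look something like the sketch below. The `Obj` record, its fields, and the helper names are assumptions made for illustration; the real comparison lives in the C interface and may differ.

```python
from typing import Dict, List, NamedTuple, Tuple


class Obj(NamedTuple):
    """Hypothetical snapshot of one entry in the objects tree."""
    num: int
    name: str
    parent: int
    attributes: Tuple[int, ...]


def changed_objects(prev: List[Obj], curr: List[Obj]) -> List[int]:
    """Return the numbers of objects that differ between two snapshots."""
    prev_by_num: Dict[int, Obj] = {o.num: o for o in prev}
    curr_by_num: Dict[int, Obj] = {o.num: o for o in curr}
    all_nums = prev_by_num.keys() | curr_by_num.keys()
    return sorted(n for n in all_nums if prev_by_num.get(n) != curr_by_num.get(n))


def world_changed(prev: List[Obj], curr: List[Obj]) -> bool:
    """The world changed if at least one object differs."""
    return bool(changed_objects(prev, curr))


# The lamp moves from room 10 into the player's inventory (object 1).
before = [Obj(1, "player", 10, ()), Obj(2, "lamp", 10, (13,))]
after = [Obj(1, "player", 10, ()), Obj(2, "lamp", 1, (13,))]
print(changed_objects(before, after))  # [2]
print(world_changed(before, before))   # False
```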

For Zork1, I have yet to validate it. I'm currently doing planetfall.z3 (see list above). If you want to help, you can start from the bottom and work your way up. Here's my workflow:

  1. Start with test_games.py game.z5 --check --no-precheck (as in my original post above).
  2. If world_changed is not triggered for some commands when it should have been, run tools/find_special_ram game.z5 --index no_command. It detects all RAM locations changed by that command and lists every command that changed each detected RAM location throughout the walkthrough.
  3. Add the special address to the relevant game.c file, or add an exception to SKIP_CHECK_STATE for that command in test_games.py.
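
The idea behind step 2 can be sketched in a few lines: diff the RAM before and after each command, then group commands by the addresses they touched. This is not the code of tools/find_special_ram; the function names and the snapshot layout below are assumptions made for illustration.

```python
from typing import Dict, List


def changed_addresses(before: bytes, after: bytes) -> List[int]:
    """Return every RAM address whose byte differs between two snapshots."""
    if len(before) != len(after):
        raise ValueError("RAM snapshots must have the same length")
    return [addr for addr, (b, a) in enumerate(zip(before, after)) if b != a]


def commands_per_address(snapshots: List[bytes], commands: List[str]) -> Dict[int, List[str]]:
    """Map each changed address to the commands that changed it.

    snapshots[i] is the RAM before commands[i]; snapshots[i + 1] is the RAM after.
    """
    touched: Dict[int, List[str]] = {}
    for i, command in enumerate(commands):
        for addr in changed_addresses(snapshots[i], snapshots[i + 1]):
            touched.setdefault(addr, []).append(command)
    return touched


# Made-up 3-byte "RAM" across a three-command walkthrough.
snapshots = [bytes([0, 0, 0]), bytes([0, 1, 0]), bytes([0, 1, 7]), bytes([0, 2, 7])]
commands = ["open door", "take lamp", "close door"]
print(commands_per_address(snapshots, commands))
# {1: ['open door', 'close door'], 2: ['take lamp']}
```

An address touched by only a few commands is a good candidate for a "special" address; one touched by nearly every command is more likely a counter or timer.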

MarcCote commented 2 years ago

@mhauskn I'm back working on it. I've done the initial validation for a few more games.

Also, I've left a couple of comments (see above) to show some bug fixes I found. Those should probably go in a separate PR to be merged in the current version of Jericho (i.e., v3).

MarcCote commented 2 years ago

However, I'm detecting some issues with Zork1's walkthrough:

  • Step 162. NoWorldChange gold_act: Press yellow button, obs: Click.
  • Step 165. NoWorldChange gold_act: Turn bolt with wrench, obs: The sluice gates open and water pours through the dam.

Now detected correctly (see ab86cb5).

mhauskn commented 2 years ago

Hey Marc, I wanted to touch base on this PR. Is it still in progress or in a finished state? If the latter, I will be happy to run some checks on it.

MarcCote commented 2 years ago

I finally have some time now to tidy it up. One thing that is still missing is checking that no-op actions (e.g., inventory, examine obj, look) don't change the state of the environment (e.g., while the thief is moving around).

mhauskn commented 2 years ago

I tested for false-positive world changes by attempting a combination of valid and invalid actions. Invalid actions included things like "z/wait/inventory/x me/navigating-into-a-dead-end" etc. Below are my findings:

curses, ballyhoo, gold, hhgg, inhumane, lostpig, omniquest, pentari, planetfall, tryst205, zork3, ztuu - all actions are triggering world changed.

  • huntdark - after going "left", all actions started triggering world changes (maybe there is a timer for bleeding out). Doesn't seem to happen if going "right".
  • moonlit.z5 - many of the failed navigation actions are triggering false-positive world changes.
  • weapon - going south results in a false-positive world change.

Also, a handful of games are giving false-positive world changes the first time the "inventory" command is issued, but not on subsequent invocations.
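
A check of this kind can be sketched as a small harness: restore a saved state, apply one supposedly no-op action, and flag the action if the state differs. The `ToyEnv` class below is a made-up stand-in with a hidden timer, not Jericho's API; only the shape of the loop is the point.

```python
from typing import List, Tuple

NOOP_ACTIONS = ["z", "wait", "inventory", "x me"]


class ToyEnv:
    """Made-up environment with a hidden timer; NOT Jericho's interface."""

    def __init__(self) -> None:
        self.location = "room"
        self.timer = 3

    def snapshot(self) -> Tuple[str, int]:
        return (self.location, self.timer)

    def restore(self, snap: Tuple[str, int]) -> None:
        self.location, self.timer = snap

    def step(self, action: str) -> None:
        if action == "north":
            self.location = "hall"
        if action in ("z", "wait"):
            self.timer -= 1  # hidden timer makes waiting look like a world change


def false_positives(env: ToyEnv, actions: List[str]) -> List[str]:
    """Return the actions that alter the state even though they should be no-ops."""
    start = env.snapshot()
    flagged = []
    for action in actions:
        env.restore(start)  # always test from the same starting state
        env.step(action)
        if env.snapshot() != start:
            flagged.append(action)
    env.restore(start)
    return flagged


print(false_positives(ToyEnv(), NOOP_ACTIONS))  # ['z', 'wait']
```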

MarcCote commented 2 years ago

@mhauskn what should we do with timers? For instance, in curses.z5, there seems to be a timer of 5 steps on 'antiquated wireless' before the radio plays random music.

Should we clean them when clean=True in the get_objects function?

Obj111: antiquated wireless Parent107 Sibling15 Child0 Attributes [13, 19, 21] Properties {25: '54 f5', 23: '00 05', 20: 'c2 62', 19: '55 1d', 18: 'c2 63', 3: '54 db', 2: '54 93', 1: '9c 75 78 ea 93 ea a1 e8'}
Obj111: antiquated wireless Parent107 Sibling15 Child0 Attributes [13, 19, 21] Properties {25: '54 f5', 23: '00 04', 20: 'c2 62', 19: '55 1d', 18: 'c2 63', 3: '54 db', 2: '54 93', 1: '9c 75 78 ea 93 ea a1 e8'}

Note property 23 going from 5 to 4.
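
The hex strings in the dump can be read as big-endian integers (Z-machine words are stored high byte first). A tiny sketch, where `property_value` is a hypothetical helper name:

```python
def property_value(hex_bytes: str) -> int:
    """Decode a space-separated hex string such as '00 05' as a big-endian integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), byteorder="big")


print(property_value("00 05"))  # 5
print(property_value("00 04"))  # 4
```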

mhauskn commented 2 years ago

@MarcCote I'm not familiar with the radio in curses. When does the timer start - when the game starts or when you encounter the radio?

Generally speaking, clean=True was designed to give world state hashes that are more likely to get cache hits. For example, when clean=True we remove Zork1's thief from the object tree; otherwise, world states that are identical apart from the thief's location would not give cache hits.

If we remove the timer when clean=True, then we run the risk of mixing the states before and after the radio is playing random music. What consequence does this have for the game? Are there different valid actions that can only be applied before the radio starts playing random music but not after?
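
As a rough illustration of why dropping a wandering object helps cache hits, here is a minimal sketch. The (name, parent) pairs, the `ignore` parameter, and `world_hash` are all hypothetical simplifications, not how get_objects or the hashing is actually implemented.

```python
import hashlib
from typing import FrozenSet, List, Tuple

# Hypothetical simplification: each object is just a (name, parent) pair.
ObjEntry = Tuple[str, int]


def world_hash(objects: List[ObjEntry], ignore: FrozenSet[str] = frozenset()) -> str:
    """Hash the object list, skipping any object whose name is in `ignore`."""
    kept = sorted(o for o in objects if o[0] not in ignore)
    return hashlib.md5(repr(kept).encode("utf-8")).hexdigest()


# Two states that differ only in where the thief currently is.
state_a = [("lamp", 1), ("thief", 20)]
state_b = [("lamp", 1), ("thief", 35)]
thief = frozenset({"thief"})

print(world_hash(state_a) == world_hash(state_b))                # False
print(world_hash(state_a, thief) == world_hash(state_b, thief))  # True
```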

If we keep the timer, then the largest problem will be with every action appearing valid (as the timer will decrement no matter what action is taken). This is a pretty significant problem, albeit one that lasts only for 5 steps.

All-in-all I'd lean towards removing the timer when clean=True, provided the timer only starts after you encounter the radio and there isn't a significant difference in the valid actions for the before/after timer case.

MarcCote commented 2 years ago

You made very good points. I agree with the overall rationale behind clean=True. Something we can do, instead of zeroing out the properties, is to first check their values. For instance, detect changes from 1->0 but not 5->4, 4->3, 3->2, 2->1. What do you think of that?
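
A minimal sketch of that rule, assuming the property is already known to be a countdown timer (the function name is hypothetical):

```python
def significant_change(before: int, after: int) -> bool:
    """Report a countdown property as changed only when it matters.

    Ordinary ticks (5->4, 4->3, ...) are ignored; expiry (1->0) and any
    other jump, such as a reset, are reported.
    """
    if before == after:
        return False
    is_countdown_tick = after == before - 1 and after > 0
    return not is_countdown_tick


print(significant_change(5, 4))  # False
print(significant_change(1, 0))  # True
```

With this rule, states that differ only by a running timer compare as equal until the step where the timer actually expires.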

mhauskn commented 2 years ago

Only detecting the change from 1->0 would improve the cache hit rate. Do you know if this timer has significance for story progression? If not, then it's probably best to ignore/remove it altogether. If so, then let's do the 1->0 approach.

MarcCote commented 2 years ago

You are right. It doesn't impact the story progression. I actually had to zero a bunch of internal counters! But I managed to get through curses.z5. I'll go over the other games next.

MarcCote commented 2 years ago

I tested for false-positive world changes by attempting a combination of valid and invalid actions. Invalid actions included things like "z/wait/inventory/x me/navigating-into-a-dead-end" etc. Below are my findings:

curses, ballyhoo, gold, hhgg, inhumane, lostpig, omniquest, pentari, planetfall, tryst205, zork3, ztuu - all actions are triggering world changed.

  • huntdark - after going "left", all actions started triggering world changes (maybe there is a timer for bleeding out). Doesn't seem to happen if going "right".
  • moonlit.z5 - many of the failed navigation actions are triggering false-positive world changes.
  • weapon - going south results in a false-positive world change.

Also, a handful of games are giving false-positive world changes the first time the "inventory" command is issued, but not on subsequent invocations.

If you have time, could you redo a quick check on some of the games? I fixed a bug that was yielding a lot of false positives. I don't expect it to be good for games other than curses.z5 for now (I'll need to give them the same treatment I gave curses.z5).

mhauskn commented 2 years ago

@MarcCote I've done some tests and curses.z5 is looking good to me! Your fixes seem to have improved some of the other games as well.