Open MarcCote opened 2 years ago
@mhauskn this PR is really messed at the moment. I'm going to clean it up, but I wanted you to have an update on the fix. I now believe after this PR, we might want to do a major release (Jericho 4.0.0).
Things to look at, changes in the frotz_interface.c
(assume all commented code will be deleted), and all the games c files.
Also, I would like you to try out the script that you used to validate world changed when playing the walkthroughs.
I'm getting the following error trying to install:
src/interface/frotz_interface.c:27:10: fatal error: md5.h: No such file or directory
#include "md5.h"
^~~~~~~
compilation terminated.
I think we used to use the md5.c file to provide standalone md5 hashing, without the need for other dependencies. Is the PR intending to include a md5.h file?
My bad. I pushed the missing file. That will enable computing the md5 hash for a game state directly in C instead of having to get the state from Python, then call the hashing function from hashlib
(i..e, less data moving around).
Hey Marc - I've done some small fixes around the get_cleaned_world_diff() - see them here: https://github.com/microsoft/jericho/commit/d59a6c33f26e52392c7df88d792959d18d3ee266.
The motivation for these changes is to have a consistent world_diff returned by jericho._get_world_diff() and jericho._filter_candidate_actions(). This change makes both return a 256-length tuple that is comparable.
However, I'm detecting some issues with Zork1's walkthrough:
Both of these seem to be issues with world_state not changing when we expect it should.
Here is the script I'm running to detect these issues: https://gist.github.com/mhauskn/861e4983f54a435013f66e9ab44ea308#file-test_walkthrough-py
Oh, I should have mentioned I have yet to test the valid_action code. I was focusing on making sure the commands from the walkthroughs appropriately trigger world_changed
.
Also, I was thinking of removing the whole world_diff, since now I keep a copy of the previous objects tree that is used to compare against the current objects tree. The valid_action code still needs to be changed to use that.
For Zork1, I have yet to validate it. I'm currently doing planetfall.z3
(see list above). If you want to help, you can start from the bottom and work your way up. Here's my workflow:
test_games.py game.z5 --check --no-precheck
(as in my original post above).world_changed
is not triggered when it should have been, run the tools/find_special_ram game.z5 --index no_command
which will detect all ram locations that have been changed by that command as well as listing all commands that have changed each detected ram location throughout the walkthrough.game.c
file or add an exception to SKIP_CHECK_STATE
for that command in test_games.py
.@mhauskn I'm back working on it. I've done the initial validation for a few more games.
Also, I've left a couple of comments (see above) to show some bug fixes I found. Those should probably go in a separate PR to be merged in the current version of Jericho (i.e., v3).
However, I'm detecting some issues with Zork1's walkthrough:
- Step 162. NoWorldChange gold_act: Press yellow button, obs: Click.
- Step 165. NoWorldChange gold_act: Turn bolt with wrench, obs: The sluice gates open and water pours through the dam.
Now detected correctly (see ab86cb5).
Hey Marc wanted to touch base on this PR. Is it still in-progress or in a finished state? If the latter - I will be happy to run some checks on it.
I finally have some time now to tidy it up. One thing that is still missing is to check whether no-op action (e.g., inventory, examine obj, look) doesn't change the state of the environment (e.g., thief is moving around).
I tested for false-positive world changes by attempting a combination of valid and invalid actions. Invalid actions included things like "z/wait/inventory/x me/navigating-into-a-dead-end" etc. Below are my findings:
curses, ballyhoo, gold, hhgg, inhumane, lostpig, omniquest, pentari, planetfall, tryst205, zork3, ztuu - all actions are triggering world changed.
huntdark - after going "left", all actions started triggering world changes (maybe there is a timer for bleeding out). Doesn't seem to happen if going "right". moonlit.z5 - many of the failed navigation actions are triggering false-positive world changes. weapon - going south results in false-positive world changed.
Also a handful of games are giving false-positive world changes the first time "inventory" command is issued, but not on subsequent invokations.
@mhauskn what should we do with timers? For instance, in curses.z5, there seems to be a timer of 5 steps on 'antiquated wireless' before the radio plays random music.
Should we clean them when clean=True in the get_objects function?
Obj111: antiquated wireless Parent107 Sibling15 Child0 Attributes [13, 19, 21] Properties {25: '54 f5', 23: '00 05', 20: 'c2 62', 19: '55 1d', 18: 'c2 63', 3: '54 db', 2: '54 93', 1: '9c 75 78 ea 93 ea a1 e8'}
Obj111: antiquated wireless Parent107 Sibling15 Child0 Attributes [13, 19, 21] Properties {25: '54 f5', 23: '00 04', 20: 'c2 62', 19: '55 1d', 18: 'c2 63', 3: '54 db', 2: '54 93', 1: '9c 75 78 ea 93 ea a1 e8'}
Note property 23 going from 5 to 4.
@MarcCote I'm not familiar with the radio in curses. When does the timer start - when the game starts or when you encounter the radio?
Generally speaking, clean=True was designed to give world state hashes that were more probable to get cache hits. For example, when clean=True we remove the Zork1's thief from the object tree as otherwise world states that were otherwise identical would not give cache hits due to different locations of the thief.
If we remove the timer when clean=True, then we run the risk of mixing the states before and after the radio is playing random music. What consequence does this have for the game? Are there different valid actions that can only be applied before the radio start playing random music but not after?
If we keep the timer, then the largest problem will be with every action appearing valid (as the timer will decrement no matter what action is taken). This is a pretty significant problem, albeit one that lasts only for 5 steps.
All-in-all I'd lean towards removing the timer when clean=True, provided the timer only starts after you encounter the radio and there isn't a significant difference in the valid actions for the before/after timer case.
You made very good points. I agree with the overall rationale behind clean=True
. Something we can do actually is instead of zeroing out the properties, we can first check their value. For instance, detect changes from 1->0
but not 5->4
, 4->3
, 3->2
, 2->1
. What do you think of that?
Only detecting the change from 1->0 would improve the cache hit rate. Do you know if this timer has significance for story progression? If not then, probably best to ignore/remove it altogether. If so, then let's do the 1->0 approach.
You are right. It doesn't impact the story progression. I actually had to zero a bunch of internal counters! But I manage to go through curses.z5. I'll go over the other games next.
I tested for false-positive world changes by attempting a combination of valid and invalid actions. Invalid actions included things like "z/wait/inventory/x me/navigating-into-a-dead-end" etc. Below are my findings:
curses, ballyhoo, gold, hhgg, inhumane, lostpig, omniquest, pentari, planetfall, tryst205, zork3, ztuu - all actions are triggering world changed.
huntdark - after going "left", all actions started triggering world changes (maybe there is a timer for bleeding out). Doesn't seem to happen if going "right". moonlit.z5 - many of the failed navigation actions are triggering false-positive world changes. weapon - going south results in false-positive world changed.
Also a handful of games are giving false-positive world changes the first time "inventory" command is issued, but not on subsequent invokations.
If you have time, could you redo a quick check on some of the games since I fixed a bug that was yielding a lot of false-positives. I don't expect to be good for other games than curses.z5 for now (I'll need to do the same treatment I did for curses.z5).
@MarcCote I've done some tests curses.z5 is looking too me! Your fixes seem to have improved some of the other games as well.
This PR improves support for the games in Jericho.
Here are the main modifications/improvements of this PR.
[x] Updating the objects count that was off from some games. Having the wrong number of objects leads to an incomplete object tree which is used to represent the game state, hence to track
world_changed
. Furthermore, this affects valid actions detection which uses the game state to know whether an action had any effect in the world.[x] Instead of relying on
world_diff
,world_changed
now relies onRetPC
)victory() > 0
game_over() > 0
.[x] Re-validating all the games to make sure that relevant commands from the walkthrough lead to a
world_changed
.[ ] (in progress) Re-validating all the games to make sure that "noop" commands (e.g.,
look
,inventory
,examine OBJ
) do not triggerworld_changed
.Bonus additions:
curses.z5
max score is actually 554 instead of 550 (due to a bug in the game, see commentcurses.c
)murdac.z5
will report 250/250 upon victory (even though the RAM says 249).Here's the current progress of re-validating the games: