microsoft / jericho

A learning environment for man-made Interactive Fiction games.
GNU General Public License v2.0

Max score not achieved with walkthrough #60

Closed by IssamLaradji 1 year ago

IssamLaradji commented 2 years ago

Hi team, I had an agent follow the walkthrough for zork1.z5, and the scores don't add up to the maximum score; the score also sometimes decreases along the way. The image below shows how the score changes over time.

We also found that the games whose walkthrough trajectories achieve the max_score are:

['905.z5', 'acorncourt.z5', 'adventureland.z5', 'afflicted.z8', 'awaken.z5',
 'ballyhoo.z3', 'detective.z5', 'dragon.z5', 'enter.z5', 'gold.z5',
 'huntdark.z5', 'infidel.z3', 'inhumane.z5', 'jewel.z5', 'library.z5',
 'loose.z5', 'ludicorp.z5', 'moonlit.z5', 'night.z5', 'omniquest.z5',
 'plundered.z3', 'reverb.z5', 'seastalker.z3', 'snacktime.z8', 'temple.z5',
 'weapon.z5', 'wishbringer.z3', 'zenon.z5']

whereas the games whose walkthrough trajectories do NOT achieve the max_score are:

['zork1.z5', 'advent.z5', 'anchor.z8', 'balances.z5', 'curses.z5',
 'cutthroat.z3', 'deephome.z5', 'enchanter.z3', 'hhgg.z3', 'hollywood.z3',
 'karn.z5', 'lostpig.z8', 'lurking.z3', 'murdac.z5', 'partyfoul.z8',
 'pentari.z5', 'planetfall.z3', 'sherlock.z5', 'sorcerer.z3', 'spellbrkr.z3',
 'spirit.z5', 'theatre.z5', 'trinity.z4', 'tryst205.z5', 'yomomma.z8',
 'zork2.z5', 'zork3.z5', 'ztuu.z5']

Why do you think this is happening?

[image: score over time while following the zork1.z5 walkthrough]

MarcCote commented 2 years ago

Thanks for reporting this. Can you post the code that loads the games and plays through the walkthroughs?

MarcCote commented 2 years ago

Also, note that some games have incomplete walkthroughs (see issue #43). You can run this script to test the walkthroughs:

python tools/test_games.py ./roms/*

./roms/905.z5           PASS
./roms/acorncourt.z5    PASS
./roms/advent.z5        PASS
./roms/adventureland.z5 PASS
./roms/afflicted.z8     PASS
./roms/anchor.z8        FAIL    Done but score 99/100
./roms/awaken.z5        PASS
./roms/balances.z5      FAIL    Done but score 50/51
./roms/ballyhoo.z3      PASS
./roms/curses.z5        FAIL    Done but score 450/550
./roms/cutthroat.z3     PASS
./roms/deephome.z5      PASS
./roms/detective.z5     PASS
./roms/dragon.z5        PASS
./roms/enchanter.z3     PASS
./roms/enter.z5         PASS
./roms/gold.z5          PASS
./roms/hhgg.z3          PASS
./roms/hollywood.z3     PASS
./roms/huntdark.z5      PASS
./roms/infidel.z3       PASS
./roms/inhumane.z5      PASS
./roms/jewel.z5         PASS
./roms/karn.z5          PASS
./roms/library.z5       PASS
./roms/loose.z5         PASS
./roms/lostpig.z8       FAIL    Done but score 6/7
./roms/ludicorp.z5      PASS
./roms/lurking.z3       PASS
./roms/moonlit.z5       PASS
./roms/murdac.z5        FAIL    Done but score 249/250
./roms/night.z5         PASS
./roms/omniquest.z5     PASS
./roms/partyfoul.z8     PASS
./roms/pentari.z5       PASS
./roms/planetfall.z3    PASS
./roms/plundered.z3     PASS
./roms/reverb.z5        PASS
./roms/seastalker.z3    PASS
./roms/sherlock.z5      PASS
./roms/snacktime.z8     PASS
./roms/sorcerer.z3      PASS
./roms/spellbrkr.z3     PASS
./roms/spirit.z5        PASS
./roms/temple.z5        PASS
./roms/theatre.z5       FAIL    Done but score 47/50
./roms/trinity.z4       PASS
./roms/tryst205.z5      PASS
./roms/weapon.z5        PASS
./roms/wishbringer.z3   PASS
./roms/yomomma.z8       FAIL    Done but score 34/35
./roms/zenon.z5         PASS
./roms/zork1.z5         PASS
./roms/zork2.z5         PASS
./roms/zork3.z5         PASS
./roms/ztuu.z5          PASS
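
To compare this report against a list of games, the lines can be parsed mechanically. The following is only a sketch: it assumes the output keeps exactly the layout shown above (path, PASS or FAIL, optional "Done but score X/Y"), and `summarize` is a hypothetical helper, not part of Jericho.

```python
import re

# Assumed line layout, taken from the report above:
#   <rom path>  PASS
#   <rom path>  FAIL  Done but score <got>/<max>
LINE_RE = re.compile(
    r"^(?P<rom>\S+)\s+(?P<status>PASS|FAIL)"
    r"(?:\s+Done but score (?P<got>-?\d+)/(?P<max>\d+))?\s*$"
)


def summarize(report):
    """Split a test_games.py-style report into passing and failing ROMs.

    Returns (passed, failures), where failures maps each ROM path to
    (score_reached, max_score), or to None if no score was reported.
    """
    passed = []
    failures = {}
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a result line
        if match["status"] == "PASS":
            passed.append(match["rom"])
        elif match["got"] is not None:
            failures[match["rom"]] = (int(match["got"]), int(match["max"]))
        else:
            failures[match["rom"]] = None
    return passed, failures
```

Calling `summarize` on the captured output gives the failing games together with how many points their walkthroughs fall short.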

mhauskn commented 2 years ago

Hi @IssamLaradji, many of the games you list as not achieving the max score are also stochastic (as defined in Table 2 of https://arxiv.org/pdf/1909.05398.pdf). Stochastic games depend on the environment's seed being set correctly when following the walkthrough. Could you write a small script that follows the walkthrough tutorial (https://jericho-py.readthedocs.io/en/latest/tutorial_quick.html#walkthroughs) and check, for a single game (e.g. Zork), whether you get the expected walkthrough score?
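
A minimal sketch of such a script is below. It assumes the `FrotzEnv` interface from the linked tutorial (`reset()` returning an observation and an info dict, `step()` returning observation, reward, done, info, plus `get_walkthrough()` and `get_max_score()`). The helper names `replay_walkthrough` and `check_game` are made up for illustration. It also assumes that constructing `FrotzEnv` without an explicit seed leaves the walkthrough's own seed in effect; please verify that against your installed version.

```python
def replay_walkthrough(env):
    """Replay the walkthrough bundled with `env`.

    Returns (final_score, max_score, drops), where drops lists every
    step at which the score decreased as (step, action, before, after).
    """
    _obs, info = env.reset()
    prev_score = info["score"]
    drops = []
    for step, action in enumerate(env.get_walkthrough()):
        _obs, _reward, done, info = env.step(action)
        if info["score"] < prev_score:
            drops.append((step, action, prev_score, info["score"]))
        prev_score = info["score"]
        if done:
            break
    return prev_score, env.get_max_score(), drops


def check_game(rom_path):
    """Load one game with Jericho and replay its walkthrough."""
    # Imported here so replay_walkthrough can be used with any env-like object.
    from jericho import FrotzEnv

    # Assumption: no explicit seed, so the walkthrough's seed is used.
    env = FrotzEnv(rom_path)
    try:
        return replay_walkthrough(env)
    finally:
        env.close()
```

For example, `check_game("./roms/zork1.z5")` returns the final score, the maximum score, and any steps where the score went down, which should make it easy to tell a seeding problem from an incomplete walkthrough.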

MarcCote commented 1 year ago

@IssamLaradji, can we close this issue, or is there something else we can help you with?