microsoft / jericho

A learning environment for man-made Interactive Fiction games.
GNU General Public License v2.0
253 stars 42 forks source link

Make FrotzEnv copiable #25

Closed MarcCote closed 4 years ago

MarcCote commented 4 years ago

This PR proposes a better way of saving the state of a game compared to save_str/load_str. With this PR one simply has to do

import jericho
env = jericho.FrotzEnv("zork1.z5")
obs, info = env.reset()

fork = env.copy()
obs, _, _, _ = fork.step("open mailbox")
print(obs)  # Opening the small mailbox reveals a leaflet.
obs, _, _, _ = fork.step("open mailbox")
print(obs)  # It is already open.

obs, _, _, _ = env.step("open mailbox")
print(obs)  # Opening the small mailbox reveals a leaflet.
MarcCote commented 4 years ago

@mhauskn see also the changes to find_valid_actions and identify_interactive_objects to make use of the .copy().

mhauskn commented 4 years ago

Thanks Marc, great idea!

I'd like to check to make sure that all the bits of emulator state are being captured by the RAM/PC/SP/Stack. In particular do you know if global variables of the game are captured by the RAM?

MarcCote commented 4 years ago

I agree with you, more testing is needed. I think they are saved on the stack which is also copied (or maybe it's only the local ones?) but I don't know how to guarantee it. I think the best way to test it would be to run all available walkthroughs until completion where we fork after each command. Any ideas?

mhauskn commented 4 years ago

I think that's a great idea - let me try it.

mhauskn commented 4 years ago

Just ran the suggested test with the Jericho games we have walkthroughs for to compare the trajectories with and without env.copy() at each step. Most of the games work as expected and the walkthrough is completed with env.copy() at each step.

However, games that rely on the RNG have issues. For example in Zork, they first differ when attacking the troll - in the env.copy() case, the troll doesn't die from the 1st attack. I think this indicates that the RNG is different when copy is employed. In particular if we can initialize the RNG of the copy to the current state of the RNG in the original, that may solve the issue.

import sys
from jericho import *

rom = '/home/matthew/workspace/text_agents/roms/zork1.z5'
bindings = load_bindings(rom)
env = FrotzEnv(rom, seed=bindings['seed'])
walkthrough = bindings['walkthrough'].split('/')
for act in walkthrough:
    # env = env.copy()
    obs, rew, done, info = env.step(act)
    print(act, obs)
MarcCote commented 4 years ago

Good catch. I'll investigate too.

MarcCote commented 4 years ago

I managed to save the state of the random generator (see my last commit). I didn't realize it was a bunch of static C variables, not Z-Machine variables!!! That means saving a game and reloading cannot save the random generator state (more on this below).

Here's my test function derived from yours:


def test_zork1():
    rom = '../zork1.z5'
    bindings = jericho.load_bindings(rom)
    env = jericho.FrotzEnv(rom, seed=bindings['seed'])
    env.reset()

    walkthrough = bindings['walkthrough'].split('/')
    expected = [env.step(act) for act in walkthrough]

    env.reset()
    for i, act in enumerate(walkthrough):
        obs, rew, done, info = env.step(act)

        if i + 1 < len(walkthrough):
            fork = env.copy()
            for j, cmd in enumerate(walkthrough[i+1:], start=i+1):
                obs, rew, done, info = fork.step(cmd)
                assert (obs, rew, done, info) == expected[j]

NB: It seems the same problem exists with save_str/load_str! I reach the troll fight and saved the state before loading it back and issuing the command "kill troll with sword". I did it 10 times, here are the different output strings:

Expected string:

The fatal blow strikes the troll square in the heart:  He dies.\nAlmost as soon as the troll breathes his last breath, a cloud of sinister black fog envelops him, and when the fog lifts, the carcass has disappeared.\nYour sword is no longer glowing.\n\n
It's curtains for the troll as your sword removes his head.
Almost as soon as the troll breathes his last breath, a cloud of sinister black fog envelops him, and when the fog lifts, the carcass has disappeared.
Your sword is no longer glowing.

The troll takes a fatal blow and slumps to the floor dead.
Almost as soon as the troll breathes his last breath, a cloud of sinister black fog envelops him, and when the fog lifts, the carcass has disappeared.
Your sword is no longer glowing.

The Troll Room
This is a small room with passages to the east and south and a forbidding hole leading west. Bloodstains and deep scratches (perhaps made by an axe) mar the walls.
A nasty-looking troll, brandishing a bloody axe, blocks all passages out of the room.
Your sword has begun to glow very brightly.
The axe sweeps past as you jump aside.

The Troll Room
This is a small room with passages to the east and south and a forbidding hole leading west. Bloodstains and deep scratches (perhaps made by an axe) mar the walls.
A nasty-looking troll, brandishing a bloody axe, blocks all passages out of the room.
Your sword has begun to glow very brightly.
The axe sweeps past as you jump aside.

The troll is knocked out!

Clang! Crash! The troll parries.
The troll's axe barely misses your ear.

You charge, but the troll jumps nimbly aside.
The troll's mighty blow drops you to your knees.

The Troll Room
This is a small room with passages to the east and south and a forbidding hole leading west. Bloodstains and deep scratches (perhaps made by an axe) mar the walls.
A nasty-looking troll, brandishing a bloody axe, blocks all passages out of the room.
Your sword has begun to glow very brightly.
The troll's swing almost knocks you over as you barely parry in time.

A furious exchange, and the troll is knocked out!
MarcCote commented 4 years ago

@mhauskn I fixed the rng issue and some left-over memleaks (valgrind seems happy now).

mhauskn commented 4 years ago

Awesome job with the RNG and memory leaks! Correctness tests are now passing for all the games.

However - it seems like a bit of a performance slowdown compared to save/load for tasks like finding valid actions.

The following code takes 2mins49secs using the new copy-based version of find_valid_actions. But only 5secs using the older save/load version of find_valid_actions.

import sys
from jericho import *
from jericho.template_action_generator import TemplateActionGenerator
from tqdm import trange

rom = '/home/matthew/workspace/text_agents/roms/zork1.z5'
bindings = load_bindings(rom)
env = FrotzEnv(rom, seed=bindings['seed'])
walkthrough = bindings['walkthrough'].split('/')
act_gen = TemplateActionGenerator(bindings)
interactive_objs = [obj[0] for obj in env.identify_interactive_objects(use_object_tree=True)]
candidate_actions = act_gen.generate_actions(interactive_objs)
for _ in trange(100):
    valid = env.find_valid_actions(candidate_actions)
MarcCote commented 4 years ago

@mhauskn good catch. Forking/copy() has a lot of overhead (i.e. loading the shared library). You can try with this last commit?

mhauskn commented 4 years ago

It is working at faster speed than the older save/load and it also has the benefit of being to save in places where the previous save/load would not work. Awesome stuff Marc!

Do you think there is any use for keeping load/save functionality or we should just move to the get/set state?

MarcCote commented 4 years ago

I personally wouldn't use save/save_str and load/load_str as they don't keep the RNG state (see above, my simple experiment about that).

mhauskn commented 4 years ago

I totally agree. I'm tempted to outright remove the older load/save functions because I think the newer set/get state is better in every way. The only concern would be backwards compatibility. What do you think?

On Mon, Jan 13, 2020 at 11:29 AM Marc-Alexandre Côté < notifications@github.com> wrote:

I personally wouldn't use save/save_str and load/load_str as they don't keep the RNG state (see above, my simple experiment about that).

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/microsoft/jericho/pull/25?email_source=notifications&email_token=AAAE2OLUEGSGLYJN67BINFTQ5S6H3A5CNFSM4KE375L2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIZ7HMA#issuecomment-573830064, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAE2OIJOFXIV52MI6NXUCLQ5S6H3ANCNFSM4KE375LQ .

MarcCote commented 4 years ago

I guess the best way would be to issue a deprecation warning saying "Deprecated: use at your own risk! You should use get/set_state instead.". Then in a couple of releases, we can simply drop them.

mhauskn commented 4 years ago

Awesome, I've deprecated the load,save and updated the docs to point to the new methods: https://jericho-py.readthedocs.io/en/latest/tutorial_quick.html#loading-and-saving

On Mon, Jan 13, 2020 at 12:37 PM Marc-Alexandre Côté < notifications@github.com> wrote:

I guess the best way would be to issue a deprecation warning ( https://docs.python.org/3/library/exceptions.html#DeprecationWarning) saying "Deprecated: use at your own risk! You should use get/set_state instead.". Then in a couple of releases, we can simply drop them.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/microsoft/jericho/pull/25?email_source=notifications&email_token=AAAE2OONE4FOFB6B6CCVETDQ5TGKDA5CNFSM4KE375L2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI2GIGQ#issuecomment-573858842, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAE2OIH6QXCBWTJFAPXUCTQ5TGKDANCNFSM4KE375LQ .

mhauskn commented 4 years ago

I've also bumped version to 2.3.0 and uploaded to Pypi. Thanks again for the great PR!

On Mon, Jan 13, 2020 at 2:22 PM Matthew Hausknecht < matthew.hausknecht@gmail.com> wrote:

Awesome, I've deprecated the load,save and updated the docs to point to the new methods: https://jericho-py.readthedocs.io/en/latest/tutorial_quick.html#loading-and-saving

On Mon, Jan 13, 2020 at 12:37 PM Marc-Alexandre Côté < notifications@github.com> wrote:

I guess the best way would be to issue a deprecation warning ( https://docs.python.org/3/library/exceptions.html#DeprecationWarning) saying "Deprecated: use at your own risk! You should use get/set_state instead.". Then in a couple of releases, we can simply drop them.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/microsoft/jericho/pull/25?email_source=notifications&email_token=AAAE2OONE4FOFB6B6CCVETDQ5TGKDA5CNFSM4KE375L2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI2GIGQ#issuecomment-573858842, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAE2OIH6QXCBWTJFAPXUCTQ5TGKDANCNFSM4KE375LQ .