Getting the same walkthrough from .get_walkthrough() regardless of seed number

ktr0921 commented 3 years ago

I keep getting the same walkthrough from .get_walkthrough() regardless of seed number. Is this normal? If so, what does the seed do in Jericho? The observations also seem to be the same for different seeds when following the same walkthrough. My code is below:

from jericho import FrotzEnv

seed = 1

env = FrotzEnv(dir_rom)  # dir_rom: path to the game ROM file
env.seed(seed=seed)
obs, _ = env.reset()
get_walkthrough = env.get_walkthrough()

I am using Jericho 3.0.4 (jericho==3.0.4).

Additionally, FrotzEnv.reset() has no use_walkthrough_seed argument.

mhauskn commented 3 years ago

I'm afraid that Jericho only "knows" a single walkthrough for each game so env.get_walkthrough() will always return the same sequence of actions for a particular game.

Whether you get the same sequence of observations when running those actions depends on the game. For games with stochastic transitions (i.e. randomness), the results of executing the walkthrough will vary with the seed you set. For deterministic games, the walkthrough results will be the same regardless of seed. Take a look at Table 2 to see which games are stochastic vs. deterministic.

When resetting Jericho, if you call seed() with no arguments, it will default to using the seed that reproduces a successful walkthrough. Note that you'll need to call reset() before the seed will take effect.

Play Zork with walkthrough seed:

env = FrotzEnv(dir_rom) # Loads the walkthrough's seed by default here.
env.seed() # We don't actually need this call, but it doesn't hurt.
obs, _ = env.reset()
get_walkthrough = env.get_walkthrough()

Play Zork with a different seed:

env = FrotzEnv(dir_rom) # Loads the walkthrough's seed by default here.
env.seed(42) # Sets a new seed.
obs, _ = env.reset() # New seed takes effect.
get_walkthrough = env.get_walkthrough() # Walkthrough actions are the same, but executing them may give different results.
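
To see the effect of the seed in practice, you can replay the walkthrough under several seeds and compare what comes back. Below is a rough sketch, not an official Jericho utility. The helper names replay_walkthrough and compare_seeds are made up for illustration. It assumes env.step() returns (observation, reward, done, info) and that info carries a "score" entry, which is how FrotzEnv behaves as far as I know.

```python
def replay_walkthrough(env, seed=None):
    """Replay the built-in walkthrough; return (observations, final_score, done).

    Hypothetical helper: `env` is assumed to expose the FrotzEnv-style methods
    seed(), reset(), get_walkthrough() and step().
    """
    if seed is None:
        env.seed()       # no argument: fall back to the walkthrough's own seed
    else:
        env.seed(seed)
    obs, info = env.reset()  # the seed only takes effect on reset()
    observations = [obs]
    score, done = info.get("score", 0), False
    for action in env.get_walkthrough():
        obs, _reward, done, info = env.step(action)
        observations.append(obs)
        score = info.get("score", score)
        if done:
            break
    return observations, score, done


def compare_seeds(env, seeds):
    """Compare each seed's replay against the replay with the walkthrough seed."""
    baseline, _, _ = replay_walkthrough(env)
    report = {}
    for s in seeds:
        observations, score, done = replay_walkthrough(env, s)
        report[s] = {
            "same_observations": observations == baseline,
            "score": score,
            "done": done,
        }
    return report
```

With a real game you would call something like compare_seeds(FrotzEnv(dir_rom), [1, 42]). For a deterministic game every entry should report same_observations as True; for a stochastic game like Zork some seeds may diverge or end with a lower score.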

ktr0921 commented 3 years ago

So,

- For a game with deterministic transitions (like 905, per Table 2), the walkthrough is guaranteed to complete the game regardless of seed.
- For a game with stochastic transitions (like Zork, per Table 2), the walkthrough is guaranteed to complete the game if the seed corresponding to the walkthrough is used.
- For a game with stochastic transitions (like Zork, per Table 2), the walkthrough may not complete the game if the seed corresponding to the walkthrough is not used.

So I guess the best way to get walkthroughs other than the one from .get_walkthrough() is just to deploy an RL agent?

mhauskn commented 3 years ago

You're correct on all 3 bullet points.

Current RL agents have trouble completing most games, so it's unlikely they'll discover new walkthroughs for you (if they do, you could probably write a very good paper about it). More broadly, it depends on how you define "different walkthroughs": it's theoretically possible to re-order the sequence of solving quests / collecting items in any of the existing walkthroughs to get a somewhat trivially different walkthrough.

If you're interested in trying to extract observations from a game (à la #39), one approach is to sample different states from the walkthrough and explore locally around each state, for example by attempting one or more valid actions. This will broaden the set of observations to a local neighborhood around the walkthrough states.
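
A minimal sketch of that local exploration, in case it helps. The function name and its parameters (n_states, actions_per_state, rng_seed) are hypothetical. It relies on get_state(), set_state() and get_valid_actions() behaving as they do on FrotzEnv, and on step() returning (observation, reward, done, info).

```python
import random


def explore_around_walkthrough(env, n_states=5, actions_per_state=3, rng_seed=0):
    """Collect observations one step away from sampled walkthrough states.

    Hypothetical helper; `env` is assumed to offer the FrotzEnv-style methods
    reset(), step(), get_walkthrough(), get_state(), set_state() and
    get_valid_actions().
    """
    rng = random.Random(rng_seed)  # only controls which states/actions are sampled
    env.reset()

    # Snapshot the emulator state before every walkthrough step.
    snapshots = [env.get_state()]
    for action in env.get_walkthrough():
        _obs, _reward, done, _info = env.step(action)
        if done:
            break
        snapshots.append(env.get_state())

    observations = set()
    for state in rng.sample(snapshots, min(n_states, len(snapshots))):
        env.set_state(state)
        valid = env.get_valid_actions()
        for action in rng.sample(valid, min(actions_per_state, len(valid))):
            env.set_state(state)  # restore the snapshot before each attempt
            obs, _reward, _done, _info = env.step(action)
            observations.add(obs)
    return observations
```

Keep in mind that get_valid_actions() can be slow on a real game, so start with small values for n_states and actions_per_state.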

ktr0921 commented 3 years ago

Thank you for your reply. I will try and see how it goes.

microsoft / jericho
