microsoft / jericho

A learning environment for man-made Interactive Fiction games.
GNU General Public License v2.0
261 stars 43 forks source link

Interactive objs #32

Closed mhauskn closed 4 years ago

mhauskn commented 4 years ago

This PR streamlines the main Jericho interface:

mhauskn commented 4 years ago

@MarcCote I'd love it if you could take a quick look at this PR and make sure the new Jericho interface looks reasonable. Hopefully none of the previous functionality was lost in the streamlining process.

mhauskn commented 4 years ago

The main idea behind making score private is because it is returned in the info object after every call to step or reset.

victory game_over are a little more complicated and most RL is typically concerned with the done flag and the amount of score - rather than understanding if the game was won or lost.

I'm curious about the use case in TW?

MarcCote commented 4 years ago

TextWorld users can request such information to appear in the info dict. I prefer giving as much information I can and let the user decide what they need. Also, I think victory and game_over are underused in the RL community :P.

This is where it is used in TW (note how I assumed methods returning useful information start with a get_ prefix): https://github.com/microsoft/TextWorld/blob/master/textworld/envs/zmachine/jericho.py#L50-L66

For the AP, I think it is fine to keep the get_ methods public even if they are 1) underused, 2) part of the info object.

While on the topic, maybe find_valid_actions could be renamed to start with a get_ :P.

mhauskn commented 4 years ago

Good points - I've updated to make victory, score, and game_over public and rename find_valid_actions to get_valid_actions.

mhauskn commented 4 years ago

Thanks again Marc, really appreciate your feedback!

MarcCote commented 4 years ago

@mhauskn tell me when you are about to merge this PR. I'll test it against current version of TextWorld.

mhauskn commented 4 years ago

I think it's ready for merge. Please let me know how the tests work out.

MarcCote commented 4 years ago

I think there are still a few references to LGOP in the code (I think you decided to remove it, right?).

MarcCote commented 4 years ago

Also, I've seen the framework is tracking some special RAM addresses. That's nice. Did you check if get_state and set_state need to be modified accordingly?

mhauskn commented 4 years ago

I tried to remove all references to LGOP. I'll do another check for any remaining.

Yes - fortunately the way the step loop is structured, it updates the special ram at the start of step, then looks for differences at the end of step. So thankfully I believe its not necessary to modify the state representation at all. (It may be possible make the special ram address into local variables inside of step, but this may require additional mallocs inside of step, which I'd prefer to avoid.)

MarcCote commented 4 years ago

Regarding LGOP, maybe my local version is out-of-sync. Just in case, here are the locations mentioning LGOP image

MarcCote commented 4 years ago

By the way. All of TextWorld's tests are passing :+1:

mhauskn commented 4 years ago

Great! Going to break for lunch, then try to give the seed() method you suggested a try.

MarcCote commented 4 years ago

You can start from PR #34

mhauskn commented 4 years ago

@MarcCote I changed seed() to not reset the environment, and added a note that reset() must be called for seed to take effect. Hope this is alright.

MarcCote commented 4 years ago

Yes. Totally.