vmicheli closed this issue 3 years ago
Weird! I just tried your two commands on the master branch and got:
Oracle: [0/11|(0): take red hot pepper > go east > open screen door > cook red hot pepper with BBQ > go east > go north > take red apple from counter > cook red apple with stove > take knife from counter > slice red apple with knife > chop red hot pepper with knife > open fridge > take white onion from fridge > go south > go west > cook white onion with BBQ > chop white onion with knife > go east > go north > prepare meal > eat meal]
Edit: master is in sync with 1.4.3
I just tried the commands again and it works now. That's even weirder ahah.
Anyway, it's time for some long-range language modeling. I'll let you know if I get interesting results!
Ok. Let me know if that happens again. Maybe there's some stochastic bug hidden in the oracle's trajectory computation! Also, as you can see, the oracle assumes initial knowledge of the recipe (I couldn't find a workaround yet).
Something I just thought of while writing this is that we could split the game in two:
Hmm, also something I just noticed in the oracle's trajectory above: there's no "examine cookbook" command.
:(
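For anyone who wants to check this on their own hint output, here is a minimal sketch that splits an oracle hint line into individual commands and tests whether "examine cookbook" is among them. It only assumes the line format shown above (a bracketed header, a colon, then commands separated by ">"); the helper name `parse_oracle_hint` is hypothetical and not part of TextWorld, and the sample line is an abbreviated version of the trajectory above.

```python
import re

# Assumed format, taken from the hint line above:
#   Oracle: [<header>: cmd > cmd > ... > cmd]
# The header (e.g. "0/11|(0)") is skipped without being interpreted.
HINT_PATTERN = re.compile(r"^Oracle:\s*\[(?P<header>[^:]*):\s*(?P<commands>.*)\]\s*$")


def parse_oracle_hint(line):
    """Return the list of commands found in an 'Oracle: [...]' hint line.

    Hypothetical helper, not a TextWorld function.
    """
    match = HINT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("not an oracle hint line: %r" % line)
    # Commands are separated by '>'; drop surrounding whitespace and empties.
    return [cmd.strip() for cmd in match.group("commands").split(">") if cmd.strip()]


# Abbreviated sample of the trajectory quoted above.
hint = (
    "Oracle: [0/11|(0): take red hot pepper > go east > open screen door"
    " > prepare meal > eat meal]"
)
commands = parse_oracle_hint(hint)
print(len(commands), "commands")
print("examine cookbook" in commands)  # False for this sample
```

Running the same check on the full hint line would flag any trajectory that skips the cookbook before those demonstrations end up in a dataset.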
At the moment I'm doing the data collection by playing the game myself with the assistance of the oracle. Hopefully only a few tens of demonstrations are necessary before moving on to RL.
But if we wanted to automate the data collection, then, as you pointed out, the agent would first need to find the cookbook (task 1), then examine it and proceed (task 2).
Hey,
I'm unable to access oracle policy commands for tw-cooking games which were introduced a couple of months ago: https://github.com/microsoft/TextWorld/pull/261
I generated a game with:
tw-make tw-cooking --recipe 3 --take 3 --cook --cut --open --go 12 --split train --output tw_games/tw-game.z8 --seed 11985
and tried to play it with:
tw-play --hint tw_games/tw-game.z8
but oracle policy commands are not displayed.
Am I doing something wrong with the game generation or the play command?