tesslerc / malmo_rl

Agent Walking Sideways to the right on command `move 1` #11

Phantomb closed this issue 6 years ago

Phantomb commented 6 years ago

Another thing I just noticed (specifically in my `subskill_pickup` domain) is that the agent fairly often seems to strafe to the right, even when I only issue `move 1` commands. This seems very strange. Could it be that Malmo is not getting enough time to finish executing the command?

Have you encountered something like this?

tesslerc commented 6 years ago

I haven't seen this. Does it happen often, or did you observe it just once? I ask because, if you think it is caused by the time Malmo is given to execute a command, you can try increasing the ms/tick value.

I have not observed this behavior myself, though. It feels like an issue with the Malmo simulator, or with the agent not standing in the center of a block.
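For anyone who wants to try the ms/tick suggestion, here is a minimal sketch of raising the tick length directly in a mission XML string. It assumes the mission uses Malmo's `ModSettings`/`MsPerTick` element and the usual `http://ProjectMalmo.microsoft.com` namespace. The helper name `set_ms_per_tick` is hypothetical and is not part of this repo, which may expose the value through its own parameters instead.

```python
import xml.etree.ElementTree as ET

# Namespace used by Malmo mission files (assumed; check your own mission XML).
MALMO_NS = "http://ProjectMalmo.microsoft.com"


def set_ms_per_tick(mission_xml: str, ms_per_tick: int) -> str:
    """Return a copy of mission_xml with <ModSettings><MsPerTick> set.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", MALMO_NS)
    ns = {"m": MALMO_NS}
    root = ET.fromstring(mission_xml)

    mod = root.find("m:ModSettings", ns)
    if mod is None:
        # Assumption: ModSettings belongs right after <About> in the schema order.
        mod = ET.Element(f"{{{MALMO_NS}}}ModSettings")
        about = root.find("m:About", ns)
        index = list(root).index(about) + 1 if about is not None else 0
        root.insert(index, mod)

    tick = mod.find("m:MsPerTick", ns)
    if tick is None:
        # Assumption: MsPerTick is the first child of ModSettings.
        tick = ET.Element(f"{{{MALMO_NS}}}MsPerTick")
        mod.insert(0, tick)

    tick.text = str(ms_per_tick)
    return ET.tostring(root, encoding="unicode")
```

A larger value gives the simulator more wall-clock time per game tick, at the cost of slower training, so it is probably best used as a diagnostic rather than a permanent setting.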

Phantomb commented 6 years ago

I saw this in at least one training session, where the agent repeatedly walked sideways over many episodes. After restarting the Malmo instance and starting a new training run (so not continuing from a previous save), I have not seen it again on this machine. I am unsure whether I have seen it in other training sessions.

What worries me a bit is that I mostly train with a low `visualization_frequency`, so it might have happened more often without my knowing.
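One way to catch this without watching the renders would be to compare the agent's position before and after each `move 1` command. The sketch below is only an illustration: it assumes the mission requests `ObservationFromFullStats` (so that `XPos`, `ZPos` and `Yaw` are available) and uses the Minecraft convention that yaw 0 faces +Z and yaw 90 faces -X. The function names and the tolerance are hypothetical.

```python
import math


def split_displacement(x0, z0, x1, z1, yaw_deg):
    """Split a horizontal displacement into (forward, rightward) parts.

    Assumes Minecraft's yaw convention: 0 faces +Z, 90 faces -X.
    """
    yaw = math.radians(yaw_deg)
    fwd_x, fwd_z = -math.sin(yaw), math.cos(yaw)   # unit vector the agent faces
    right_x, right_z = -fwd_z, fwd_x               # unit vector to the agent's right
    dx, dz = x1 - x0, z1 - z0
    forward = dx * fwd_x + dz * fwd_z
    rightward = dx * right_x + dz * right_z
    return forward, rightward


def moved_sideways(before, after, tolerance=0.25):
    """Return True if a forward command produced noticeable lateral motion.

    `before` and `after` are observation dicts with XPos, ZPos and Yaw keys.
    The tolerance (in blocks) is an arbitrary illustrative choice.
    """
    _, rightward = split_displacement(
        before["XPos"], before["ZPos"],
        after["XPos"], after["ZPos"],
        before["Yaw"],
    )
    return abs(rightward) > tolerance
```

Logging a counter of `moved_sideways` hits per episode would show how often the drift occurs, even when visualization is mostly disabled.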

I also believe it's mostly a Malmo simulator issue, so periodically restarting the host might be enough to prevent it. I was mostly curious whether you had encountered this too.

Phantomb commented 6 years ago

Just wanted to add that I am seeing this behaviour again, on a different machine with a different OS (Linux this time), while training on a different domain. I suppose it must be Malmo, but I am unsure what causes it. Maybe it has something to do with the setyaw at the start, or with a turn not registering? I'm opening an issue on the Malmo repo.
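To test the yaw and block-center hypotheses, a check like the following could be run on the first observation of each episode and after every turn. It is a rough sketch under the assumption that the agent is meant to sit on block centers (coordinates ending in .5) and face a multiple of 90 degrees. The function name and tolerances are hypothetical.

```python
def is_grid_aligned(x, z, yaw_deg, pos_tol=0.05, yaw_tol=1.0):
    """Return True if the agent is near a block center and faces a cardinal direction.

    pos_tol is in blocks, yaw_tol in degrees; both are illustrative defaults.
    """
    # Distance of each coordinate from the nearest block center (n + 0.5).
    off_x = abs((x % 1.0) - 0.5)
    off_z = abs((z % 1.0) - 0.5)
    # Angular distance from the nearest multiple of 90 degrees.
    yaw_err = abs(((yaw_deg + 45.0) % 90.0) - 45.0)
    return off_x <= pos_tol and off_z <= pos_tol and yaw_err <= yaw_tol
```

If this check starts failing just before the sideways walking appears, that would point toward a missed turn or a bad initial yaw rather than a timing problem.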

tesslerc commented 6 years ago

It could be, though I would expect Malmo to simply move the agent and not perform any logging as to where the simulator thinks the agent is. I really hope this isn't an issue with this repo. My main suggestion, as you proposed, is to allow Malmo more time by increasing the ms/tick.