njustesen / botbowl

A Blood Bowl AI framework.
http://www.bot-bowl.com

Wrong info in Gym tutorial #239

Open njustesen opened 2 years ago

njustesen commented 2 years ago

I don't believe this is true in https://njustesen.github.io/botbowl/gym.html:

"The action space is discrete, the action is an int in the range 0 <= action_idx < len(action_mask)."

mrbermell commented 2 years ago

Can you elaborate? From what I can tell it's correct.

njustesen commented 2 years ago

If I can choose between blocking player A or player B, the sentence in the tutorial says that my action integer can be 0 or 1. But since it is a spatial action, it has to be at least the number of non-spatial actions to get past if action_idx < len(self.env_conf.simple_action_types): in _compute_action(self, action_idx: Optional[int], flip: Optional[bool] = None) -> List[Optional[Action]]:.

Instead, the integer is in the range [0, len(action_space)), which is implicit.

njustesen commented 2 years ago

Unless len(action_mask)=len(action_space) but that's just confusing, right?

mrbermell commented 2 years ago

Thanks for the clarification, I see your point and agree. We should explain how the action mask works here. I'll see what I can do!

mrbermell commented 2 years ago

How about something along these lines?

Action space

In botbowl's core engine, all actions have a type, and some types also require a position. Read more about actions in the scripted bot tutorials. The gym environment unrolls the spatial dimension into a one-dimensional action space (see picture below). This makes it easy to use state-of-the-art algorithms, but note that the action space is orders of magnitude larger than in many standard reinforcement learning benchmarks.
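
To make the unrolling concrete, here is a minimal sketch of how a flat index could map to an action type and a position. It assumes the non-spatial types come first (as the `action_idx < len(simple_action_types)` check mentioned earlier in this thread suggests), followed by one block of board squares per spatial type. The sizes and the helper names `flatten` and `unflatten` are made up for illustration and are not botbowl's API; the real ordering inside the spatial part may differ.

```python
from typing import Optional, Tuple

# Hypothetical sizes, chosen only for illustration (not botbowl's real values).
NUM_SIMPLE = 3           # non-spatial action types, e.g. "end turn"
NUM_SPATIAL = 2          # action types that also need a position
HEIGHT, WIDTH = 4, 5     # board size in squares
SQUARES = HEIGHT * WIDTH
ACTION_SPACE_SIZE = NUM_SIMPLE + NUM_SPATIAL * SQUARES

Position = Tuple[int, int]  # (x, y)


def flatten(type_idx: int, pos: Optional[Position] = None) -> int:
    """Map (action type, optional position) to a single flat index."""
    if pos is None:
        assert 0 <= type_idx < NUM_SIMPLE, "non-spatial type expected"
        return type_idx
    spatial_type = type_idx - NUM_SIMPLE
    assert 0 <= spatial_type < NUM_SPATIAL, "spatial type expected"
    x, y = pos
    assert 0 <= x < WIDTH and 0 <= y < HEIGHT, "position off the board"
    return NUM_SIMPLE + spatial_type * SQUARES + y * WIDTH + x


def unflatten(action_idx: int) -> Tuple[int, Optional[Position]]:
    """Inverse of flatten: recover the action type and position."""
    assert 0 <= action_idx < ACTION_SPACE_SIZE
    if action_idx < NUM_SIMPLE:
        return action_idx, None          # non-spatial: no position
    offset = action_idx - NUM_SIMPLE     # skip past the non-spatial block
    spatial_type, square = divmod(offset, SQUARES)
    y, x = divmod(square, WIDTH)
    return NUM_SIMPLE + spatial_type, (x, y)
```

With this layout, any action that needs a position always gets an index of at least `NUM_SIMPLE`, which is why "0 or 1" cannot be right for a choice between two block targets.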

The action passed to the environment is an integer, let's say action_idx = 352. You call env.step(action_idx) to step the environment with your action. But not all actions are legal at all times; this is where the action mask comes in. The action_mask is a vector of booleans that marks the legal actions. To check whether your action action_idx is legal, simply check that action_mask[action_idx] is true.
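
A small sketch of how the mask could be used in practice. It treats `action_mask` as a plain sequence of booleans with one entry per index in the action space; how you obtain the mask from the environment is left out here, and the helpers `is_legal` and `sample_legal_action` are hypothetical names, not part of botbowl.

```python
import random
from typing import Sequence


def is_legal(action_mask: Sequence[bool], action_idx: int) -> bool:
    """True if action_idx is inside the action space and marked legal."""
    return 0 <= action_idx < len(action_mask) and bool(action_mask[action_idx])


def sample_legal_action(action_mask: Sequence[bool], rng: random.Random) -> int:
    """Pick a uniformly random index among the legal actions."""
    legal = [i for i, ok in enumerate(action_mask) if ok]
    if not legal:
        raise ValueError("action mask contains no legal action")
    return rng.choice(legal)


# Toy mask over a 43-entry action space with two legal actions.
example_mask = [False] * 43
example_mask[1] = True    # a non-spatial action
example_mask[40] = True   # a spatial action at some square

action_idx = sample_legal_action(example_mask, random.Random(0))
# In a real loop you would now call env.step(action_idx).
```

Note that the legal indices here are 1 and 40, not 0 and 1: the mask has the same length as the action space, and legal actions keep their position in it.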

image

njustesen commented 2 years ago

This is better!

If the scripted bot tutorials contain important info about the action space, I think it should be included here. What are the paragraphs you are thinking of?