Action mask in Tictactoe

[ ] I have marked all applicable categories:
- [ ] exception-raising bug
- [ ] RL algorithm bug
- [x] documentation request (i.e. "X is missing from the documentation.")
- [ ] new feature request
[x] I have visited the source website
[x] I have searched through the issue tracker for duplicates
[ ] I have mentioned version numbers, operating system and environment, where applicable:
```
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)
```
Dear TSAIL group,

Thanks for Tianshou, an easy-to-use and efficient library for RL. While working through the MARL example, I noticed that when MultiAgentPolicyManager and Collector interact with the Tictactoe environment, invalid actions are masked. However, I could not find the code that performs the masking. The only related code I found is the observe function in the Tictactoe environment, which returns the observation together with an action_mask. How should action masking be set up in a custom environment so that Tianshou applies it? Could you provide an example? Thanks!
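To make clear what I mean by action masking, here is a minimal, library-independent sketch in plain NumPy. It is not Tianshou's code; the helper names `board_to_mask` and `masked_greedy_action` are hypothetical, and the board encoding (0 for an empty cell) is an assumption made only for illustration. The idea is that invalid actions get a value of negative infinity, so a greedy argmax can never pick them.

```python
import numpy as np


def board_to_mask(board):
    """Hypothetical helper: a cell is a legal move only if it is empty (0)."""
    return np.asarray(board).reshape(-1) == 0


def masked_greedy_action(q_values, action_mask):
    """Hypothetical helper: pick the best action among those the mask allows."""
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(action_mask, dtype=bool)
    if not mask.any():
        raise ValueError("action_mask allows no action")
    # Invalid actions are set to -inf so argmax can never select them.
    masked_q = np.where(mask, q, -np.inf)
    return int(np.argmax(masked_q))


# Assumed encoding: 1 and -1 are the two players' marks, 0 is an empty cell.
board = [[1, 0, -1],
         [0, 1, 0],
         [0, 0, -1]]
mask = board_to_mask(board)

# Made-up Q-values: the largest ones sit on occupied cells on purpose.
q = np.array([9.0, 0.1, 8.0, 0.3, 7.0, 0.2, 0.5, 0.4, 6.0])
action = masked_greedy_action(q, mask)
print(action)
```

What I am unsure about is where the equivalent of this step happens inside Tianshou, and which observation format a custom environment has to return for it to be applied.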
