UPocek commented 1 year ago

Reinforcement learning for 3D Volleyball game

Team members:

The goal is to train agents to move successfully in 3D space and to play volleyball. We will build the game ourselves using Unity and the C# programming language, while the agent's logic will be written in Python. The environment is a volleyball court with real-court proportions and a full physics simulation. The actions available to the agent are moving in any direction, jumping, and spiking; the main goal of the project is for the agent to learn which combination of moves leads it to win points and the match. A point is won when the ball touches the opponent's side of the court or when the opponent makes a mistake (hits the ball out of the court or in such a way that it does not cross the net); otherwise the point is lost. We will create the game (the environment), train the agent with several different models on two machines in parallel, tune the hyperparameters, and evaluate which algorithm, and with which parameters, performed best for our problem.
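The point rule described above can be sketched in Python as follows. This is only an illustration, not the project's implementation: the `RallyEnd` events, the side labels `"A"`/`"B"`, and the +1/-1 reward values are all assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class RallyEnd(Enum):
    """Hypothetical ways a rally can end (names invented for this sketch)."""
    BALL_LANDED_IN_COURT = "ball_landed_in_court"
    BALL_HIT_OUT = "ball_hit_out"
    BALL_FAILED_TO_CROSS_NET = "ball_failed_to_cross_net"


def other(side: str) -> str:
    """Return the opposing side label."""
    return "B" if side == "A" else "A"


def point_winner(event: RallyEnd,
                 landed_on: Optional[str] = None,
                 last_touch: Optional[str] = None) -> str:
    """Decide which side wins the point.

    - Ball lands inside a court: the side it landed on loses the point.
    - Ball hit out or not over the net: the side that touched it last loses.
    """
    if event is RallyEnd.BALL_LANDED_IN_COURT:
        return other(landed_on)
    return other(last_touch)


def point_reward(agent_side: str, winner: str) -> float:
    """Assumed sparse reward: +1 for winning a point, -1 for losing it."""
    return 1.0 if winner == agent_side else -1.0
```

For example, if side `"A"` hits the ball out, `point_winner(RallyEnd.BALL_HIT_OUT, last_touch="A")` gives the point to `"B"`, and the agent on side `"A"` would receive the assumed reward of -1.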

Two algorithms will be used, one on-policy and one off-policy: Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), respectively. They will be trained independently and will eventually compete against each other to evaluate the methodologies and algorithms used for training.
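One practical consequence of comparing these two algorithms: standard DQN picks from a finite set of actions, so "moving in any direction" would need to be discretized for it, while PPO can work with either discrete or continuous actions. A minimal sketch of one possible discretization is shown below; the move names and the decision to combine move, jump, and spike into a single index are assumptions for illustration, not part of the proposal.

```python
from itertools import product

# Hypothetical discretization of the agent's controls.
MOVES = ("none", "forward", "backward", "left", "right")
JUMP = (False, True)
SPIKE = (False, True)

# Every combination of move, jump, and spike becomes one discrete action.
ACTIONS = list(product(MOVES, JUMP, SPIKE))


def decode_action(index: int) -> dict:
    """Map a discrete action index (as a Q-network would output) to controls."""
    move, jump, spike = ACTIONS[index]
    return {"move": move, "jump": jump, "spike": spike}
```

With these assumed values the action space has 5 x 2 x 2 = 20 entries, which would be the size of the Q-network's output layer.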

The reward the agent receives while acting and learning, measured as the cumulative reward (the sum of all rewards received so far) as a function of the number of steps.
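As a small sketch of this metric, the cumulative reward curve is the running sum of the per-step rewards. The function name and the sample reward values below are made up for the example.

```python
from itertools import accumulate
from typing import Iterable, List


def cumulative_reward(step_rewards: Iterable[float]) -> List[float]:
    """Return the running sum of rewards: entry i is the total after step i."""
    return list(accumulate(step_rewards))


# Hypothetical per-step rewards from a short episode.
curve = cumulative_reward([0.0, 1.0, -1.0, 1.0])
```

Plotting `curve` against the step index gives the cumulative reward as a function of the number of steps.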

The number of points scored through an active move (that is, points that did not result solely from an opponent's mistake) relative to the total number of moves played, and the agent's performance when playing against a human.
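One way to read the first part of this measure is as a simple ratio, sketched below. The function name, its arguments, and the choice to return 0.0 when no moves were played are assumptions for illustration.

```python
def active_point_rate(active_points: int, total_moves: int) -> float:
    """Points won by an active move divided by the total moves played.

    Points that came only from an opponent's mistake are assumed to be
    excluded from `active_points` before calling this function.
    """
    if total_moves == 0:
        return 0.0  # assumed convention: no moves played means a rate of zero
    return active_points / total_moves
```

For instance, 3 actively won points over 12 moves would give a rate of 0.25.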