Evaluating ML-Agents Soccer Twos Model

Purpose of Evaluation

Evaluation serves to:

Measure how well the trained policy actually performs in matches, beyond the training reward curve.
Reveal the agent's strategic strengths, weaknesses, and emergent behaviors.
Guide iterative improvement of training parameters and reward structure.

Setting Up Evaluation Matches

Disable Training Mode

When evaluating, you want to run the model in inference mode so that no learning occurs. This is done with the --inference flag:
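For example, assuming you trained with a config at config/poca/SoccerTwos.yaml, an existing run ID of SoccerTwos, and a build at builds/SoccerTwos (all three are placeholders for your own paths):

mlagents-learn config/poca/SoccerTwos.yaml --env=builds/SoccerTwos --run-id=SoccerTwos --resume --inference

Use Multiple Environment Instances

To get more robust results, it's often beneficial to run multiple instances of the environment simultaneously, using the --num-envs option:

mlagents-learn config/poca/SoccerTwos.yaml --env=builds/SoccerTwos --run-id=SoccerTwos --resume --inference --num-envs=10

This runs 10 simultaneous matches, providing more data points for evaluation.

Disable Graphics (Optional)

For faster evaluation, especially when running multiple instances, you can disable graphics rendering with the --no-graphics flag:

mlagents-learn config/poca/SoccerTwos.yaml --env=builds/SoccerTwos --run-id=SoccerTwos --resume --inference --num-envs=10 --no-graphics

Logging Evaluation Results

Built-in Metrics

ML-Agents automatically logs some metrics during evaluation, such as cumulative reward. These can be viewed in TensorBoard, which reads from the results directory that mlagents-learn writes to:

tensorboard --logdir results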
Custom Metrics
For more specific evaluation metrics (e.g., win rate, goals scored), you'll need to implement custom logging:
a. Modify your Unity environment to track these metrics, e.g. by recording them with Academy.Instance.StatsRecorder.
b. Use a Unity SideChannel to send this data to Python (the built-in StatsSideChannel is the Python-side counterpart of StatsRecorder).
c. In your Python script, receive this data and log it using TensorBoard or a custom logging solution.
Example of logging custom metrics. The sketch below uses the low-level mlagents_envs API with a random placeholder policy; the stat keys it reads ("win", "goals_scored", and so on) are examples that must match the names your Unity code passes to StatsRecorder:
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from mlagents_envs.side_channel.stats_side_channel import StatsSideChannel

def evaluate_soccer_twos(env_path, num_episodes=100):
    # Set up side channels: one to configure the engine, one to receive
    # the custom stats your Unity code records with StatsRecorder
    engine_configuration_channel = EngineConfigurationChannel()
    stats_channel = StatsSideChannel()

    # Create the environment
    env = UnityEnvironment(
        file_name=env_path,
        side_channels=[engine_configuration_channel, stats_channel],
        no_graphics=True,
        worker_id=0,
    )

    # Speed up the simulation for faster evaluation
    engine_configuration_channel.set_configuration_parameters(time_scale=20.0)
    env.reset()
    behavior_name = list(env.behavior_specs)[0]  # Get the behavior name
    spec = env.behavior_specs[behavior_name]

    # Initialize metrics
    wins = 0
    total_goals_scored = 0.0
    total_goals_conceded = 0.0
    total_possession_time = 0.0
    successful_passes = 0.0
    total_passes = 0.0

    def latest(stats, key, default=0.0):
        # get_and_reset_stats() maps each key to a list of
        # (value, aggregation_method) tuples; take the most recent value
        values = stats.get(key)
        return values[-1][0] if values else default

    for _ in range(num_episodes):
        env.reset()
        done = False
        while not done:
            decision_steps, terminal_steps = env.get_steps(behavior_name)
            if len(terminal_steps) > 0:
                done = True
                continue
            # Placeholder policy: random actions. Replace this with
            # actions from your trained model
            action = spec.action_spec.random_action(len(decision_steps))
            env.set_actions(behavior_name, action)
            env.step()

        # Read the custom stats Unity sent during this episode. The keys
        # below are examples; they must match the names your Unity code
        # passes to StatsRecorder.Add()
        stats = stats_channel.get_and_reset_stats()
        if latest(stats, "win") > 0:
            wins += 1
        total_goals_scored += latest(stats, "goals_scored")
        total_goals_conceded += latest(stats, "goals_conceded")
        total_possession_time += latest(stats, "possession_time")
        successful_passes += latest(stats, "successful_passes")
        total_passes += latest(stats, "total_passes")

    env.close()

    # Log the results
    print(f"Win Rate: {wins / num_episodes * 100}%")
    print(f"Average Goals Scored: {total_goals_scored / num_episodes}")
    print(f"Average Goals Conceded: {total_goals_conceded / num_episodes}")
    print(f"Average Possession Time: {total_possession_time / num_episodes} seconds")
    if total_passes > 0:
        print(f"Pass Accuracy: {successful_passes / total_passes * 100}%")

    # Return the aggregated metrics so callers can compare runs
    return {
        "win_rate": wins / num_episodes,
        "goals_scored": total_goals_scored / num_episodes,
        "goals_conceded": total_goals_conceded / num_episodes,
        "possession_time": total_possession_time / num_episodes,
    }

if __name__ == "__main__":
    env_path = "path/to/your/SoccerTwos_build"  # Update this with your actual build path
    evaluate_soccer_twos(env_path, num_episodes=100)
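Beyond printing, you can also write these aggregate numbers to TensorBoard so they sit alongside the built-in training metrics. A minimal sketch using torch.utils.tensorboard (PyTorch is already installed as a dependency of the ml-agents trainers; the log directory and tag names here are arbitrary choices):

from torch.utils.tensorboard import SummaryWriter

# Write the aggregated evaluation metrics under a dedicated log directory
# so TensorBoard shows them next to the training runs in results/
writer = SummaryWriter("results/SoccerTwos_eval")
metrics = evaluate_soccer_twos("path/to/your/SoccerTwos_build")
for key, value in metrics.items():
    writer.add_scalar(f"evaluation/{key}", value, global_step=0)
writer.close()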
Analyzing Evaluation Results
Quantitative Analysis:
Win rate: Percentage of matches won.
Goals scored/conceded: Average per match.
Possession time: How long the agent controls the ball.
Pass accuracy: Successful passes vs. total attempts.
Qualitative Analysis:
Watch gameplay videos to assess strategy and behavior.
Look for emergent behaviors or unexpected strategies.
Comparative Analysis:
Compare performance against baseline models or previous versions (see the sketch after this list).
Evaluate against different opponent strategies.
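As a sketch of such a comparison (the build paths below are placeholders for your own), you can reuse evaluate_soccer_twos from the example above, since it returns the aggregated metrics:

# Hypothetical build paths for two versions of the agent -- substitute your own
builds = {
    "baseline": "builds/SoccerTwos_baseline",
    "candidate": "builds/SoccerTwos_candidate",
}

results = {}
for name, path in builds.items():
    print(f"--- Evaluating {name} ---")
    results[name] = evaluate_soccer_twos(path, num_episodes=100)

# A positive delta means the candidate wins more often than the baseline
delta = results["candidate"]["win_rate"] - results["baseline"]["win_rate"]
print(f"Win-rate change vs. baseline: {delta * 100:+.1f} percentage points")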
Iterative Improvement
Based on evaluation results:
Identify weaknesses in the model's performance.
Adjust training parameters or reward structure.
Retrain the model with improvements.
Re-evaluate to measure the impact of changes.
Remember, evaluation is an iterative process. You may need to go through several cycles of training, evaluation, and adjustment to achieve the desired performance in the Soccer Twos environment.