techandy42 / torchgym

A PyTorch library that provides major RL algorithms and functionalities for training OpenAI Gym agents.

torchgym

About

Installation

Structure

Training Model

dqn_train Function Specification

```py
from typing import Callable

# The default hyperparameters are tuned for the MountainCar-v0 environment.
def dqn_train(
  env_name: str, # a valid OpenAI Gym environment name from Classic Control or Box2D
  num_episodes: int, # the number of training episodes
  episode_length: int, # (default: 10000) the maximum number of steps per episode; set this to the episode truncation limit listed for the environment in the Gymnasium documentation
  learning_rate: float, # (default: 1e-3) the learning rate used to update the neural network
  gamma: float, # (default: 0.995) the discount rate
  exploration_rate: float, # (default: 0.1) the probability that the model chooses a random action during training
  capacity: int, # (default: 8000) the number of experiences stored before training of the model starts
  batch_size: int, # (default: 256) the batch size of the training data
  net_layers: list[int], # (default: [100]) the sizes of the hidden layers of the Actor Network
  optimizer_label: str, # (default: 'Adam') the name of the optimizer used; a label only, it does not affect training
  optimizer_callback: None | Callable, # (default: None) see the Optimizer Callback section
  loss_func_label: str, # (default: 'MSELoss') the name of the loss function used; a label only, it does not affect training
  loss_func_callback: None | Callable, # see the Loss Function Callback section
  model_label: None | str, # (default: None) the name of the model; a label only, it does not affect training
  saved_model_id: None | str, # (default: None) see the Training Existing Models section
  callbacks: list[str], # (default: []) optional callbacks run during training: 'record' creates a video of the model, 'plot' plots loss, number of steps, and reward, and 'save_on_max_reward' saves the model weights at the point of maximum reward
):
  ...
```

Training Example


```py
from torchgym.dqn.train import dqn_train

model_id = dqn_train(env_name='MountainCar-v0', num_episodes=1000, episode_length=200, model_label='model1', callbacks=['record', 'plot', 'save_on_max_reward'])
```

> Evaluating
- Evaluate the model's performance.
```py
from torchgym.dqn.eval import dqn_eval

if model_id is not None:
  dqn_eval(env_name='MountainCar-v0', saved_model_id=model_id) # set env_name to the environment that the model was trained on
```

> Recording
- Record the model running.
```py
from torchgym.dqn.record import dqn_record

if model_id is not None:
  dqn_record(env_name='MountainCar-v0', saved_model_id=model_id) # set env_name to the environment that the model was trained on
```

### Optimizer Callback

- You can use a custom optimizer by passing a callback that takes the network parameters and the learning rate and returns a `torch.optim` optimizer.

> Example
```py
import torch.optim as optim

optimizer_callback = lambda net_parameters, learning_rate: optim.SGD(net_parameters, lr=learning_rate)

model_id = dqn_train(env_name='MountainCar-v0', num_episodes=1000, episode_length=200, model_label='model1', optimizer_label='SGD', optimizer_callback=optimizer_callback, callbacks=['record', 'plot', 'save_on_max_reward'])
```

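The same pattern should work with a named function and any other `torch.optim` optimizer. The sketch below assumes torchgym calls the callback with exactly the two arguments used in the SGD lambda (`net_parameters`, `learning_rate`). The function name `rmsprop_optimizer_callback` and the `momentum` value are illustrative choices, and the small `nn.Linear` network is only a stand-in for the DQN model so that the snippet runs on its own.

```py
import torch.nn as nn
import torch.optim as optim

def rmsprop_optimizer_callback(net_parameters, learning_rate):
  # Same (net_parameters, learning_rate) signature as the SGD lambda.
  return optim.RMSprop(net_parameters, lr=learning_rate, momentum=0.9)

# Stand-in network (hypothetical), used only to check the callback in isolation.
net = nn.Linear(2, 3)
optimizer = rmsprop_optimizer_callback(net.parameters(), 1e-3)
print(type(optimizer).__name__, optimizer.param_groups[0]['lr'])
```

To use it for training, pass `optimizer_label='RMSprop'` and `optimizer_callback=rmsprop_optimizer_callback` to `dqn_train` in the same way as the SGD callback.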
### Loss Function Callback

> Example
```py
import torch.nn as nn

def l1_loss_func_callback(target_v, v, state, action, reward, next_state, normalized_reward):
  loss_func = nn.L1Loss()
  return loss_func(target_v, v)

model_id = dqn_train(env_name='MountainCar-v0', num_episodes=1000, episode_length=200, model_label='model1', loss_func_label='l1', loss_func_callback=l1_loss_func_callback, callbacks=['record', 'plot', 'save_on_max_reward'])
```
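As a further sketch of the same callback pattern, the Huber loss (`nn.SmoothL1Loss`) is a common choice in DQN training because it is less sensitive to large errors than MSE. The function name `huber_loss_func_callback` and the tensors below are made up for illustration. The snippet assumes that `target_v` and `v` arrive as tensors of equal shape, and the unused transition arguments are set to `None` so that it runs without an environment.

```py
import torch
import torch.nn as nn

def huber_loss_func_callback(target_v, v, state, action, reward, next_state, normalized_reward):
  # Only target_v and v are used; the other arguments are accepted
  # to match the seven-argument callback signature.
  loss_func = nn.SmoothL1Loss()
  return loss_func(v, target_v)  # prediction first, target second

# Made-up values, used only to exercise the callback in isolation.
v = torch.tensor([1.0, 2.0], requires_grad=True)
target_v = torch.tensor([1.5, 4.0])
loss = huber_loss_func_callback(target_v, v, None, None, None, None, None)
loss.backward()  # gradients flow back to the predicted values
print(loss.item())
```

To use it for training, pass `loss_func_label='huber'` (any descriptive label) and `loss_func_callback=huber_loss_func_callback` to `dqn_train`.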


### Training Existing Models

- You can continue training an existing model from the `history` directory by passing its model ID as `saved_model_id`. The further-trained model is stored in a new directory inside `history/<environment_name>`, and the original model data is not altered.

> Example
```py
new_model_id = dqn_train(env_name='MountainCar-v0', num_episodes=1000, episode_length=200, model_label='model1', callbacks=['record', 'plot', 'save_on_max_reward'], saved_model_id=model_id)
```

History

Import Helper Function Module

```py
from torchgym.functions.history import save_history, upload_history
```

Download to Local Computer

```py
save_history()
```

Download to Google Drive

```py
save_history(action='drive')
```

Upload Zipped History to Colab

```py
upload_history()
```

More

Note