WahomeKezia / Demowith_issues


Another one #3

Open WahomeKezia opened 1 year ago

WahomeKezia commented 1 year ago

We were trying to plot the rewards after each episode.

These are some of the ideas we tried.

First, we created another file, visualize.py, in the ICLR23Workshop directory.

In this file we ran the following code to visualize the results of the test_dqn function.

This code would have iterated 10000 times (we did not wait for it to finish).

The code was:

import matplotlib.pyplot as plt

# importing the function we are running from collectdata 
from collectdata import test_dqn

# creating a list to store the reward per episode 
#Total episodes are 10000 
rewards_per_episode = []
total_rewards = 0

# the loop that will run and save the rewards in the list
for episode in range(1, 10001):
    #run the test_dqn function and get the reward
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

    #add the reward to the total_rewards
    total_rewards += reward

    # if the episode number is a multiple of 500, append the average reward per episode to the rewards_per_episode list and reset the total_rewards
    if episode % 500 == 0:
        rewards_per_episode.append(total_rewards/500)
        total_rewards = 0

# declaring the range for plotting 
episodes = range(500, 10001, 500)

#plotting the rewards_per_episode vs. episodes
plt.plot(episodes, rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()

We then tried a shorter version that loops over only 5 episodes. This was the code:


import matplotlib.pyplot as plt

# Importing the function test_dqn from collectdata module which we will be running
from collectdata import test_dqn

rewards = []   # Initializing an empty list to store rewards obtained in each episode

# Looping over 5 episodes
for episode in range(1, 6):
    # Calling the test_dqn function with the given URL as argument and storing the returned reward value
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")
    # Appending the reward value to the rewards list
    rewards.append(reward)

episodes = [1, 2, 3, 4, 5]   # Initializing a list of episode numbers

# Plotting a line graph with episode numbers on x-axis and rewards on y-axis
plt.plot(episodes, rewards)
plt.xlabel('Episodes')   # Labeling the x-axis as 'Episodes'
plt.ylabel('Reward')   # Labeling the y-axis as 'Reward'
plt.title('Reward per Episode')   # Setting the title of the plot
plt.show()   # Displaying the plot on the screen

It plotted the graph, but the logic felt wrong, since (from our understanding) we were treating the 10000 episodes as actions making up one episode. So we tried recording the rewards inside the loop in collectdata:

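    rewards_per_episode = []
    while not done:
        # pick an action for the current state and apply it to the environment
        a = model.choose_action(s)
        s, r, done, truncated, info = env.step(a)
        # running total of the reward collected so far in this episode
        episode_reward += r
        # note: this appends once per step, not once per episode
        rewards_per_episode.append(episode_reward)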
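
To make the difference between steps (actions) and episodes concrete, here is a minimal sketch of recording one total reward per episode rather than one value per step. It does not use the real environment or model from collectdata: `ToyEnv` and `run_episodes` are hypothetical stand-ins, and the fixed episode length of 10 steps is an assumption chosen only for illustration. A list of 10 rewards from a single call would be consistent with one episode of 10 steps being recorded step by step, but that is a guess about what test_dqn does.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts a fixed number of steps."""

    def __init__(self, steps_per_episode=10, seed=0):
        self.steps_per_episode = steps_per_episode
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy initial state

    def step(self, action):
        self.t += 1
        reward = self.rng.random()  # random reward in [0, 1)
        done = self.t >= self.steps_per_episode
        # same 5-value shape as the env.step(a) call in the snippet above
        return self.t, reward, done, False, {}


def run_episodes(env, num_episodes):
    rewards_per_episode = []
    total_steps = 0
    for _ in range(num_episodes):      # outer loop: one pass per episode
        env.reset()
        episode_reward = 0.0           # reset the running total for each episode
        done = False
        while not done:                # inner loop: one pass per step (action)
            _, r, done, truncated, info = env.step(0)
            episode_reward += r
            total_steps += 1
        # append once, after the episode has finished
        rewards_per_episode.append(episode_reward)
    return rewards_per_episode, total_steps


rewards, steps = run_episodes(ToyEnv(steps_per_episode=10), num_episodes=5)
print(len(rewards), steps)  # 5 50
```

With 5 episodes of 10 steps each, the list holds 5 values even though 50 actions were taken. Appending inside the inner loop instead would give one entry per step.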

And this was the visualize.py file:

# importing matplotlib
import matplotlib.pyplot as plt
# importing collectdata
import collectdata

num_episodes = 10000

collectdata.test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

# Plot rewards per episode
plt.plot(range(num_episodes), collectdata.rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()

This gave us a very interesting output: a list of only 10 rewards.

It could not plot the visualization because of this ValueError:

ValueError: x and y must have same first dimension, but have shapes (10000,) and (10,)
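
The error says that the x values have 10000 entries while the y values have only 10. One way to avoid the mismatch, sketched here under assumptions, is to build the x-axis from the length of the recorded list instead of from a fixed `num_episodes`. The helpers `episode_axis` and `plot_rewards` and the reward values are hypothetical placeholders, not part of collectdata.

```python
def episode_axis(rewards):
    """Return x values 1..len(rewards) so that x and y always have the same length."""
    return list(range(1, len(rewards) + 1))


def plot_rewards(rewards):
    """Plot the recorded rewards against an axis derived from their own length."""
    # imported here so that episode_axis can be used without matplotlib installed
    import matplotlib.pyplot as plt

    plt.plot(episode_axis(rewards), rewards)
    plt.xlabel('Episodes')
    plt.ylabel('Reward')
    plt.title('Reward per Episode')
    plt.show()


# placeholder values standing in for collectdata.rewards_per_episode (10 entries)
rewards = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.8, 0.7, 0.9, 1.0]
x = episode_axis(rewards)
print(len(x), len(rewards))  # 10 10
```

Calling `plot_rewards(collectdata.rewards_per_episode)` would then draw whatever was actually recorded. This only removes the shape error. Whether those 10 values are per-step or per-episode rewards still depends on where the append happens.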

@ what can you advise on the actions and episodes that are running?