oxwhirl / smac

SMAC: The StarCraft Multi-Agent Challenge
MIT License

graphing smac_run_data.json data - 3s5z data does not match graph from paper #24

Closed GJHall closed 4 years ago

GJHall commented 4 years ago

I have not been able to reproduce the graphs based on the .json file available. This is my first time working with json file types so I may be missing something obvious.

In the code, I read in smac_run_data.json and pull out the 5 runs based on a specific map and method. In this case, I use "QMIX" and "3s5z". To get the mean, I use np.mean on the 5 runs for their respective time steps. I then calculate the standard deviation using np.std and multiply by 2 to find my bounds for shading. I am using the 2*std as a ~95% confidence interval.

How did you graph smac_run_data.json?

Here is the code I wrote to graph it.

import pandas as pd
import numpy as np
df = pd.read_json('smac_run_data.json', orient='columns')
# display maps and algorithms available
# print(df.head(10))

# select map and algorithm
row = 'QMIX'
column = '3s5z'
df1 = df.loc[row,column]
df1 = pd.DataFrame(df1)
# select run and mean type
df2 = df1.iloc[[0],[0]]
# other runs can be selected using: 
    # df2 = df1.iloc[[Run_1,Run_2,Run_3,Run_4,Run_5],[test_battle_won_mean, test_return_mean]]
# convert into numpy
df3=pd.DataFrame(df2).to_numpy()
# select the data from the list
df4 = df3[0,0]
# convert the data into numpy workable format
# not sure why this is needed to be done twice
df5=np.asarray(df4)
# select the win ratio for the selected run
df6=df5[:,1]
# pick up the time step indexes
times = df5[:,0]
# create dummy vector to initialize 
length_test=df6.shape
zero_hold = np.zeros(length_test)

for i in range(0,5):
    df2 = df1.iloc[[i],[0]]
    # other runs can be selected using: 
        # df2 = df1.iloc[[Run_1,Run_2,Run_3,Run_4,Run_5],[test_battle_won_mean, test_return_mean]]
    # convert into numpy
    df3=pd.DataFrame(df2).to_numpy()
    # select the data from the list
    df4 = df3[0,0]
    # convert the data into numpy workable format
    # not sure why this is needed to be done twice
    df5=np.asarray(df4)
    # select the win ratio for the selected run
    df6=df5[:,1]

    length_test=df6.shape

    zero_hold = np.vstack((zero_hold,df6))

# delete the zeros place holder array
battle_5_runs = zero_hold[1::]

# calculate the mean and a 2*std band around it
battle_std=np.std(battle_5_runs,axis=0)*2 # using 2 standard deviations to get relatively close to 95% confidence
battle_mean=np.mean(battle_5_runs,axis=0)
lower_std=battle_mean-battle_std
upper_std=battle_mean+battle_std

# https://htmlcolorcodes.com/
import matplotlib.pyplot as plt

plt.plot(times, battle_mean, color='#f5b041')
# plt.plot(times, battle_5_runs[0], color='blue') # sanity checking that runs have data
plt.axis([0, np.max(times), 0, 1])
plt.fill_between(times, lower_std, upper_std, facecolor='#f5b041', alpha=0.3)
plt.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
plt.xlabel('T')
plt.ylabel('Test Win Rate')

plt.show()

Edit: adding graphs for 3s5z using QMIX. This is the one I generated: [image: 3s5z]

This is the one listed in the QMIX publication (Figure 6): [image: qmix_fig_6]

Gezx commented 4 years ago

Did you generate smac_run_data.json yourself? I can't get smac_run_data.json from the code. What kind of structure does it have?

GJHall commented 4 years ago

Thank you for the reply Gezx.

No, I did not run any experiments to generate the graphs. I could not find any code in the repository to generate the graphs so I wrote my own code to extract the data from the provided json file.

I downloaded the data from: https://github.com/oxwhirl/smac/releases/download/v1/smac_run_data.json

Default .json structure I am using:

x_map
    IQL:
        test_battle_won_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, win_ratio]}
        test_return_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, mean_return]}
    VDN:
        test_battle_won_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, win_ratio]}
        test_return_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, mean_return]}
    QMIX:
        test_battle_won_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, win_ratio]}
        test_return_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, mean_return]}
    COMA:
        test_battle_won_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, win_ratio]}
        test_return_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, mean_return]}
    QTRAN:
        test_battle_won_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, win_ratio]}
        test_return_mean {Run_1, Run_2, Run_3, Run_4, Run_5: [time_step, mean_return]}

I obtained my graph by running the code from my original comment on the json data I downloaded from the smac repository. Hopefully this answers your question more fully.
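For anyone else reading the file, here is a minimal sketch of walking that nesting with the standard json module instead of pandas. It assumes the layout is map, then algorithm, then metric, then runs, as described in this comment; the helper names `read_run_data` and `load_runs` are made up for illustration and are not part of SMAC.

```python
import json

import numpy as np


def read_run_data(path):
    """Load the whole JSON file into nested Python dicts and lists."""
    with open(path) as f:
        return json.load(f)


def load_runs(data, map_name, algorithm, metric="test_battle_won_mean"):
    """Return one (N, 2) array of [time_step, value] rows per run.

    Assumes data[map_name][algorithm][metric] holds the runs, either as a
    dict keyed by run name or as a plain list (both cases are handled).
    """
    runs = data[map_name][algorithm][metric]
    values = runs.values() if isinstance(runs, dict) else runs
    return [np.asarray(run, dtype=float) for run in values]
```

With the file downloaded to the working directory, `load_runs(read_run_data("smac_run_data.json"), "3s5z", "QMIX")` should give the five QMIX runs on 3s5z, provided the assumed layout holds.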

tabzraz commented 4 years ago

In the SMAC paper I graphed the median and the 25-75% interval. Each run logs at slightly different timesteps, so you might need to do a small amount of linear interpolation between points in order to get the median (or mean) for the same timestep across runs (scipy has a function for this).
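A rough sketch of that procedure, not the exact script used for the paper: each run is linearly interpolated onto a shared timestep grid, then the median and the 25th and 75th percentiles are taken across runs. It uses `numpy.interp` as a stand-in for the scipy function; the function name `aggregate_runs` and the `num_points` parameter are assumptions made for this example.

```python
import numpy as np


def aggregate_runs(runs, num_points=200):
    """Align runs on a common grid and return (grid, lower, median, upper).

    `runs` is a list of sequences of [time_step, value] pairs, one per run.
    """
    runs = [np.asarray(run, dtype=float) for run in runs]
    # Keep the grid inside the range covered by every run, so that
    # np.interp never has to extend a run beyond its logged timesteps.
    start = max(run[:, 0].min() for run in runs)
    stop = min(run[:, 0].max() for run in runs)
    grid = np.linspace(start, stop, num_points)

    aligned = []
    for run in runs:
        # np.interp requires increasing x values, so sort by timestep first.
        order = np.argsort(run[:, 0])
        aligned.append(np.interp(grid, run[order, 0], run[order, 1]))
    aligned = np.vstack(aligned)

    # Percentiles across runs (axis 0) at each grid point.
    lower, median, upper = np.percentile(aligned, [25, 50, 75], axis=0)
    return grid, lower, median, upper
```

The result can be plotted with `plt.plot(grid, median)` and `plt.fill_between(grid, lower, upper, alpha=0.3)`. Swapping `np.percentile` for `np.mean` and `np.std` on `aligned` gives the mean and standard deviation band instead.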

The data you're graphing is from the SMAC paper (https://arxiv.org/abs/1902.04043), which has a slightly different StarCraft setup compared to the original QMIX paper (which that figure is from). We also use a slightly different architecture for the mixing network that leads to much better results.

GJHall commented 4 years ago

Yes, this makes sense.

I have been using Mendeley to archive papers I am reading so I missed the jump from February to December when y'all updated SMAC to version 1.0.

February 2019: [image: feb_26_qmix]

December 2019: [image: dec_19_qmix]

That is quite a jump in performance. Nice work.

I concur that the December results match the .json file provided.

Thank you for your help and your time.