microsoft / CyberBattleSim

An experimentation and research platform to investigate the interaction of automated agents in abstract simulated network environments.
MIT License
1.66k stars 261 forks

Correction of benchmark results #87

Open kvas7andy opened 2 years ago

kvas7andy commented 2 years ago

Hi everyone,

I found several bugs while checking the code of the ipynb notebooks with benchmark results for three environments: TinyToy, ToyCTF, and Chain.

I think my findings might be useful for the community that uses this nice implementation of cyberattack simulation.

MOVED TO SEPARATE ISSUE #115

  1. Issue 1: learner.epsilon_greedy_search(...) is used to train agents with different algorithms, including DQL in dql_run. However, dql_exploit_run, which takes the network trained in dql_run as its policy agent and uses the eval_episode_count parameter for the number of episodes, gives the impression that these runs evaluate the trained DQN. The only distinguishing difference between the two runs is that epsilon is equal to 0, which puts training into exploitation mode but does not exclude training: during a run with learner.epsilon_greedy_search, optimizer.step() is executed on each step, inside the learner.on_step(...) call in agent_dql.py.
  1. Issue 2: During training, each episode ends only when the maximum number of iterations is reached, which is due to a mistake in the AttackerGoal class. The default value of the parameter own_atleast_percent: float = 1.0 is combined with the other conditions using AND when raising the flag done = True. For TinyToy and ToyCTF (not Chain), this leads to long training times, a wrong RL signal for evaluating the Q function, and low sample efficiency.
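A minimal sketch of the effect described in Issue 2, not the library's actual code. The `Goal` class and `goal_reached` function are hypothetical stand-ins, and the node counts (6 owned out of 10) are made-up numbers chosen only for illustration. It shows how a percentage threshold that defaults to 1.0 and is combined with AND keeps `done` false until every node is owned:

```python
from dataclasses import dataclass


@dataclass
class Goal:
    """Hypothetical stand-in for AttackerGoal with only the two fields discussed."""
    own_atleast: int = 0
    own_atleast_percent: float = 1.0  # default value named in the issue


def goal_reached(goal: Goal, owned: int, total: int) -> bool:
    # Both sub-goals must hold (logical AND), as described in Issue 2.
    return owned >= goal.own_atleast and owned / total >= goal.own_atleast_percent


# Illustrative numbers only: 6 owned nodes in a 10-node network.
default_goal = Goal(own_atleast=6)
explicit_goal = Goal(own_atleast=6, own_atleast_percent=0.0)

# With the default, 6/10 < 1.0, so the episode can only end by hitting the iteration cap.
done_with_default = goal_reached(default_goal, owned=6, total=10)
# Overriding the percentage lets the node-count goal end the episode.
done_with_override = goal_reached(explicit_goal, owned=6, total=10)
print(done_with_default, done_with_override)
```

In this sketch, setting the percentage threshold explicitly is one way to let the node-count goal terminate the episode; whether that matches the intended fix in the repository is an assumption.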

  1. Issue 3: The ToyCTF benchmark is inaccurate: with a correct evaluation procedure, like the one used for the chain network configuration, the agent does not reach the goal of 6 owned nodes after 200 training episodes.
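Issues 1 and 3 both depend on what counts as an evaluation run. The following is a minimal sketch, assuming a hypothetical `CountingLearner` and a hypothetical `train` flag (neither is part of the CyberBattleSim API), of why setting epsilon to 0 alone does not freeze the policy: the update hook still fires on every step unless it is skipped explicitly.

```python
import random


class CountingLearner:
    """Hypothetical learner that counts calls to its update hook."""

    def __init__(self) -> None:
        self.updates = 0

    def exploit(self, observation: int) -> int:
        return 0  # greedy action of a fixed toy policy

    def explore(self) -> int:
        return random.randrange(2)

    def on_step(self, observation: int, action: int, reward: float, done: bool) -> None:
        # Stands in for a gradient update such as optimizer.step().
        self.updates += 1


def run_episodes(learner, episode_count, steps_per_episode, epsilon, train):
    """Toy epsilon-greedy loop; `train` is an assumed flag, not a library parameter."""
    for _ in range(episode_count):
        observation = 0
        for step in range(steps_per_episode):
            if random.random() < epsilon:
                action = learner.explore()
            else:
                action = learner.exploit(observation)
            reward = float(action == 0)
            done = step == steps_per_episode - 1
            if train:
                learner.on_step(observation, action, reward, done)


learner = CountingLearner()
# epsilon = 0 but the update hook still runs: this is still training.
run_episodes(learner, episode_count=2, steps_per_episode=5, epsilon=0.0, train=True)
updates_with_training = learner.updates
# Evaluation: same greedy policy, update hook skipped, counter unchanged.
run_episodes(learner, episode_count=2, steps_per_episode=5, epsilon=0.0, train=False)
updates_after_eval = learner.updates
print(updates_with_training, updates_after_eval)
```

A benchmark that reports evaluation results would use the second kind of run, so that the reported episodes do not themselves change the network.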

blumu commented 2 years ago

@kvas7andy Thanks for filing this issue with a detailed explanation. Could we split this into three separate issues to facilitate the discussion?

kvas7andy commented 2 years ago

Hi @blumu, sure, let's split it into three. The only thing is that I will get back to the discussion tomorrow.

blumu commented 1 year ago

@kvas7andy Does your commit above address all three problems mentioned in this issue, or just some of them? (By the way, if you could split them into separate bugs, that would be helpful.) Many thanks!

blumu commented 1 year ago

I moved Issue 1 to a separate issue, #115.