jerry871002 / bsi-pt

BSI-PT algorithm in the paper "Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking"
https://jerry871002.github.io/bsi-pt/

Baseball game test is failing #51

Closed jerry871002 closed 1 year ago

jerry871002 commented 1 year ago

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5841852816/job/15842391174

+ python run.py baseball bpr-okr -n 5 --new-phi-opponent -q 3
----- (bpr-okr agent, New Phi opponent) q = 3 -----
Traceback (most recent call last):
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 195, in <module>
    run(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 49, in run
    run_bpr_okr(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 355, in run_bpr_okr
    policy_preds.append(step_5_policy_preds[i])
IndexError: list index out of range

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5857902074/job/15880785779?pr=56

+ python run.py baseball bsi-pt -n 5 -e 2

----- (bsi-pt agent) Test random switch opponent -----
Traceback (most recent call last):
  File "run.py", line 195, in <module>
    run(args)
  File "run.py", line 53, in run
    run_bsi_pt(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 648, in run_bsi_pt
    policy_preds.append(step_2_policy_preds[i])
IndexError: list index out of range

jerry871002 commented 1 year ago

The cause of this issue is the following piece of code, which is executed after each episode:

https://github.com/jerry871002/bayesian-strategy-inference/blob/6c9084ae8e3ab8e2f09122c4f1e824a907ffaf88/src/baseball_game/run.py#L323-L355

If the intra-belief happens to be uniformly distributed (which is quite unlikely, so the error does not occur on every run), the code enters the first if block and the step_i_policy_preds lists fall out of alignment. The next episode then raises an IndexError, because i has been incremented but the lists have not grown to match.
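
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the code in run.py: the helper names (is_uniform, record_buggy, record_aligned), the placeholder value -1, and the example beliefs are all assumptions made for the illustration. It only shows how a branch that skips the append lets a list fall behind the episode counter, and one possible way to keep the two aligned.

```python
import math


def is_uniform(belief, tol=1e-9):
    """True when all entries of the belief vector are (nearly) equal."""
    return all(math.isclose(b, belief[0], abs_tol=tol) for b in belief)


def record_buggy(belief, preds):
    """Hypothetical reproduction: the uniform branch returns without appending."""
    if is_uniform(belief):
        return  # nothing appended, so preds falls behind the episode counter
    preds.append(max(range(len(belief)), key=belief.__getitem__))


def record_aligned(belief, preds, placeholder=-1):
    """One possible fix: append exactly one entry per episode in every branch."""
    if is_uniform(belief):
        preds.append(placeholder)  # assumed sentinel meaning "no prediction"
    else:
        preds.append(max(range(len(belief)), key=belief.__getitem__))


# Three episodes; the second one has a uniform belief.
beliefs = [[0.7, 0.2, 0.1], [1 / 3, 1 / 3, 1 / 3], [0.1, 0.1, 0.8]]

buggy, aligned = [], []
for belief in beliefs:
    record_buggy(belief, buggy)
    record_aligned(belief, aligned)

try:
    buggy_by_episode = [buggy[i] for i in range(len(beliefs))]
except IndexError:
    buggy_by_episode = None  # indexing by episode number runs past the short list

aligned_by_episode = [aligned[i] for i in range(len(beliefs))]
```

In this sketch the buggy list ends up one element short after the uniform episode, so indexing it by episode number fails, while the aligned list can be indexed for every episode. Whether a placeholder or some other handling is appropriate for the real step_i_policy_preds lists depends on how the predictions are consumed later.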